Commit

Merge pull request #3 from justinlittman/lock
Clarification on locking.
justinlittman authored Mar 20, 2019
2 parents 5fe2d0e + 66d9baa commit c07bba0
Showing 2 changed files with 9 additions and 2 deletions.
7 changes: 6 additions & 1 deletion docs/administration.md
@@ -4,7 +4,12 @@
 To prevent multiple harvests being performed concurrently for a collection, a lock file (`lock.json`) is written to a
 collection's base directory during a harvest. Harvesters check to see if the lock file is present before beginning.

-If a harvest fails, it is possible that the lock file is not removed. To force it to be removed, you can delete `lock.json`
+If a harvest raises a `LockedException` this indicates that a harvest is currently in process or a previous harvest
+exited uncleanly.
+
+If a collection is locked because multiple harvests are attempting to run concurrently then adjust the schedule.
+
+If a collection is locked because a previous harvest exited uncleanly, then force it be unlocked. To unlock, delete `lock.json`
 or execute `tweet_harvester`'s `aws unlock` command. For example:

 $ python3 tweet_harvester.py aws unlock twarc_cloud test_collection
4 changes: 3 additions & 1 deletion twarccloud/harvester/collection_lock.py
@@ -49,7 +49,9 @@ def __exit__(self, *args):


 class LockedException(Exception):
-    pass
+    def __init__(self):
+        Exception.__init__(self, 'Collection is locked. This is because a harvest is currently running or a harvest ' \
+                                 'terminated uncleanly.')


 # Returns True if lock file exists at the provided filepath.
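
For readers who want to see the locking behavior described in `docs/administration.md` in one place, here is a minimal sketch. It is not the code in `collection_lock.py`: the names `CollectionLock` and `is_locked` and the `locked_at` field are hypothetical, and the sketch assumes the lock file lives on the local filesystem, which may not match how the harvester stores it. Only `LockedException`, its message, and the `lock.json` filename come from the commit.

```python
import json
import os
from datetime import datetime, timezone


class LockedException(Exception):
    def __init__(self):
        # Message text taken from the commit above.
        super().__init__('Collection is locked. This is because a harvest is currently running or a harvest '
                         'terminated uncleanly.')


def is_locked(lock_filepath):
    # Hypothetical helper: returns True if a lock file exists at the provided filepath.
    return os.path.exists(lock_filepath)


class CollectionLock:
    # Hypothetical context manager: writes lock.json on entry, removes it on exit.
    def __init__(self, collection_path, filename='lock.json'):
        self.lock_filepath = os.path.join(collection_path, filename)

    def __enter__(self):
        try:
            # Mode 'x' fails if the file already exists, so the existence check
            # and the creation happen in a single step.
            with open(self.lock_filepath, 'x') as lock_file:
                json.dump({'locked_at': datetime.now(timezone.utc).isoformat()}, lock_file)
        except FileExistsError:
            raise LockedException() from None
        return self

    def __exit__(self, *args):
        # Runs on normal exit and on exceptions raised inside the with block.
        # If the process is killed, this never runs and the lock file is left
        # behind, which is the "terminated uncleanly" case.
        os.remove(self.lock_filepath)
        return False
```

Under these assumptions, wrapping a harvest in `with CollectionLock(collection_path):` raises `LockedException` when `lock.json` is already present, and deleting `lock.json` by hand clears a stale lock, as the documentation change describes.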