[WJ-1176] Job queue #1671
Merged
Conversation
Codecov Report
@@ Coverage Diff @@
## develop #1671 +/- ##
===========================================
- Coverage 40.45% 40.25% -0.20%
===========================================
Files 341 342 +1
Lines 10744 10798 +54
===========================================
+ Hits 4346 4347 +1
- Misses 6398 6451 +53
This pull request uses carry forward flags.
emmiegit force-pushed the WJ-1176-job-queue branch from a295b9f to b46d52a on October 30, 2023 at 15:18
This is necessary because a worker may reboot, or multiple workers may start and then all but the first fail to create the queue.
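The pattern this commit describes, tolerating "queue already exists" on startup, can be sketched as below. The names are hypothetical; the real code would go through rsmq_async's queue-creation call and its error type.

```rust
// Sketch of idempotent queue creation: "already exists" is treated as
// success, because a rebooted worker (or a second worker losing the
// creation race) should just proceed. QueueError is an illustrative
// stand-in for the real rsmq error type.
#[derive(Debug, PartialEq)]
enum QueueError {
    QueueExists,
    Connection(String),
}

/// Try to create the job queue, treating "already exists" as success.
fn ensure_queue(create: impl FnOnce() -> Result<(), QueueError>) -> Result<(), QueueError> {
    match create() {
        Ok(()) => Ok(()),
        Err(QueueError::QueueExists) => Ok(()), // another worker (or prior boot) made it
        Err(other) => Err(other),               // real failures still propagate
    }
}

fn main() {
    assert_eq!(ensure_queue(|| Ok(())), Ok(()));
    assert_eq!(ensure_queue(|| Err(QueueError::QueueExists)), Ok(()));
    assert!(ensure_queue(|| Err(QueueError::Connection("redis down".into()))).is_err());
}
```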
Now using the actual queue!
In some circumstances we really do not have the site ID and should not be requiring it to be filtered there (that's what the regular get() methods are for), so instead we should pass in only the ID to be fetched, with a check for deleted entities in case we only want extant ones.
This has an option for getting deleted pages, and separates out the notion of fetching live pages by ID for internal processes.
With the change to PageService::get_direct(), we no longer need the site_id to do outdating, and it wasn't correct anyway, since in some cases it assumed any page connections are on the same site, which is wrong.
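The fetch-by-ID-only idea from these commits can be sketched as follows. The struct, fields, and signature are illustrative, not the real schema or the actual PageService::get_direct() signature.

```rust
// Hypothetical sketch of fetching a page by ID alone (no site_id),
// with an explicit choice about whether deleted pages count.
#[derive(Debug, Clone, PartialEq)]
struct Page {
    page_id: i64,
    deleted: bool,
}

fn get_direct(pages: &[Page], page_id: i64, allow_deleted: bool) -> Option<&Page> {
    pages
        .iter()
        .find(|p| p.page_id == page_id)
        // If the caller only wants extant pages, a deleted match is a miss.
        .filter(|p| allow_deleted || !p.deleted)
}

fn main() {
    let pages = vec![
        Page { page_id: 1, deleted: false },
        Page { page_id: 2, deleted: true },
    ];
    assert!(get_direct(&pages, 1, false).is_some());
    assert!(get_direct(&pages, 2, false).is_none()); // deleted, filtered out
    assert!(get_direct(&pages, 2, true).is_some());  // explicitly allowed
}
```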
emmiegit force-pushed the WJ-1176-job-queue branch from b46d52a to 73c42e8 on October 30, 2023 at 15:28
This way we don't get into loops where a job fails, but only after it has added another job to the queue, leading to a build-up.
rsmq enforces a maximum, which we're apparently surpassing; we should catch this at Config parsing/creation time.
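A config-time check along these lines could catch the over-limit delay before anything reaches the queue. This is a sketch under assumptions taken from the PR description: the limit constant and the x1000 multiplication are how the PR characterizes rsmq's behavior, not verified constants from the library.

```rust
// Sketch: validate job delays at Config creation time instead of failing
// at enqueue time. RSMQ_MAX_DELAY is an assumed value; per the PR, the
// configured seconds end up multiplied by 1000 before hitting the limit.
const RSMQ_MAX_DELAY: u64 = 9_999_999;

fn validate_job_delay(delay_seconds: u64) -> Result<u64, String> {
    let stored = delay_seconds
        .checked_mul(1000)
        .ok_or_else(|| "delay overflows u64".to_string())?;
    if stored > RSMQ_MAX_DELAY {
        Err(format!(
            "delay of {delay_seconds} s becomes {stored} after the x1000 scaling, over the {RSMQ_MAX_DELAY} limit"
        ))
    } else {
        Ok(delay_seconds)
    }
}

fn main() {
    assert_eq!(validate_job_delay(3600), Ok(3600));
    // "One day" trips the limit: 86_400 * 1000 = 86_400_000 > 9_999_999.
    assert!(validate_job_delay(86_400).is_err());
}
```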
Yossipossi1 approved these changes on Oct 31, 2023
Zokhoi approved these changes on Oct 31, 2023
thanks @Zokhoi @Yossipossi1
Groundwork for a persistent job queue was set up in #1668.
In this PR, I use the added rsmq_async crate to persist jobs to the queue, and receive jobs from the queue in a job worker (configurable quantity), before deleting them post-success, to enable failure retries. The previous job queue implementation, an in-memory channel, was removed. This simplifies ServerState setup a bit.

Additionally, to implement recurring jobs such as the periodic pruning work, this system permits jobs to submit a "follow-up job", which is added to the queue as part of the job's work. Combined with a delay, this allows things such as "run this operation every six hours".
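The follow-up-job idea above can be sketched roughly as below. The types and names are illustrative, not the real deepwell types, and the queue is mocked with a VecDeque rather than rsmq.

```rust
use std::collections::VecDeque;

// Sketch of a "follow-up job": a job can, as part of its own work, hand
// back another (possibly delayed) job to enqueue. A recurring task is
// then just a job whose follow-up is itself with a delay.
#[derive(Debug, Clone, PartialEq)]
struct QueuedJob {
    name: String,
    delay_seconds: u64,
}

/// What a job hands back after running: optionally a follow-up to enqueue.
struct JobOutcome {
    follow_up: Option<QueuedJob>,
}

fn run_job(queue: &mut VecDeque<QueuedJob>, outcome: JobOutcome) {
    if let Some(next) = outcome.follow_up {
        // e.g. a pruning job re-queues itself with a six-hour delay,
        // giving "run this operation every six hours".
        queue.push_back(next);
    }
}

fn main() {
    let mut queue = VecDeque::new();
    run_job(&mut queue, JobOutcome {
        follow_up: Some(QueuedJob { name: "prune".into(), delay_seconds: 6 * 3600 }),
    });
    assert_eq!(queue.len(), 1);
    assert_eq!(queue[0].delay_seconds, 21_600);
}
```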
I've added configuration fields for all these various durations and values.
I also change the concept of "directly fetching pages" to not require a site ID, with an option to be explicit about whether deleted pages are fetched. This also updates the OutdateService, which had a bug in it regarding use of an incorrect site_id on pages.

NOTE: There is a known issue where the value for "one day" (86400 seconds) gets multiplied by 1000 and overflows rsmq's limit. I will investigate further and file a bug in the upstream, but for now this PR is ready.
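The delete-post-success retry semantics described above can be sketched like this. The queue is mocked with a VecDeque standing in for rsmq/Redis, and the names are illustrative; the point is only the ordering: the message is removed from the queue after the job succeeds, never before.

```rust
use std::collections::VecDeque;

// Sketch of the worker loop's retry semantics: a job is deleted only
// after it succeeds, so a failed run leaves the message in place to be
// retried later.
#[derive(Debug, Clone, PartialEq)]
struct Job {
    id: u64,
}

fn work_one(queue: &mut VecDeque<Job>, process: impl Fn(&Job) -> Result<(), String>) {
    // Peek rather than pop: the message stays queued until deletion.
    if let Some(job) = queue.front().cloned() {
        match process(&job) {
            Ok(()) => {
                queue.pop_front(); // delete post-success
            }
            Err(_) => {
                // Leave the job queued; it becomes visible again and is retried.
            }
        }
    }
}

fn main() {
    let mut queue = VecDeque::from([Job { id: 1 }]);
    work_one(&mut queue, |_| Err("transient failure".into()));
    assert_eq!(queue.len(), 1); // still queued for retry
    work_one(&mut queue, |_| Ok(()));
    assert!(queue.is_empty()); // deleted after success
}
```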