Use Slurm array jobs to limit concurrent extraction jobs #335
When reprocessing a whole bunch of runs at once, submit them as a Slurm array job, and ask Slurm to limit how many will run at once. This should mitigate the 'database locked' errors, without requiring people to manually batch their jobs and monitor their progress.
The limit is set at 30 concurrent jobs for now. Obviously we can make that configurable.
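Roughly, the submission side could look like the sketch below. The helper name and the job script path are hypothetical; the `--array=...%N` suffix is standard Slurm syntax for capping how many tasks of the array run simultaneously.

```python
import subprocess

def submit_extraction_array(run_numbers, job_script="extract_run.sh",
                            max_concurrent=30):
    """Submit one Slurm array job covering all runs (hypothetical helper).

    The '%' suffix on --array asks Slurm to keep at most `max_concurrent`
    array tasks running at once; the rest wait in the queue instead of all
    hitting the database at the same time.
    """
    # e.g. runs [53, 54, 60] -> "--array=53,54,60%30"
    array_spec = ",".join(str(r) for r in sorted(run_numbers))
    subprocess.run(
        ["sbatch", f"--array={array_spec}%{max_concurrent}", job_script],
        check=True,
    )
```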
I've taken the shortcut of using the array task IDs for the run numbers. This avoids having to define a way to communicate 'array task 1 -> run 53', but it does mean that only the run numbers can vary within an array. So the one job we ask to update the variables table in the DB is submitted separately, and if you do
`reprocess all` on a database spanning multiple proposals, each proposal would be submitted separately, raising the effective limit to N * 30 jobs. That's not ideal, but so far it's largely theoretical.

Closes #96.
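On the worker side, each array task can recover its run number straight from the environment, since the array indices are the run numbers. A minimal sketch; `extract_run` and the proposal argument are placeholders for whatever the real entry point takes:

```python
import os
import sys

def extract_run(proposal: int, run: int) -> None:
    """Placeholder for the real extraction entry point."""
    print(f"Extracting proposal {proposal}, run {run}")

def main():
    # Slurm sets SLURM_ARRAY_TASK_ID for each task in the array; because the
    # array indices were chosen to be the run numbers themselves, no extra
    # task-ID -> run-number mapping is needed.
    run = int(os.environ["SLURM_ARRAY_TASK_ID"])
    proposal = int(sys.argv[1])  # assumption: proposal passed on the command line
    extract_run(proposal, run)

if __name__ == "__main__":
    main()
```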