Investigate if jobs can enter monitoring while in submitting stage #2121
Comments
@abhijeetsharma200 See further information here. At the moment, the behaviour around submission and monitoring is the following:
I think we want a few changes in behaviour.
I think the first step will be to write a set of tests in which subjobs can be made to fail on command and to submit very slowly, as a way of checking whether monitoring starts while submission is still in progress. The TestSubmitter is a dummy backend that can be used for this.
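As a rough illustration of what such a test could check, here is a minimal standalone sketch. It does not use Ganga or the real TestSubmitter API: `SlowFlakySubmitter`, `run_submission`, and the status strings are hypothetical names chosen for this example. It simulates slow submission with selected failures and records which subjobs a monitoring thread could see as submitted while the master job was still submitting.

```python
import threading
import time


class SlowFlakySubmitter:
    """Hypothetical stand-in for a test backend (not Ganga's TestSubmitter API)."""

    def __init__(self, delay=0.01, fail_ids=()):
        self.delay = delay              # seconds each submission takes
        self.fail_ids = set(fail_ids)   # subjob ids that fail on command

    def submit(self, subjob_id):
        time.sleep(self.delay)          # simulate a slow remote submission
        return subjob_id not in self.fail_ids


def run_submission(n_subjobs, submitter, poll_interval=0.005):
    """Submit subjobs one by one while a monitoring thread polls their statuses.

    Returns the final subjob statuses and the set of subjob ids that the
    monitor observed as 'submitted' while the master was still 'submitting'.
    """
    statuses = {i: "new" for i in range(n_subjobs)}
    master = {"status": "submitting"}
    lock = threading.Lock()
    seen_during_submission = set()

    def monitor():
        while True:
            with lock:
                submitting = master["status"] == "submitting"
                if submitting:
                    seen_during_submission.update(
                        i for i, s in statuses.items() if s == "submitted"
                    )
            if not submitting:
                return
            time.sleep(poll_interval)

    thread = threading.Thread(target=monitor)
    thread.start()
    for i in range(n_subjobs):
        ok = submitter.submit(i)
        with lock:
            statuses[i] = "submitted" if ok else "failed"
    with lock:
        master["status"] = "submitted"
    thread.join()
    return statuses, seen_during_submission
```

A real test would replace the simulated loop with an actual job using the dummy backend and assert that the set of subjobs seen during submission is non-empty.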
When a master job is submitting, it can take a very long time to submit the subjobs on certain remote backends (e.g. several hours for around 3000 subjobs). At the moment, the subjobs are not monitored during this period, so if some have already finished, there is effectively dead time in the system. Another benefit is that if job submission is interrupted because the Ganga process is killed, at least the already submitted subjobs will be recoverable. To make this work, the current policy of reverting the job to the new status when submission fails should probably be changed.
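One possible alternative to reverting to new is sketched below. This is only an assumption about what a changed policy might look like, not Ganga's current or planned behaviour. The function name `resolve_master_status` and the rule itself are hypothetical, and "failed" is taken here to mean a failed submission.

```python
def resolve_master_status(subjob_statuses):
    """Hypothetical roll-up policy after an interrupted or partly failed submission.

    subjob_statuses maps subjob id -> status string. Subjobs that are still
    'new' or whose submission 'failed' are returned for resubmission; the
    master only reverts to 'new' if nothing was submitted at all.
    """
    to_retry = sorted(
        i for i, s in subjob_statuses.items() if s in ("new", "failed")
    )
    anything_submitted = len(to_retry) < len(subjob_statuses)
    master_status = "submitted" if anything_submitted else "new"
    return master_status, to_retry


# Two subjobs were submitted before the interruption, so the master is kept.
print(resolve_master_status({0: "submitted", 1: "completed", 2: "failed", 3: "new"}))
```

Under this rule, the already submitted subjobs stay attached to the master job and can be monitored, while the remaining ones are candidates for resubmission.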