User jobs failing with 'Maximum of reschedulings reached' #7429
Hi,

I hope you can shed some light on this problem and help me find a solution.

Setup: DIRAC py3 server 7.3.17, py3 pilot jobs, and py3 diracos2 client.

Many user jobs are failing with the following MinorStatus:

The job logging shows the following Application Status:

In WorkloadManagement/Optimizers/log/current I see the following error message (which may be related):

Job #348732, which the message refers to, is one of the failed jobs.

Is it immediately clear to you what might cause this? I understand you may need more information.

Thanks,
Ernst
Replies: 1 comment 5 replies
Just noticed there's another discussion on the same topic. Sorry.
The error is pretty clear: it is trying to insert the "same" job in the DB more than once. I can't really know how it happened without looking deeply at your logs, which of course I can't!
BTW, you are using an old version of DIRAC; we really suggest you update.
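
For readers unfamiliar with this class of error, here is a minimal, generic sketch of what "inserting the same job more than once" looks like at the database level. It uses Python's built-in `sqlite3` with an in-memory database purely for illustration; the `jobs` table, its columns, and the `insert_job` helper are hypothetical and are not DIRAC's actual schema or code.

```python
import sqlite3

# Hypothetical table for illustration only; this is NOT DIRAC's real schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (job_id INTEGER PRIMARY KEY, status TEXT)")


def insert_job(job_id, status):
    """Insert a job row; return False if a row with this job_id already exists."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO jobs (job_id, status) VALUES (?, ?)",
                (job_id, status),
            )
        return True
    except sqlite3.IntegrityError:
        # The primary-key constraint rejects a second row with the same job_id.
        return False


first = insert_job(348732, "Received")
second = insert_job(348732, "Received")  # same job again: rejected
row_count = conn.execute("SELECT COUNT(*) FROM jobs").fetchone()[0]
```

The second call fails because the key is already present, so only one row ends up in the table. This shows the symptom only; why the same job was submitted for insertion twice still has to be found in the server logs.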