Kubernetes doesn't set a runtime limit on cronjobs by default, which means that "stuck" cronjobs can run forever.
From an administrator's perspective, there is no point in pods sitting around doing nothing.
From a user's perspective, the stuck pods block any further executions of the cronjob. So it will appear that the cronjob has just stopped running:
Events:
  Type    Reason            Age                 From                Message
  ----    ------            ----                ----                -------
  Normal  JobAlreadyActive  45m (x28 over 28h)  cronjob-controller  Not starting job because prior execution is running and concurrency policy is Forbid
IMO Lagoon should set some reasonable time limit on cronjobs so that stuck pods don't sit around forever. This can be done by adding activeDeadlineSeconds to the Job template (docs).
The right limit is up for debate, but I'd say something like 2h-4h would be reasonable. Whatever default is chosen would also need to be documented in the Lagoon docs.
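As a rough illustration of the proposal, here is a minimal sketch of a CronJob with activeDeadlineSeconds set on the Job template. This is not Lagoon's actual template: the name, schedule, image, and command are placeholders, and the 4h value (14400 seconds) is just the upper end of the range suggested above.

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: example-cronjob            # hypothetical name
spec:
  schedule: "*/30 * * * *"         # hypothetical schedule
  concurrencyPolicy: Forbid        # matches the policy shown in the events above
  jobTemplate:
    spec:
      # Once the Job has been active this long, Kubernetes terminates its
      # pods and marks the Job as failed (reason: DeadlineExceeded).
      activeDeadlineSeconds: 14400 # 4h, assumed default for illustration
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: task           # hypothetical container
              image: busybox:1.36
              command: ["sh", "-c", "echo running"]
```

With a deadline in place, a stuck execution gets cleaned up once the limit passes, so later scheduled runs are no longer blocked by the Forbid concurrency policy.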
Some users have long-running cronjobs on purpose (dumb, yes). Whatever solution is implemented needs an adjustable time limit with a sane default, plus guardrails in case a 2-4h deadline impacts users negatively.
Right now, platform engineers manually kill cronjobs that run longer than 24h, so a hard-coded limit would just encode the existing platform behaviour.