Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Remove rollback from
__check_jobs_at_startup
I think the premise that we want to do a rollback on exceptions in this method is wrong (it **may** be correct apprach in other places in the codebase e.g. in `Tool.handle_single_execution()`). Here it prevents us from comitting anything inside the with statement (as the job_wrapper.fail method does). Here's the simplified issue: ```shell ❯ python -i scripts/db_shell.py -c config/galaxy.yml >>> with sa_session() as session, session.begin(): ... sa_session.execute(update(Job).where(Job.id == 1).values(state="error")) ... sa_session.commit() ... sa_session.execute(update(Job).where(Job.id == 1).values(state="ok")) ... sa_session.commit() ... <sqlalchemy.engine.cursor.LegacyCursorResult object at 0x11f1be350> Traceback (most recent call last): File "<stdin>", line 4, in <module> File "<string>", line 2, in execute File "/Users/mvandenb/src/galaxy/.venv/lib/python3.11/site-packages/sqlalchemy/orm/session.py", line 1711, in execute conn = self._connection_for_bind(bind, close_with_result=True) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/Users/mvandenb/src/galaxy/.venv/lib/python3.11/site-packages/sqlalchemy/orm/session.py", line 1552, in _connection_for_bind TransactionalContext._trans_ctx_check(self) File "/Users/mvandenb/src/galaxy/.venv/lib/python3.11/site-packages/sqlalchemy/engine/util.py", line 199, in _trans_ctx_check raise exc.InvalidRequestError( sqlalchemy.exc.InvalidRequestError: Can't operate on closed transaction inside context manager. Please complete the context manager before emitting further commands. ``` It is probably still worthwhile to have the job recovery be minimal and do things such as calling the job wrapper fail method that does actual work to the job handler as in galaxyproject#17083, but that's refactoring that can be done on the dev branch and it still seems risky in the sense that we then need to be very careful in ensuring we don't commit anywhere else inside the scope of the begin() statement. Finally I don't think it makes sense that the startup check should ever cause the boot process to fail. This isn't a misconfiguration or even anything catastrophic for the remaining jobs and places unnecessary stress on admins and can basically break at any time and shouldn't cause a complete service failure. Fixes galaxyproject#17079
- Loading branch information