Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[23.2] Anticipate PendingRollbackError in check_database_connection #17598

Merged

Conversation

mvdbeek
Copy link
Member

@mvdbeek mvdbeek commented Mar 5, 2024

Should be a straightforward way to fix:

PendingRollbackError: This Session's transaction has been rolled back due to a previous exception during flush. To begin a new transaction with this Session, first issue Session.rollback(). Original exception was: (psycopg2.OperationalError) server closed the connection unexpectedly
	This probably means the server terminated abnormally
	before or while processing the request.

[SQL: UPDATE dataset SET update_time=%(update_time)s, object_store_id=%(object_store_id)s WHERE dataset.id = %(dataset_id)s]
[parameters: ({'update_time': datetime.datetime(2024, 3, 4, 12, 27, 29, 277355), 'object_store_id': 'corral4', 'dataset_id': 103938181}, {'update_time': datetime.datetime(2024, 3, 4, 12, 27, 29, 277359), 'object_store_id': 'corral4', 'dataset_id': 103938182}, {'update_time': datetime.datetime(2024, 3, 4, 12, 27, 29, 277361), 'object_store_id': 'corral4', 'dataset_id': 103938183})]
(Background on this error at: https://sqlalche.me/e/14/e3q8) (Background on this error at: https://sqlalche.me/e/14/7s2a)
  File "galaxy/jobs/handler.py", line 376, in __monitor
    self.__monitor_step()
  File "galaxy/jobs/handler.py", line 394, in __monitor_step
    self.__handle_waiting_jobs()
  File "galaxy/jobs/handler.py", line 406, in __handle_waiting_jobs
    check_database_connection(self.sa_session)
  File "galaxy/model/base.py", line 69, in check_database_connection
    if session and session.connection().invalidated:
  File "<string>", line 2, in connection
  File "sqlalchemy/orm/session.py", line 1545, in connection
    return self._connection_for_bind(
  File "sqlalchemy/orm/session.py", line 1555, in _connection_for_bind
    return self._transaction._connection_for_bind(
  File "sqlalchemy/orm/session.py", line 724, in _connection_for_bind
    self._assert_active()
  File "sqlalchemy/orm/session.py", line 604, in _assert_active
    raise sa_exc.PendingRollbackError(

from https://sentry.galaxyproject.org/share/issue/5c54d2c8fb474fe7b38614dac773550e/

How to test the changes?

(Select all options that apply)

  • I've included appropriate automated tests.
  • This is a refactoring of components with existing test coverage.
  • Instructions for manual testing are as follows:
    1. [add testing steps and prerequisites here if you didn't write automated tests covering all your changes]

License

  • I agree to license these and all my past contributions to the core galaxy codebase under the MIT license.

@mvdbeek mvdbeek changed the base branch from dev to release_23.2 March 5, 2024 10:15
@github-actions github-actions bot added the area/database Galaxy's database or data access layer label Mar 5, 2024
@github-actions github-actions bot added this to the 24.1 milestone Mar 5, 2024
@mvdbeek mvdbeek changed the title Anticipate PendingRollbackError in check_database_connection [23.2] Anticipate PendingRollbackError in check_database_connection Mar 5, 2024
@github-actions github-actions bot modified the milestones: 24.1, 23.2 Mar 5, 2024
@jdavcs jdavcs self-requested a review March 5, 2024 13:45
@jdavcs
Copy link
Member

jdavcs commented Mar 5, 2024

How about this:

def check_connection(session):
    assert session
    if not session.get_transaction().is_active or session.connection().invalidated:
        session.rollback()
        log.error("Database transaction rolled back due to invalid state.")

Here's the gist of it. There is a very specific state in which this particular error will be raised: the session's transaction's internal state should be DEACTIVE AND there should have been at least one error raised during a previous rollback (on this OR a nested transaction). Then this error will be raised as a result of one of several method calls inside the transaction class (that's all internal SA code), and one of them is Session.connection() - that's what triggers the error on line 69. However, instead of checking for the above conditions, we can simply check SessionTransaction.is_active. The code above handles both bases. The call to rollback will not re-raise the pre-existing rollback exception.

@mvdbeek
Copy link
Member Author

mvdbeek commented Mar 5, 2024

This does seem a bit more reliant on SQLAlchemy internal state but that's fine either way. Can you push that to my PR?

@jdavcs jdavcs force-pushed the fix_pending_rollback_error branch from 6a35256 to fb89a26 Compare March 5, 2024 21:01
@jdavcs
Copy link
Member

jdavcs commented Mar 5, 2024

whoops. I see it. Fixing.

@mvdbeek mvdbeek merged commit 6728e48 into galaxyproject:release_23.2 Mar 6, 2024
42 of 45 checks passed
@nsoranzo nsoranzo deleted the fix_pending_rollback_error branch April 5, 2024 15:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/database Galaxy's database or data access layer kind/bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants