[23.2] Fix user login when duplicate UserRoleAssociation exists #17854

Merged (1 commit) on Mar 28, 2024

@mvdbeek mvdbeek commented Mar 27, 2024

In 3fd5b02 we changed the SQLAlchemy query construct's `one_or_none` method to the core statement's `.scalar_one_or_none`, which bypasses the identity map. A handful of legacy accounts on usegalaxy.{org,eu} have duplicate UserRoleAssociation rows. With the query construct, the deduplication that occurred via the identity map meant that no exception was raised, whereas the core-level construct sees both rows and raises MultipleResultsFound.

Fortunately, adding `.distinct()` to the statement is enough to restore the old behavior.

Here's an IPython session that demonstrates this:

In [4]: ura = sa_session.query(UserRoleAssociation).filter_by(user_id=1, role_id=1).all()

In [5]: ura
Out[5]: [<galaxy.model.UserRoleAssociation(1) at 0x165809190>]

In [6]: new_ura = UserRoleAssociation(sa_session.get(User, 1), sa_session.get(Role, 1))

In [7]: sa_session.add(new_ura)

In [8]: sa_session.flush()

In [9]:         role = (
   ...:             sa_session.query(Role)
   ...:             .filter(
   ...:                 and_(
   ...:                     UserRoleAssociation.table.c.user_id == 1,
   ...:                     Role.id == UserRoleAssociation.table.c.role_id,
   ...:                     Role.type == Role.types.PRIVATE,
   ...:                 )
   ...:             )
   ...:             .one_or_none()
   ...:         )

In [10]:         stmt = select(Role).where(
    ...:             and_(
    ...:                 UserRoleAssociation.user_id == 1,
    ...:                 Role.id == UserRoleAssociation.role_id,
    ...:                 Role.type == Role.types.PRIVATE,
    ...:             )
    ...:         )
    ...:         role = sa_session.execute(stmt).scalar_one_or_none()
---------------------------------------------------------------------------
MultipleResultsFound                      Traceback (most recent call last)
Cell In[10], line 8
      1 stmt = select(Role).where(
      2     and_(
      3         UserRoleAssociation.user_id == 1,
   (...)
      6     )
      7 )
----> 8 role = sa_session.execute(stmt).scalar_one_or_none()

File ~/src/galaxy/.venv/lib/python3.11/site-packages/sqlalchemy/engine/result.py:1225, in Result.scalar_one_or_none(self)
   1212 def scalar_one_or_none(self):
   1213     """Return exactly one scalar result or ``None``.
   1214
   1215     This is equivalent to calling :meth:`_engine.Result.scalars` and
   (...)
   1223
   1224     """
-> 1225     return self._only_one_row(
   1226         raise_for_second_row=True, raise_for_none=False, scalar=True
   1227     )

File ~/src/galaxy/.venv/lib/python3.11/site-packages/sqlalchemy/engine/result.py:614, in ResultInternal._only_one_row(self, raise_for_second_row, raise_for_none, scalar)
    612     if next_row is not _NO_ROW:
    613         self._soft_close(hard=True)
--> 614         raise exc.MultipleResultsFound(
    615             "Multiple rows were found when exactly one was required"
    616             if raise_for_none
    617             else "Multiple rows were found when one or none "
    618             "was required"
    619         )
    620 else:
    621     next_row = _NO_ROW

MultipleResultsFound: Multiple rows were found when one or none was required

In [11]:         stmt = select(Role).where(
    ...:             and_(
    ...:                 UserRoleAssociation.user_id == 1,
    ...:                 Role.id == UserRoleAssociation.role_id,
    ...:                 Role.type == Role.types.PRIVATE,
    ...:             )
    ...:         ).distinct()
    ...:         role = sa_session.execute(stmt).scalar_one_or_none()
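
The same effect is visible at the SQL level, independent of the ORM. Below is a minimal sketch using Python's built-in sqlite3 module. The schema is a simplified, hypothetical stand-in (table and column names are illustrative, not Galaxy's actual schema). With a duplicated association row, the implicit join from the session above returns the same role twice, and DISTINCT collapses it back to a single row.

```python
import sqlite3

# Simplified, hypothetical schema; Galaxy's real tables have more columns.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE role (id INTEGER PRIMARY KEY, type TEXT);
    CREATE TABLE user_role_association (
        id INTEGER PRIMARY KEY, user_id INTEGER, role_id INTEGER
    );
    INSERT INTO role (id, type) VALUES (1, 'private');
    -- Two association rows for the same (user, role) pair: the legacy duplicate.
    INSERT INTO user_role_association (user_id, role_id) VALUES (1, 1), (1, 1);
    """
)

# Implicit join, mirroring the WHERE clause used in the session above.
JOIN_SQL = """
    SELECT {distinct} role.id, role.type
    FROM role, user_role_association AS ura
    WHERE ura.user_id = 1 AND role.id = ura.role_id AND role.type = 'private'
"""

plain_rows = conn.execute(JOIN_SQL.format(distinct="")).fetchall()
distinct_rows = conn.execute(JOIN_SQL.format(distinct="DISTINCT")).fetchall()

print(len(plain_rows))     # one row per matching association row
print(len(distinct_rows))  # duplicates collapsed by DISTINCT
```

Any "one or none" check that counts raw rows will therefore fail on the first query and pass on the second, which is the behavior the session shows for `scalar_one_or_none`.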

I think that's what the SQLAlchemy docs for [Query.one_or_none](https://docs.sqlalchemy.org/en/14/orm/query.html#sqlalchemy.orm.Query.one_or_none) mean to communicate with:

Returns None if the query selects no rows. Raises sqlalchemy.orm.exc.MultipleResultsFound if multiple object identities are returned, or if multiple rows are returned for a query that returns only scalar values as opposed to full identity-mapped entities.

Fixes #17848

How to test the changes?

(Select all options that apply)

  • I've included appropriate automated tests.
  • This is a refactoring of components with existing test coverage.

License

  • I agree to license these and all my past contributions to the core galaxy codebase under the MIT license.

@mvdbeek mvdbeek added kind/bug area/database Galaxy's database or data access layer labels Mar 27, 2024
@github-actions github-actions bot added this to the 23.2 milestone Mar 27, 2024
@mvdbeek mvdbeek requested review from jdavcs and bgruening and removed request for jdavcs March 27, 2024 18:25
@bgruening
Member

Thanks, Marius!

@bgruening
Member

Deployed and tested on EU. With the creds from MT I can now log in. 🎉 So 1/3 of the known users can now log in again.
I cannot test the others, but I will ask them to do so.

Thanks a lot!

@bgruening
Member

All users reported back that they can log in again 🎉

Thanks again!

@sanjaysrikakulam
Contributor

Thank you! @mvdbeek

@mvdbeek mvdbeek merged commit ea3025d into galaxyproject:release_23.2 Mar 28, 2024
42 of 47 checks passed
@nsoranzo nsoranzo deleted the fix_role_query branch March 28, 2024 11:40
@jdavcs
Member

jdavcs commented Mar 29, 2024

Thank you for debugging+fixing this, @mvdbeek!

For more context, it's the difference between session.query(mapped-object) vs. session.execute(sql-statement): the latter (as per Marius) does not touch the identity map. Here's a detailed explanation from SA docs on why the deduplication happens for session.query:

As the Session makes use of an identity map, even though our SQL result set has two rows with primary key 5, there is only one User(id=5) object inside the Session which must be maintained uniquely on its identity, that is, its primary key / class combination. It doesn’t actually make much sense, if one is querying for User() objects, to get the same object multiple times in the list. (Ref.)
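
To make the difference concrete, here is a toy sketch in plain Python. It is not SQLAlchemy's implementation: `ToyIdentityMap`, `one_or_none_raw`, and `one_or_none_unique` are hypothetical names used only to illustrate why deduplicating on object identity hides a duplicated row, while a row-level check does not.

```python
class MultipleResultsFound(Exception):
    """Stand-in for SQLAlchemy's exception of the same name."""


class ToyIdentityMap:
    """Hypothetical identity map: one object per (class name, primary key)."""

    def __init__(self):
        self._objects = {}

    def load(self, cls_name, pk):
        key = (cls_name, pk)
        # Reuse the existing object for a known identity, else create one.
        return self._objects.setdefault(key, {"cls": cls_name, "id": pk})


def one_or_none_raw(rows):
    """Row-level check: any second row is an error."""
    if len(rows) > 1:
        raise MultipleResultsFound("Multiple rows were found when one or none was required")
    return rows[0] if rows else None


def one_or_none_unique(rows, identity_map):
    """Entity-level check: collapse rows that map to the same object first."""
    unique = []
    for pk in rows:
        obj = identity_map.load("Role", pk)
        if not any(obj is seen for seen in unique):
            unique.append(obj)
    if len(unique) > 1:
        raise MultipleResultsFound("Multiple rows were found when one or none was required")
    return unique[0] if unique else None


# The join returned Role primary key 1 twice because of the duplicate association row.
rows = [1, 1]
role = one_or_none_unique(rows, ToyIdentityMap())  # deduplicated: a single Role

try:
    one_or_none_raw(rows)
    raw_raised = False
except MultipleResultsFound:
    raw_raised = True
```

In this sketch the entity-level check succeeds because both rows resolve to the same object, while the row-level check raises, matching the two outcomes seen in the PR description.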
