Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

improve OpenSearch permission logic #940

Merged
merged 29 commits into from
Dec 18, 2024

Conversation

naknomum
Copy link
Member

@naknomum naknomum commented Dec 12, 2024

overhaul permission logic for OpenSearch indexing

PR fixes #785
PR fixes #779

Changes

Conceptual overview

Previously, an Encounter was indexed in a single pass, meaning all the desired searchable properties were added to the OpenSearch document in addition to a field used to determine permissions on the encounter. This permissions field (viewUsers) contained a list of all users who could see the encounter. This proved too intensive to compute on every indexing update for each encounter when number of users were large (> 1000). Computing this field for a single encounter was taking well over a minute on sharkbook (10k+ users).

This PR moves the computing of the viewUsers field to a separate background process which can pre-load all the complexities (all users, their collaborations) once and then iterate over all Encounters and set their permissions via an update (changing a single field) on the OpenSearch document. This entire process for 10k+ users and 100k+ encounters takes less than 30 seconds on sharkbook data (on QA); so it is safe to call it a huge speed improvement.

The trade off is that permissions on search will lag slightly from their reality. In particular, collaboration-based permissions lag until this permissions background indexing happens. Owner-based and publicness do not lag, as they are no longer needed to be part of this computation. (see below)

Detailed changes

  1. Previously, OpenSearch used the viewUsers fields as the only permissions filter when a search was done via the api. Put another way, this field contained all users who could read this encounter -- meaning a "public" encounter had a list of all user ids in this field. This has now changed: [see OpenSearch.querySanitize()] permissions during search queries is now matched based on three fields: (a) publiclyReadable - a new field which is set during normal indexing based on existing (complicated) rules of what is public; (b) submitterUserId - a new field which is simply the (uu)id of the submitterID user; (c) viewUsers - which now contains only users in collaboration with the submitting user, computed based on existing complex collaboration algorithms, which appears to respect "virtual" collaborations via organizations, etc.
  2. Search returns results based on the following: (a) querying user is admin, can search all; (b) publiclyReadable encounters will be returned to anyone; (c) encounters will match if querying user matches submitterUserId or an id in viewUsers (in collab with submitter)
  3. Permissions can be computed during normal indexing of an encounter by optionally setting enc.setOpensearchProcessPermissions(true) prior to indexing. Default is set false.

Permissions background runs

  • Two settings control when this runs: BACKGROUND_PERMISSIONS_MINUTES and BACKGROUND_PERMISSIONS_MAX_FORCE_MINUTES. The first is how often to check if permissions need to be run (see below) and the second is how often it will run, regardless of being "needed".
  • A system-wide boolean is available to mean "please run permissions indexing on next check". It can be set via OpenSearch.setPermissionsNeeded(). The idea being that when a permission-altering event (e.g. creation of a Collaboration) happens, we add this code to trigger indexing on the next run. This flag-setting has been done in a couple places, however, the reality of how fast the process seems to indicate we can likely just set the max-force value very low and have permission be forced to run often just to be safe. Thus, some observation and tweaking of these parameters will be ongoing.

@naknomum naknomum requested a review from holmbergius December 18, 2024 17:28
@naknomum naknomum marked this pull request as ready for review December 18, 2024 17:28
Copy link
Contributor

@holmbergius holmbergius left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @naknomum for the changes.

@holmbergius holmbergius merged commit e971846 into main Dec 18, 2024
1 check passed
@holmbergius holmbergius deleted the 785_improve_search_permission_logic branch December 18, 2024 22:45
holmbergius added a commit that referenced this pull request Dec 19, 2024
…logic

improve OpenSearch permission logic
# Conflicts:
#	src/main/java/org/ecocean/Encounter.java
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants