Features/beacon network #93

Merged
merged 43 commits into from
Oct 3, 2024

Conversation

gsfk
Member

@gsfk gsfk commented Jun 11, 2024

First pass at code for networking Bento beacons together.

  • On startup, contacts every beacon in a list of URLs passed in from a config file.
  • Routes requests and responses for any beacon in the network.
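
A minimal sketch of those two responsibilities, not the code in this PR. It assumes each beacon reports an `id` in its info response; `init_network`, `target_url` and the injected `fetch_info` callable are hypothetical names used only for illustration.

```python
from typing import Callable


def init_network(urls: list[str], fetch_info: Callable[[str], dict]) -> dict[str, dict]:
    """Contact every configured beacon once and index the reachable ones by id."""
    network: dict[str, dict] = {}
    for url in urls:
        try:
            # fetch_info is a hypothetical callable that performs the HTTP call
            info = fetch_info(url)
            beacon_id = info["id"]  # assumption: the info payload carries an "id"
        except (OSError, KeyError, ValueError):
            # an unreachable or malformed beacon is skipped rather than failing startup
            continue
        network[beacon_id] = {"api_url": url.rstrip("/"), "info": info}
    return network


def target_url(network: dict[str, dict], beacon_id: str, endpoint: str) -> str:
    """Build the URL a request for one network beacon would be forwarded to."""
    beacon = network[beacon_id]  # raises KeyError for a beacon not in the network
    return f"{beacon['api_url']}/{endpoint.lstrip('/')}"
```

Passing the fetch function in keeps the sketch free of any particular HTTP library and makes it easy to exercise without a live network.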

Aggregation is done in the client rather than here. This allows for real-time updates in the frontend, rather than waiting for the slowest beacon to respond. However, aggregation is easy and should come in a later version, since it lets us:

  • treat the network like a single beacon
  • add the network to an existing network
  • search the network from a non-Bento client that doesn't know how to aggregate the data
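
For illustration, back-end aggregation of count responses could look roughly like the sketch below. It assumes each beacon answers with a Beacon-style `responseSummary` containing `numTotalResults`; `aggregate_counts` is a hypothetical helper, and it ignores concerns such as count censorship that a real version would need to handle.

```python
from typing import Any


def aggregate_counts(responses: dict[str, dict[str, Any]]) -> dict[str, Any]:
    """Combine per-beacon count responses into a single network-level summary."""
    total = 0
    for beacon_id, response in responses.items():
        # a beacon that failed or returned no summary contributes zero
        summary = response.get("responseSummary", {})
        total += int(summary.get("numTotalResults", 0))
    return {"responseSummary": {"exists": total > 0, "numTotalResults": total}}
```

The result has the same shape as a single beacon's count response, which is what would let the network be treated like one beacon.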

Some features are still experimental:

  • Uses the bento_public search config to discover filtering terms (e.g. sex=MALE or BMI<18), since that's what our beacons currently use. Once changes are rolled out to all beacons, we can use Beacon spec filtering terms instead.
  • The network itself is hosted on a particular beacon instance, which will be expected (but not required) to be part of the network, so there is some extra handling to avoid non-terminating circular http requests. Whether we are better off hosting the network separately is left as an exercise for the reviewer.
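
One way to picture the circular-request handling is to separate the hosting beacon from the remote ones before any HTTP calls are made. This is only a sketch under the assumption that the host knows its own public URL; `split_local_and_remote` is a hypothetical helper, not a function from this PR.

```python
def _normalize(url: str) -> str:
    """Ignore a trailing slash when comparing beacon URLs."""
    return url.rstrip("/")


def split_local_and_remote(beacon_urls: list[str], own_url: str) -> tuple[list[str], list[str]]:
    """Separate the hosting beacon's own URL from the URLs of remote beacons."""
    own = _normalize(own_url)
    local = [u for u in beacon_urls if _normalize(u) == own]
    remote = [u for u in beacon_urls if _normalize(u) != own]
    return local, remote
```

Requests for anything in `local` could then be answered in-process, while only `remote` entries are contacted over HTTP, so the host never ends up calling its own network endpoint.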

Related PRs:

Todos for next version:

  • init network at first network request rather than at app startup
  • make network init calls async
  • back-end aggregation (see above)
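
As a rough idea of what async network init might look like, the sketch below contacts all beacons concurrently with `asyncio.gather`. The `fetch_info` coroutine and the `id` field in its result are assumptions made for the example.

```python
import asyncio
from typing import Awaitable, Callable


async def init_network_async(
    urls: list[str],
    fetch_info: Callable[[str], Awaitable[dict]],
) -> dict[str, dict]:
    """Contact all beacons concurrently; one slow or failing beacon does not block the rest."""
    # return_exceptions=True hands back raised exceptions as results instead of propagating them
    results = await asyncio.gather(*(fetch_info(u) for u in urls), return_exceptions=True)
    network: dict[str, dict] = {}
    for url, result in zip(urls, results):
        if isinstance(result, BaseException):
            continue  # skip beacons that could not be reached
        network[result["id"]] = {"api_url": url.rstrip("/"), "info": result}
    return network
```

With this shape, total startup time is bounded by the slowest beacon rather than by the sum of all response times.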

@gsfk gsfk marked this pull request as ready for review August 15, 2024 19:38
@gsfk gsfk mentioned this pull request Aug 15, 2024
4 tasks
Member

@davidlougheed davidlougheed left a comment

first pass

@@ -2,6 +2,9 @@


def auth_header_getter(r: Request) -> dict[str, str]:
Member

type hint seems wrong here if request can be false-y - should be Request | None?

Member Author

this one is fiddly... r is not actually a request, it's a werkzeug LocalProxy that acts like a request when a request happens to be present. When no request is present, it's just a LocalProxy rather than None.

Member

hmm...

Member Author

@gsfk gsfk Sep 4, 2024

I'm considering just removing this and handling the no-request edge case elsewhere. Seems inelegant to ask for headers when there's no request.

Member Author

removed

bento_beacon/network/bento_public_query.py (outdated, resolved)
bento_beacon/config_files/config.py (outdated, resolved)
bento_beacon/network/bento_public_query.py (outdated, resolved)
bento_beacon/network/bento_public_query.py (outdated, resolved)
from ..endpoints.variants import get_variants
from .bento_public_query import fields_intersection, fields_union

PUBLIC_SEARCH_FIELDS_PATH = "/api/metadata/api/public_search_fields"
Member

it'd be better to accept the katsu URL from a configuration variable and then build from there (so just katsu URL + /api/public_search_fields)

Member Author

ah, hmm. These are calls to other katsus in the network:

  1. PUBLIC_SEARCH_FIELDS_PATH probably belongs in config rather than hidden in this file. I probably left it here because this entire file will be gone in the next version.
  2. beacon network is configured by passing in beacon base urls. We find the katsu search fields by figuring out the base bento url (ie remove /api/beacon) and adding on PUBLIC_SEARCH_FIELDS_PATH. So, you have your choice of which is least inelegant:
      - url surgery as above
      - configure network by using bento base urls instead of beacon urls
      - configure network by passing both beacon and katsu urls

In the next iteration this will all be replaced by calls to network beacons only, so this sort of url modification will no longer be needed.
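
The "url surgery" option can be sketched as follows. This is not the function from the PR, only a guess at the idea: it assumes beacon URLs end in `/api/beacon`, and `search_fields_url_from_beacon_url` and `BEACON_PATH_SUFFIX` are hypothetical names.

```python
from urllib.parse import urlsplit, urlunsplit

PUBLIC_SEARCH_FIELDS_PATH = "/api/metadata/api/public_search_fields"
BEACON_PATH_SUFFIX = "/api/beacon"  # assumption: where Bento currently mounts beacon


def search_fields_url_from_beacon_url(beacon_url: str) -> str:
    """Derive a katsu search-fields URL from a beacon base URL."""
    parts = urlsplit(beacon_url)
    path = parts.path.rstrip("/")
    # strip the beacon suffix to recover the base Bento URL path
    if path.endswith(BEACON_PATH_SUFFIX):
        path = path[: -len(BEACON_PATH_SUFFIX)]
    return urlunsplit((parts.scheme, parts.netloc, path + PUBLIC_SEARCH_FIELDS_PATH, "", ""))
```

The sketch also makes the fragility visible: both path constants encode Bento's URL layout at the time of writing.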

Member

if it's temporary it's fine to leave in, but in general I think not relying on Bento's particular URL setup at time of code-writing (so configuring Katsu URL separately) is the best long-term approach if any contact is needed.

bento_beacon/network/utils.py (resolved)
return fields


def public_search_fields_url(beacon_url):
Member

this shouldn't rely on katsu being under the portal. subdomain forever in bento. instead, take the Katsu URL as an env var (similar to aggregation or WES) and build it from Katsu + whatever subpath is required.
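
A minimal sketch of that suggestion, assuming a hypothetical `KATSU_BASE_URL` environment variable (the real variable name and subpath would be whatever the deployment defines):

```python
import os
from typing import Mapping, Optional

PUBLIC_SEARCH_FIELDS_SUBPATH = "/api/public_search_fields"  # subpath mentioned in the review


def katsu_public_search_fields_url(env: Optional[Mapping[str, str]] = None) -> str:
    """Build the search-fields URL from a configured Katsu base URL."""
    env = os.environ if env is None else env
    # KATSU_BASE_URL is a hypothetical variable name used only for this example
    base = env["KATSU_BASE_URL"].rstrip("/")
    return f"{base}{PUBLIC_SEARCH_FIELDS_SUBPATH}"
```

Accepting the environment as an optional mapping keeps the function easy to test without touching the real process environment.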

Member Author

see above

Member

@davidlougheed davidlougheed left a comment

from a quick scan of the code. i should also try this out... is there an easy way I can set this up locally?

def get_katsu_config_search_fields():
# Use forwarded auth for getting available search fields, which may be limited based on access level
fields = katsu_get(current_app.config["KATSU_PUBLIC_CONFIG_ENDPOINT"], requires_auth="forwarded")
def get_katsu_config_search_fields(requires_auth):
Member

would be good to type hint this argument the same as the other one - you could make a type RequiresAuthOptions = Literal["none", "forwarded", "full"] and use that as the type hint both here and above.
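
The suggested alias could look something like this. `RequiresAuthOptions` comes from the comment above; `validate_requires_auth` is a hypothetical helper added only to show the alias being used at runtime as well as in type hints.

```python
from typing import Literal, get_args

RequiresAuthOptions = Literal["none", "forwarded", "full"]


def validate_requires_auth(value: str) -> RequiresAuthOptions:
    """Check a runtime string against the allowed auth options."""
    if value not in get_args(RequiresAuthOptions):
        raise ValueError(f"invalid requires_auth value: {value!r}")
    return value  # type: ignore[return-value]


# The same alias can annotate any function taking this argument, for example:
# def get_katsu_config_search_fields(requires_auth: RequiresAuthOptions): ...
```

Defining the alias once means a type checker flags a typo such as `"forwared"` at every call site that uses it.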

@@ -1,5 +1,6 @@
import json
import os
from ..constants import GRANULARITY_COUNT, GRANULARITY_RECORD
Member

it's nice to have local imports below module imports (urllib3 here)

@gsfk
Member Author

gsfk commented Oct 3, 2024

for local test, use Bento v17 branch and see setup instructions here.

You'll probably need to change bento_public version to pr-165

Edit: also requires this patch if not already merged

Member

@davidlougheed davidlougheed left a comment

tested with the default config for now - seems to work well.

@gsfk gsfk merged commit d278832 into master Oct 3, 2024
2 checks passed