ambiguous redirects from IMG to NMDC for data download #1418

Open
aclum opened this issue Oct 12, 2024 · 0 comments

If data in IMG originates from NMDC, the IMG download button redirects to our data portal, but only to the main page. Ideally the redirect would be filtered by the IMG taxon oid. This is a workflow analysis identifier that, for legacy reasons, is currently populated at the biosample level.

current behavior: on https://img.jgi.doe.gov/cgi-bin/m/main.cgi?section=TaxonDetail&page=taxonDetail&taxon_oid=3300070481 the 'Download Data' button redirects to https://data.microbiomedata.org/

desired behavior: a URL that we could provide to IMG and that resolves to a filter on biosample_set where img_identifiers=$IMG_TAXON_OID

Switching example IDs because we haven't backpopulated the records with these identifiers yet.

For taxon oid 3300046691, biosample nmdc:bsm-11-011z7z70
Example filter in mongo:
curl -X 'GET' \
  'https://api.microbiomedata.org/nmdcschema/biosample_set?filter=%7B%22img_identifiers%22%3A%22img.taxon%3A3300046691%22%7D&max_page_size=20' \
  -H 'accept: application/json'
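
The same query could be parameterized by taxon oid, which is roughly what a redirect target would need to do. The following is only a sketch: the function name `nmdc_biosample_url` is hypothetical, and it assumes the taxon oid is purely numeric so that the already percent-encoded filter from the example can be reused as a template.

```shell
# Hypothetical helper: print the biosample_set query URL for one IMG taxon oid.
# Assumes a numeric taxon oid, so no further URL encoding is needed.
nmdc_biosample_url() {
  taxon_oid="$1"
  # Reject empty or non-numeric input so the pre-encoded template stays valid.
  case "$taxon_oid" in
    ''|*[!0-9]*)
      echo "taxon oid must be numeric: $taxon_oid" >&2
      return 1
      ;;
  esac
  # %% prints a literal %, so the output matches the encoded filter
  # {"img_identifiers":"img.taxon:<oid>"} used in the curl example.
  printf '%s?filter=%%7B%%22img_identifiers%%22%%3A%%22img.taxon%%3A%s%%22%%7D&max_page_size=20\n' \
    'https://api.microbiomedata.org/nmdcschema/biosample_set' "$taxon_oid"
}
```

Calling `nmdc_biosample_url 3300046691` prints the URL used in the curl command, so it can be passed straight to `curl -H 'accept: application/json'`.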

Get the biosample ID from the response, then use it for a faceted search filter, ending up with https://data.microbiomedata.org/?q=ChwQABgEIhYibm1kYzpic20tMTEtMDExejd6NzAi

Currently, IMG identifiers can be in img_identifiers or alternative_identifiers on class Biosample. The Berkeley schema adds this slot to some WorkflowExecution subclasses, but there are currently no plans to move the data to the right location.

IMG also stores our DataGeneration IDs and workflow execution IDs, which are probably better to filter on.
