[Collection] IRI Generation #368

Closed
Freymaurer opened this issue Jun 6, 2024 · 9 comments · Fixed by #381
Labels
Type: Bug Something is not working, and it is confirmed by maintainers to be a bug.

Comments

@Freymaurer
Collaborator

Freymaurer commented Jun 6, 2024

I think we should finally create a unified logic for URI generation.

The current logic, shown here, might not be sufficient to handle all the different kinds of URLs.

Below I will try to summarize the requirements:

  • DPBO: http://purl.org/nfdi4plants/ontology/dpbo/DPBO_0002006 (TS4TIB-Service)
  • EFO: http://www.ebi.ac.uk/efo/EFO_0006571
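To illustrate the two patterns listed above, a per-namespace lookup could look like the following minimal Python sketch. The table `BASE_IRIS`, the function `term_iri`, the `PREFIX:ID` input form and the fallback to the OBO PURL pattern are all assumptions made for illustration; this is not the logic currently used in the project.

```python
# Hypothetical lookup table: base IRIs taken from the two examples above.
BASE_IRIS = {
    "DPBO": "http://purl.org/nfdi4plants/ontology/dpbo/",
    "EFO": "http://www.ebi.ac.uk/efo/",
}

# Assumed fallback: the OBO PURL pattern that appears elsewhere in this thread.
OBO_BASE = "http://purl.obolibrary.org/obo/"


def term_iri(accession: str) -> str:
    """Turn an accession such as 'DPBO:0002006' into an IRI."""
    prefix, sep, local_id = accession.partition(":")
    if not sep or not prefix or not local_id:
        raise ValueError(f"not a PREFIX:ID accession: {accession!r}")
    # Use the namespace-specific base if one is known, else the OBO default.
    base = BASE_IRIS.get(prefix, OBO_BASE)
    return f"{base}{prefix}_{local_id}"
```

Keeping the exceptions in one table means that a new namespace only needs a new entry rather than a new branch in the code.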

TS4TIB

An external ontology service in cooperation with DataPLANT:

quoting @Hannah-Doerpholz

So, I have created all purls for the terms that are currently in DPBO. The ontology repo now also has an automated workflow that creates new purls whenever new DPBO terms are added to the .obo file. The purl checker + creation runs every Saturday once per week, since it takes a while to run.

All ontologies are included except:

  • ARC
  • MIAPPE (our homebrew version)
  • CREDiT
  • NCBITaxon (both the full one as well as our homebrew one)

Everything else that we currently import through the ext_ontologies.include, as well as our DPBO, is in the TIB.

I will close all related issues to track them here. @Hannah-Doerpholz please verify if this issue roughly sums up the requirements 🙂

@github-actions github-actions bot added the Status: Needs Triage This item is up for investigation. label Jun 6, 2024
@Freymaurer Freymaurer added Type: Bug Something is not working, and it is confirmed by maintainers to be a bug. and removed Status: Needs Triage This item is up for investigation. labels Jun 6, 2024
@Freymaurer Freymaurer moved this to In discussion in ARCStack Jun 6, 2024
@Hannah-Doerpholz

The summary looks good to me! I'll see whether we can also add ARC, our MIAPPE and CREDiT to the TS4TIB. That would take a while, though. I'll update you on any changes.

@Freymaurer
Collaborator Author

@Hannah-Doerpholz I just remembered that MS term URLs are also broken:

Example: https://ontobee.org/ontology/MS?iri=http://purl.obolibrary.org/obo/MS_1000031

Do we have any replacement for this?

@HLWeil
Member

HLWeil commented Jun 12, 2024

I will implement a hardcoded logic to cover the cases where we know that the standard PURL is wrong and we have a functioning alternative.

@Hannah-Doerpholz

@HLWeil Thank you!
@Freymaurer I know they are broken, but that is not something we can resolve. I contacted the MS maintainers, OBO Foundry and Ontobee, since I didn't know where exactly the issue is. OBO Foundry says that the purls are fine and that the problem is likely with Ontobee; Ontobee says that the issue is probably with the ms.owl file; and the MS maintainers haven't responded at all.

A workaround for MS would be to not rely on the purls but link directly to OLS4, since the terms are displayed correctly there. That would mean the following:

old MS link: http://purl.obolibrary.org/obo/MS_1002809
new MS link: https://www.ebi.ac.uk/ols4/ontologies/ms/classes/http%253A%252F%252Fpurl.obolibrary.org%252Fobo%252FMS_1002809

@Hannah-Doerpholz

Another option might be to go through Bioregistry. Bioregistry is an identifier resolver. It always provides some RDF information about what the URI format for a term should look like. For example, for ENVO:

http://purl.obolibrary.org/obo/ENVO_$1
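A format string like the one above can be expanded by substituting the local identifier for the `$1` placeholder. The following is only a sketch; the function name `expand_uri_format` is hypothetical and the format string is passed in by hand rather than fetched from Bioregistry.

```python
def expand_uri_format(uri_format: str, local_id: str) -> str:
    """Replace the '$1' placeholder of a URI format string with a local ID."""
    if "$1" not in uri_format:
        raise ValueError(f"no '$1' placeholder in {uri_format!r}")
    return uri_format.replace("$1", local_id)


# Example with the ENVO format string quoted above.
envo_iri = expand_uri_format("http://purl.obolibrary.org/obo/ENVO_$1", "09200010")
```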

Our imported ontologies that are in Bioregistry:
ENVO, PSI-MS (Prefix MS), CHEBI, GO, OBI, PATO, PECO, PO (purls broken), RO (purls broken), TO, UO, PSI-MOD (prefix MOD), EFO, NCIT, OMP

Ontologies we host on GitHub ourselves that are in Bioregistry:
CRO (the credit ontology), NCBITaxon

Ontologies we host on GitHub that are NOT in Bioregistry (I could add them though):
DPBO
ARC_v3.0
MIAPPE

For PO and RO, the same workaround as for MS could be:

old PO link: http://purl.obolibrary.org/obo/PO_0007033
new PO link: https://www.ebi.ac.uk/ols4/ontologies/po/classes/http%253A%252F%252Fpurl.obolibrary.org%252Fobo%252FPO_0007033

old RO link: http://purl.obolibrary.org/obo/RO_0002533
new RO link: https://www.ebi.ac.uk/ols4/ontologies/ro/classes/http%253A%252F%252Fpurl.obolibrary.org%252Fobo%252FRO_0002533?lang=en
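The OLS4 links above embed the original PURL percent-encoded twice (`:` becomes `%253A`, `/` becomes `%252F`). A minimal Python sketch of that rewrite follows. The names `BROKEN_PURL_PREFIXES`, `double_encode` and `resolvable_link` are hypothetical, the set of prefixes is taken only from this thread, and the optional `?lang=en` suffix seen in the RO example is not reproduced.

```python
import re
from urllib.parse import quote

# Prefixes reported in this thread as having broken OBO PURLs (assumption:
# the list is hardcoded and maintained by hand).
BROKEN_PURL_PREFIXES = {"MS", "PO", "RO"}

# Captures the ontology prefix and the local ID of an OBO PURL.
_OBO_PURL = re.compile(r"^http://purl\.obolibrary\.org/obo/([A-Za-z]+)_(\w+)$")


def double_encode(iri: str) -> str:
    """Percent-encode an IRI twice, as in the OLS4 links above."""
    return quote(quote(iri, safe=""), safe="")


def resolvable_link(purl: str) -> str:
    """Return an OLS4 link for known-broken prefixes, else the PURL itself."""
    match = _OBO_PURL.match(purl)
    if match is None or match.group(1).upper() not in BROKEN_PURL_PREFIXES:
        return purl
    ontology = match.group(1).lower()
    return (
        f"https://www.ebi.ac.uk/ols4/ontologies/{ontology}"
        f"/classes/{double_encode(purl)}"
    )
```

PURLs for prefixes outside the set are returned unchanged, so the workaround stays limited to the ontologies known to be affected.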

@HLWeil
Member

HLWeil commented Jun 13, 2024

Thanks a lot for your thorough input, @Hannah-Doerpholz!

So maybe we could use bioregistry in general?

E.g. instead of
http://purl.obolibrary.org/obo/ENVO_09200010
use
https://bioregistry.io/envo:09200010

And instead of
http://www.ebi.ac.uk/efo/EFO_0005147
use
https://bioregistry.io/efo:0005147

Would be kind of a practical unification.
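The mapping in the two examples above can be sketched in Python as follows. The function name `bioregistry_link` is hypothetical, and the sketch assumes that the last path segment of the IRI always has the form `PREFIX_ID` and that the lowercased prefix is the one Bioregistry expects, which holds for the two examples but is not guaranteed in general.

```python
import re

# Matches the last path segment, e.g. 'ENVO_09200010' or 'EFO_0005147'.
_TERM = re.compile(r"/([A-Za-z]+)_(\w+)$")


def bioregistry_link(iri: str) -> str:
    """Rewrite a PREFIX_ID style term IRI to a bioregistry.io link."""
    match = _TERM.search(iri)
    if match is None:
        raise ValueError(f"cannot find PREFIX_ID in {iri!r}")
    prefix, local_id = match.groups()
    return f"https://bioregistry.io/{prefix.lower()}:{local_id}"
```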


Edit: This doesn't work for PO though, as Bioregistry links to the PURL even though it also lists the correct, direct link to OLS4...
image

@HLWeil
Member

HLWeil commented Jun 13, 2024

@Hannah-Doerpholz, I opened a PR with the changes. This took quite a bit longer, as other tests still used hard-coded "deprecated" URLs.

With this change, the namespaces discussed here should result in working URLs, but working towards unification would still be highly welcome. The MS-style URLs in particular are clunky and harder to parse.

If parsing demands change (e.g. if ontologies are added to bioregistry), feel free to reopen this issue and add the requirements.

@github-project-automation github-project-automation bot moved this from In discussion to Done in ARCStack Jun 13, 2024
@Hannah-Doerpholz

Sorry about another question, but I noticed that in Swate the links are not adjusted. For example, for "mass spectrometry" from MS, the link is still a purl.obolibrary link (bottom left):
Screenshot from 2024-06-17 15-35-20
The same also goes for DPBO. Is this currently the expected behaviour?

@Freymaurer
Collaborator Author

Yes! We are working on the update for Swate with ARCtrl 2.0.0 integration. The expected release date (if not required earlier) is 27.06.2024.

Projects
Status: Done

3 participants