Add taxslim-disjoint-over-in-taxon.owl to the pipeline #2928

Closed
wants to merge 11 commits into from

Conversation

anitacaron
Collaborator

@anitacaron anitacaron commented Jun 19, 2023

Fixes #2707
Fixes #2613

@anitacaron anitacaron requested a review from balhoff June 19, 2023 12:36
@anitacaron anitacaron self-assigned this Jun 19, 2023
@anitacaron
Collaborator Author

anitacaron commented Jun 19, 2023

EDIT: Unsats solved by removing the axiom: wing never_in_taxon Sauropsida

Found unsats in the pipeline.

Robot explain result

wing feather SubClassOf Nothing

tertial remex feather SubClassOf Nothing

secondary remex feather SubClassOf Nothing

primary remex feather SubClassOf Nothing

remex feather SubClassOf Nothing

Axiom Impact

Axioms used 5 times

Axioms used 4 times

Axioms used 1 times

Ontologies used:

balhoff
balhoff previously approved these changes Jun 19, 2023
Member

@balhoff balhoff left a comment

I think it looks good.

@balhoff
Member

balhoff commented Jun 19, 2023

So glad to see the enhanced taxon constraint checks are working! These problems seem to be due to:

That statement is wrong; Sauropsida includes birds.
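
To illustrate why one never_in_taxon axiom on a clade that contains birds makes every feather class unsatisfiable, here is a minimal Python sketch. The toy taxonomy, the helper names `ancestors` and `violates_never_in_taxon`, and the assumption that the feather classes are restricted to Aves are all hypothetical and only stand in for what the reasoner derives.

```python
# Toy taxonomy: child -> parent (hypothetical and heavily simplified).
PARENT = {
    "Aves": "Sauropsida",
    "Sauropsida": "Amniota",
    "Mammalia": "Amniota",
}

def ancestors(taxon):
    """Return the taxon itself followed by all of its ancestors."""
    result = [taxon]
    while taxon in PARENT:
        taxon = PARENT[taxon]
        result.append(taxon)
    return result

def violates_never_in_taxon(only_in, never_in):
    """A class restricted to `only_in` cannot have instances if `only_in`
    lies within a taxon where the class is declared never to occur."""
    return never_in in ancestors(only_in)

# Assumption for illustration: the feather classes are restricted to Aves
# and inherit the wing's never_in_taxon constraint through part_of.
print(violates_never_in_taxon("Aves", "Sauropsida"))  # True
print(violates_never_in_taxon("Aves", "Mammalia"))    # False
```

With the constraint placed on Sauropsida the check fires, which matches the five unsatisfiable feather classes reported above; a constraint on a taxon that does not contain birds would not.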

@anitacaron
Collaborator Author

New unsats appear when merging Uberon with the taxonomy equivalences. This includes the expansion of present_in_taxon annotations.

Robot explain results

FMA_85540 SubClassOf Nothing

FMA_19088 SubClassOf Nothing

FMA_19092 SubClassOf Nothing

MA_0001471 SubClassOf Nothing

HBA_4413 SubClassOf Nothing

FMA_20229 SubClassOf Nothing

FMA_16210 SubClassOf Nothing

FMA_54756 SubClassOf Nothing

FMA_59819 SubClassOf Nothing

EHDAA2_0001973 SubClassOf Nothing

FMA_54687 SubClassOf Nothing

TAO_0005255 SubClassOf Nothing

EHDAA2_0003239 SubClassOf Nothing

EMAPA_36617 SubClassOf Nothing

FMA_83040 SubClassOf Nothing

TAO_0001738 SubClassOf Nothing

EHDAA2_0000259 SubClassOf Nothing

EHDAA2_0001976 SubClassOf Nothing

EHDAA2_0001462 SubClassOf Nothing

FMA_19090 SubClassOf Nothing

Axiom Impact

Axioms used 20 times

Axioms used 17 times

Axioms used 15 times

Axioms used 12 times

Axioms used 10 times

Axioms used 9 times

Axioms used 8 times

Axioms used 7 times

Axioms used 6 times

Axioms used 5 times

Axioms used 4 times

Axioms used 3 times

Axioms used 2 times

Axioms used 1 times

@anitacaron
Collaborator Author

reports/taxon-constraint-check.txt content

UNSAT: ZFA:0005536 ZFA:0005536
UNSAT: FMA:54687 FMA:54687
UNSAT: ZFA:0005537 ZFA:0005537
UNSAT: TAO:0001738 TAO:0001738
UNSAT: c1d7e59879f80e2a1a6e98ac659b47d9-89a49698565f149a052a9b292cd497d0 'http://purl.obolibrary.org/obo/UBERON_0011532 in taxon http://purl.obolibrary.org/obo/NCBITaxon_9606'
UNSAT: EHDAA2:0000259 EHDAA2:0000259
UNSAT: FMA:20229 FMA:20229
UNSAT: FMA:19088 FMA:19088
UNSAT: e2da7a06466e71bdee4f659f2b7f3658-89a49698565f149a052a9b292cd497d0 'http://purl.obolibrary.org/obo/UBERON_0014850 in taxon http://purl.obolibrary.org/obo/NCBITaxon_9606'
UNSAT: aabc3239c18a36ded84a678b89a935b6-89a49698565f149a052a9b292cd497d0 'http://purl.obolibrary.org/obo/CL_0000525 in taxon http://purl.obolibrary.org/obo/NCBITaxon_9606'
UNSAT: a72db1d585113a0f45aa435db9af7978-89a49698565f149a052a9b292cd497d0 'http://purl.obolibrary.org/obo/CL_2000062 in taxon http://purl.obolibrary.org/obo/NCBITaxon_9606'
UNSAT: 16f4bce63046a0e7cf14833c53aef5fc-ec28385bb11de0acc278b537c5e1c1d5 'http://purl.obolibrary.org/obo/UBERON_0011094 in taxon http://purl.obolibrary.org/obo/NCBITaxon_7764'
UNSAT: FMA:83040 FMA:83040
UNSAT: TAO:0005255 TAO:0005255
UNSAT: FMA:59819 FMA:59819
UNSAT: FMA:85540 FMA:85540
UNSAT: c5220085f0640e5955e39f2f743df122-51e80262c3dfb609db0802da540b0449 'http://purl.obolibrary.org/obo/UBERON_0001074 in taxon http://purl.obolibrary.org/obo/NCBITaxon_7718'
UNSAT: FMA:19092 FMA:19092
UNSAT: FMA:19090 FMA:19090
UNSAT: FMA:54756 FMA:54756
UNSAT: HBA:4413 HBA:4413
UNSAT: MA:0001471 MA:0001471
UNSAT: EHDAA2:0001973 EHDAA2:0001973
UNSAT: EMAPA:36617 EMAPA:36617
UNSAT: 595ea229938610d4a226aacaf0654449-48afe725db6e2f9c8d251393e707d06e 'http://purl.obolibrary.org/obo/UBERON_0008258 in taxon http://purl.obolibrary.org/obo/NCBITaxon_6073'
UNSAT: ZFA:0005608 ZFA:0005608
UNSAT: EHDAA2:0003239 EHDAA2:0003239
UNSAT: bd0266432f93283879782192b3bf3dd4-89a49698565f149a052a9b292cd497d0 'http://purl.obolibrary.org/obo/UBERON_0001350 in taxon http://purl.obolibrary.org/obo/NCBITaxon_9606'
UNSAT: EHDAA2:0001976 EHDAA2:0001976
UNSAT: 285d4803f7b54839300fb8f489288fbb-89a49698565f149a052a9b292cd497d0 'http://purl.obolibrary.org/obo/UBERON_0011528 in taxon http://purl.obolibrary.org/obo/NCBITaxon_9606'
UNSAT: FMA:16210 FMA:16210
UNSAT: 41af0ecb50dc8394a10ccbeeda6418f0-89a49698565f149a052a9b292cd497d0 'http://purl.obolibrary.org/obo/UBERON_0011511 in taxon http://purl.obolibrary.org/obo/NCBITaxon_9606'
UNSAT: EHDAA2:0001462 EHDAA2:0001462
UNSAT: 1e0e1fc224306e088b9c9b3b60730554-b3ac123c1b5a9db5375c10d2d6f1b8cf 'http://purl.obolibrary.org/obo/UBERON_0036215 in taxon http://purl.obolibrary.org/obo/NCBITaxon_4751'
UNSAT: b2366b8acc1f5595592b82ba4e43ceac-7ea78d70f8067136834749f3bd5885c2 'http://purl.obolibrary.org/obo/UBERON_0001250 in taxon http://purl.obolibrary.org/obo/NCBITaxon_7955'
NUMBER_OF_UNSATISFIABLE_CLASSES: 35
2023-06-20 12:14:15,675 ERROR (CommandRunner:2938) Ontology has unsat classes - will not proceed
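
A quick way to cross-check a report like this is to count the UNSAT lines and compare the total with the NUMBER_OF_UNSATISFIABLE_CLASSES summary. The following is a minimal Python sketch that assumes only the line format shown above; the function name `summarize_report` and the two-line sample are illustrative and not part of the pipeline.

```python
import re

def summarize_report(text):
    """Return (list of UNSAT identifiers, declared count or None)."""
    unsat_ids = []
    declared = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("UNSAT:"):
            # The first token after the prefix is the class identifier
            # (a CURIE, or a hash for generated "in taxon" classes).
            parts = line[len("UNSAT:"):].split()
            if parts:
                unsat_ids.append(parts[0])
        else:
            match = re.match(r"NUMBER_OF_UNSATISFIABLE_CLASSES:\s*(\d+)", line)
            if match:
                declared = int(match.group(1))
    return unsat_ids, declared

# Shortened sample in the same format as the report above.
sample = """\
UNSAT: ZFA:0005536 ZFA:0005536
UNSAT: FMA:54687 FMA:54687
NUMBER_OF_UNSATISFIABLE_CLASSES: 2
"""
ids, declared = summarize_report(sample)
print(ids, declared, len(ids) == declared)
```

Sorting the returned identifiers by prefix (FMA, EHDAA2, ZFA and so on) would also make it easier to see which bridge each violation comes from.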

@cmungall
Member

can you make explanations for the remaining violations?

My hunch is that the first ZFA fontanel one has to do with sutures.

@anitacaron
Collaborator Author

The explanations are here

@anitacaron
Collaborator Author

anitacaron commented Jul 20, 2023

I also changed the fix for the ZFA fontanel in my branch, and now we have 30 unsats.

Robot explanation

FMA_20229 SubClassOf Nothing

MA_0001471 SubClassOf Nothing

FMA_16210 SubClassOf Nothing

http://purl.obolibrary.org/obo/CL_2000062 in taxon http://purl.obolibrary.org/obo/NCBITaxon_9606 SubClassOf Nothing

EHDAA2_0001976 SubClassOf Nothing

FMA_83040 SubClassOf Nothing

http://purl.obolibrary.org/obo/CL_0000525 in taxon http://purl.obolibrary.org/obo/NCBITaxon_9606 SubClassOf Nothing

EHDAA2_0000259 SubClassOf Nothing

FMA_19088 SubClassOf Nothing

EHDAA2_0001973 SubClassOf Nothing

FMA_59819 SubClassOf Nothing

FMA_85540 SubClassOf Nothing

FMA_19090 SubClassOf Nothing

EMAPA_36617 SubClassOf Nothing

FMA_54756 SubClassOf Nothing

FMA_19092 SubClassOf Nothing

@anitacaron
Collaborator Author

anitacaron commented Jul 20, 2023

reports/taxon-constraint-check.txt content

UNSAT: FMA:54687 FMA:54687
UNSAT: c1d7e59879f80e2a1a6e98ac659b47d9-89a49698565f149a052a9b292cd497d0 'http://purl.obolibrary.org/obo/UBERON_0011532 in taxon http://purl.obolibrary.org/obo/NCBITaxon_9606'
UNSAT: EHDAA2:0000259 EHDAA2:0000259
UNSAT: FMA:20229 FMA:20229
UNSAT: FMA:19088 FMA:19088
UNSAT: e2da7a06466e71bdee4f659f2b7f3658-89a49698565f149a052a9b292cd497d0 'http://purl.obolibrary.org/obo/UBERON_0014850 in taxon http://purl.obolibrary.org/obo/NCBITaxon_9606'
UNSAT: aabc3239c18a36ded84a678b89a935b6-89a49698565f149a052a9b292cd497d0 'http://purl.obolibrary.org/obo/CL_0000525 in taxon http://purl.obolibrary.org/obo/NCBITaxon_9606'
UNSAT: a72db1d585113a0f45aa435db9af7978-89a49698565f149a052a9b292cd497d0 'http://purl.obolibrary.org/obo/CL_2000062 in taxon http://purl.obolibrary.org/obo/NCBITaxon_9606'
UNSAT: 16f4bce63046a0e7cf14833c53aef5fc-ec28385bb11de0acc278b537c5e1c1d5 'http://purl.obolibrary.org/obo/UBERON_0011094 in taxon http://purl.obolibrary.org/obo/NCBITaxon_7764'
UNSAT: FMA:83040 FMA:83040
UNSAT: FMA:59819 FMA:59819
UNSAT: FMA:85540 FMA:85540
UNSAT: c5220085f0640e5955e39f2f743df122-51e80262c3dfb609db0802da540b0449 'http://purl.obolibrary.org/obo/UBERON_0001074 in taxon http://purl.obolibrary.org/obo/NCBITaxon_7718'
UNSAT: FMA:19092 FMA:19092
UNSAT: FMA:19090 FMA:19090
UNSAT: FMA:54756 FMA:54756
UNSAT: HBA:4413 HBA:4413
UNSAT: MA:0001471 MA:0001471
UNSAT: EHDAA2:0001973 EHDAA2:0001973
UNSAT: EMAPA:36617 EMAPA:36617
UNSAT: 595ea229938610d4a226aacaf0654449-48afe725db6e2f9c8d251393e707d06e 'http://purl.obolibrary.org/obo/UBERON_0008258 in taxon http://purl.obolibrary.org/obo/NCBITaxon_6073'
UNSAT: EHDAA2:0003239 EHDAA2:0003239
UNSAT: bd0266432f93283879782192b3bf3dd4-89a49698565f149a052a9b292cd497d0 'http://purl.obolibrary.org/obo/UBERON_0001350 in taxon http://purl.obolibrary.org/obo/NCBITaxon_9606'
UNSAT: EHDAA2:0001976 EHDAA2:0001976
UNSAT: 285d4803f7b54839300fb8f489288fbb-89a49698565f149a052a9b292cd497d0 'http://purl.obolibrary.org/obo/UBERON_0011528 in taxon http://purl.obolibrary.org/obo/NCBITaxon_9606'
UNSAT: FMA:16210 FMA:16210
UNSAT: 41af0ecb50dc8394a10ccbeeda6418f0-89a49698565f149a052a9b292cd497d0 'http://purl.obolibrary.org/obo/UBERON_0011511 in taxon http://purl.obolibrary.org/obo/NCBITaxon_9606'
UNSAT: EHDAA2:0001462 EHDAA2:0001462
UNSAT: 1e0e1fc224306e088b9c9b3b60730554-b3ac123c1b5a9db5375c10d2d6f1b8cf 'http://purl.obolibrary.org/obo/UBERON_0036215 in taxon http://purl.obolibrary.org/obo/NCBITaxon_4751'
UNSAT: b2366b8acc1f5595592b82ba4e43ceac-7ea78d70f8067136834749f3bd5885c2 'http://purl.obolibrary.org/obo/UBERON_0001250 in taxon http://purl.obolibrary.org/obo/NCBITaxon_7955'
NUMBER_OF_UNSATISFIABLE_CLASSES: 30

@cmungall
Member

cmungall commented Aug 3, 2023

We need to figure a way to work through these more efficiently

The first example is interesting - we could go down the rabbit hole of what constitutes a tail vs a vestigial tail vs an embryonic bud that fails to develop in human... but we should make fast decisions. I would just weaken the equivalence to FMA to a closeMatch. Mapping to non-human parts of FMA has very little value.

Collaborator

@gouttegd gouttegd left a comment

See comment below. I don’t understand why we don’t import taxslim-disjoint-over-in-taxon.owl directly instead of importing taxslim.owl and then injecting the missing disjointness axioms at the beginning of the pipeline in a somewhat sneaky way.

src/ontology/uberon.Makefile
Fix the conflicts in uberon.Makefile caused by the overhaul of that
Makefile (which occurred while this PR was under way).
@gouttegd
Collaborator

gouttegd commented Sep 6, 2023

Fixed the merge conflict, the branch now applies cleanly to master.

There are still some unsatisfiable classes in the taxon constraint check; full explanations are attached. At least one of them is caused by 'arthropod tibia' being part of some 'insect leg'. I have not looked at the others yet.

@cmungall
Member

cmungall commented Sep 13, 2023 via email

@gouttegd
Collaborator

Either we should have another term to represent more generic cortex-like structures beyond mammals, or we should remove (or relax) the mammalian restriction on neocortex.

@gouttegd
Collaborator

HBA:4413 is “lateral nucleus of the pulvinar, left”, a regional part of brain. The mapping with Uberon’s “posterior lateral line” is clearly bogus and should be removed.

@gouttegd
Collaborator

The notochord is present in all chordates by definition, so any chain of axioms that restricts its existence to vertebrates is wrong.

The problematic link is here, I believe:

axial skeletal system has part some axial skeleton plus cranial skeleton

The latter term is textually defined as the “subdivision of skeleton which consists of cranial skeleton, set of all vertebrae, set of all ribs and sternum” (emphasis mine). That definition makes it vertebrate-specific, so the (non-vertebrate-specific) axial skeletal system should not have as part something that is vertebrate-specific.
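
The way a taxon restriction travels along such a chain can be sketched in Python as follows. This is only an illustration: the link from notochord to axial skeletal system and the explicit Vertebrata restriction are assumptions made for the example (the actual chain is in the reasoner explanation), and `inherited_restrictions` is a hypothetical helper, not part of the pipeline.

```python
# Existential links ("X part of some Y", "X has part some Y") that force
# both ends to exist in the same organism. Hypothetical toy data.
EDGES = {
    "notochord": ["axial skeletal system"],  # assumed link
    "axial skeletal system": ["axial skeleton plus cranial skeleton"],
}

# Terms carrying an only_in_taxon restriction (assumed from the definition).
ONLY_IN = {"axial skeleton plus cranial skeleton": "Vertebrata"}

def inherited_restrictions(term, seen=None):
    """Collect every only_in_taxon restriction reachable from `term`."""
    seen = set() if seen is None else seen
    if term in seen:
        return set()
    seen.add(term)
    found = set()
    if term in ONLY_IN:
        found.add(ONLY_IN[term])
    for linked in EDGES.get(term, []):
        found |= inherited_restrictions(linked, seen)
    return found

print(sorted(inherited_restrictions("notochord")))  # ['Vertebrata']
```

Removing the second entry of `EDGES` in this toy model leaves the notochord with no inherited restriction, which is the effect the proposed fix aims for.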

@gouttegd
Collaborator

Not sure here. Either

@gouttegd
Collaborator

AEO’s organism surfacef (sic) is mapped (and therefore, EquivalentTo) to both anatomical surface region and integumental system, making the two Uberon terms equivalent by transitivity.

The mapping with integumental system is clearly bogus and should be removed.
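
Mappings of this kind can be spotted mechanically by looking for a foreign term that is bridged to more than one Uberon term. A minimal Python sketch follows, assuming the bridge is available as (foreign term, Uberon term) pairs; the function name `collapsed_terms` and the label-based pairs are illustrative only.

```python
from collections import defaultdict

def collapsed_terms(mappings):
    """Return foreign terms mapped to several local terms.

    Because each mapping is treated as an equivalence, all local terms
    listed for one foreign term become equivalent to each other.
    """
    by_foreign = defaultdict(set)
    for foreign, local in mappings:
        by_foreign[foreign].add(local)
    return {f: sorted(t) for f, t in by_foreign.items() if len(t) > 1}

# Labels are used instead of identifiers; the third pair is made up.
bridge = [
    ("AEO organism surface", "anatomical surface region"),
    ("AEO organism surface", "integumental system"),
    ("AEO other term", "some other Uberon term"),
]
print(collapsed_terms(bridge))
```

This only catches the direct case described above; longer chains across several bridges would need a proper equivalence-closure computation.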

@gouttegd
Collaborator

The website given as reference for the existence of the basement membrane of epithelium in homoscleromorphs clearly states that “the homoscleromorphs are unique among sponges in having a basement membrane and, therefore, a true epithelium.” On that basis, the taxon constraint that restricts epithelium to Eumetazoa should be relaxed.

@gouttegd
Collaborator

A “terminology note” on skull explains that

A skull that is missing a mandible is only a cranium; this is the source of a very commonly made error in terminology.

Assuming this note is correct and that Uberon follows that distinction between “skull” and “cranium” (where “skull” = “cranium” + “mandible”), it is then an error to say that hagfishes have a “skull” – they should be said to have a “cranium” instead. So the “present in taxon some Myxinidae” axiom above should be on cranium, not on skull.

@gouttegd
Collaborator

Judging from this abstract, the taxon constraint can be relaxed to all cartilaginous fishes (Chondrichthyes), not just Elasmobranchii.

@cmungall
Member

cmungall commented Sep 28, 2023 via email

@gouttegd
Collaborator

This is a simple mapping error. In Uberon, lateral tuberal nucleus is explicitly intended to refer to a structure in human and higher primates; there is a distinct term intended to refer to a similar structure in rodents, tuberal nucleus (sensu Rodentia), where a comment explains:

While lateral tuberal nucleus is often used interchangeably with tuberal nucleus in rodents, lateral tuberal nucleus is specific to humans and higher primates. In some cases, lateral tuberal nucleus is thought to be part of the tuberal nucleus in rodents, and could possibly be homologous to human tuberal region instead.

Of note, this erroneous mapping comes from one of the custom bridges we have with the Allen Mouse Brain Atlas, which are maintained elsewhere and simply used as they come in Uberon. So the fix needs to happen in the remote source.

@gouttegd
Collaborator

I am not sure what this “oral subdivision or organism” is. The definition says

A major subdivision of an organism that is the entire side of the organism oral to the plane that divides the organism in two, perpendicular to the oral-aboral axiom.

but I don’t understand what “oral to the plane” means.

But I think a possible fix is to remove this:

mouth overlaps some respiratory system

mouth is seemingly intended to apply broadly (it is taxon-constrained only to Eumetazoa), so it seems wrong to assert that it necessarily overlaps with (and thus implies the existence of) a respiratory system.

@gouttegd
Collaborator

A self-contained one:

If that structure does exist in salamanders (the axiom asserting that references this book, which I cannot check: https://isbnsearch.org/isbn/0471888893), then the taxon constraint should be relaxed to one level higher (Tetrapoda), so as to cover both Amniota and Amphibia.
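
Picking the taxon to relax to amounts to finding the lowest common ancestor of the currently allowed taxon and the taxon where the structure is also reported. A minimal Python sketch, using a hand-written and heavily simplified fragment of the taxonomy (the `PARENT` table and both helper names are assumptions for illustration):

```python
# Simplified taxonomy fragment: child -> parent.
PARENT = {
    "Caudata": "Amphibia",      # salamanders
    "Amphibia": "Tetrapoda",
    "Amniota": "Tetrapoda",
    "Tetrapoda": "Vertebrata",
}

def lineage(taxon):
    """Return the taxon followed by its ancestors, nearest first."""
    path = [taxon]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path

def lowest_common_ancestor(a, b):
    """Return the nearest taxon that contains both `a` and `b`."""
    candidates = set(lineage(a))
    for taxon in lineage(b):
        if taxon in candidates:
            return taxon
    return None

print(lowest_common_ancestor("Amniota", "Caudata"))  # Tetrapoda
```

In the real NCBI taxonomy there are intermediate ranks between these groups, so the result should be checked against taxslim before the constraint is edited.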

@gouttegd
Collaborator

Those were all the unsats detected by the new check enforcing taxon constraints. I’ll start fixing them in the second half of October. Of course if anyone wants to start before that, you’re more than welcome to do so.

Note that one fix must happen in CL, and another one must happen in ABA_Uberon (where the custom bridge between Uberon and the Mouse Brain Atlas is maintained). All other unsats are fixable directly within Uberon.

@anitacaron
Collaborator Author

Thank you so much for all the explanations, @gouttegd. I'll create some tickets from your comments that might be related to HuBMAP and assign them to @aleixpuigb.

@balhoff
Member

balhoff commented Sep 29, 2023

I am not sure what this “oral subdivision or organism” is. The definition says

A major subdivision of an organism that is the entire side of the organism oral to the plane that divides the organism in two, perpendicular to the oral-aboral axiom.

but I don’t understand what “oral to the plane” means.

@gouttegd oral–aboral is a body axis used in various invertebrate groups that don't have bilateral symmetry, such as echinoderms. See https://www.ebi.ac.uk/ols4/ontologies/bspo/classes/http%253A%252F%252Fpurl.obolibrary.org%252Fobo%252FBSPO_0000198

I think “oral subdivision or organism” should be “oral subdivision of organism”.

@gouttegd gouttegd mentioned this pull request Oct 12, 2023
2 tasks
@gouttegd
Collaborator

At the Uberon call on 16 October 2023, it was decided that it would be better if the disjointness axioms between taxa were actually included in the released artefacts, instead of being only present during the QC pipeline to enforce taxon constraints. The recommendation is therefore to make a SLME module out of the taxslim-disjoint-over-in-taxon.owl file and merge it like any other import during the main pipeline. This should ensure that all the disjointness axioms required for the taxon constraint check are always present (during QC and in the released artefacts), without increasing the size of the released artefacts too much.

The straightforward way to do that would be to edit the ODK configuration so that we import http://purl.obolibrary.org/obo/ncbitaxon/subsets/taxslim-disjoint-over-in-taxon.owl instead of http://purl.obolibrary.org/obo/ncbitaxon/subsets/taxslim.owl. Then everything would be done by the standard ODK import pipeline, which is already configured to perform a SLME module extraction.

Except that I tested doing just that, and it does not work. :( What’s extracted from taxslim-disjoint-over-in-taxon.owl is not enough for the taxon constraint check to detect all violations.

I am not sure why, but I suspect this is because the various macros that expand to in_taxon relationships are not expanded at the time the import seed is generated, so not all taxa required for the taxon constraint checks are listed in that seed and the extracted module is therefore incomplete.
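
The suspected mechanism can be pictured with a toy model in Python. This is only a sketch of the hypothesis above and not a description of how the ODK actually builds its import seed; the data model, the term names and the function `seed_taxa` are all made up for illustration.

```python
# Each toy entry records a taxon reference and whether it is already a
# logical in_taxon axiom or still an unexpanded macro annotation.
AXIOMS = [
    {"term": "term A", "taxon": "NCBITaxon_9606", "expanded": True},
    {"term": "term B", "taxon": "NCBITaxon_7955", "expanded": False},
]

def seed_taxa(axioms, expand_macros):
    """Collect the taxa that would be listed in the import seed."""
    return {a["taxon"] for a in axioms if a["expanded"] or expand_macros}

before = seed_taxa(AXIOMS, expand_macros=False)
after = seed_taxa(AXIOMS, expand_macros=True)

# Taxa that the extracted module would lack under this hypothesis.
print(sorted(after - before))  # ['NCBITaxon_7955']
```

If the hypothesis holds, generating the seed only after macro expansion (or adding the missing taxa to the seed by hand) should make the extracted module complete.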

Not sure yet how to best fix this. Clearly we can’t use the standard, ODK-generated import pipeline; we need an ad-hoc rule specifically for the taxslim-disjoint-over-in-taxon.owl “import”.

I am not keen on delaying this PR until we come up with a satisfactory way of ensuring both that the taxon constraint check works and that the disjointness axioms are present in released artefacts. I suggest that for now we proceed with what @anitacaron has done (merging the disjointness axioms only during the QC pipeline) so that we can at least have a working taxon constraint check – which will already be a strict improvement over the current situation. Later, in a distinct PR, we can try to improve further by making sure the disjointness axioms are also included in release products.

@gouttegd
Collaborator

gouttegd commented Oct 26, 2023

Regarding this unsat:

Fixing it by moving the “present in taxon some Myxinidae” axiom to cranium, as proposed in the linked comment, is not enough, because cranium

Since cranium exists throughout all craniates, it should not be said to be part of some skull which is (through skeleton of lower jaw) specific to jawed vertebrates.

@gouttegd
Collaborator

Regarding this one:

Even if we relax the taxon constraint on dermatocranium as suggested, there is still another path to unsatisfiability, because external naris

For this one I don’t have a fix that I would be really happy with. I am considering removing

nasal skeleton SubClassOf: part of some facial skeleton

to break the link that ultimately makes the nose specific to jawed vertebrates, which seems to me the “least bad” fix.

@gouttegd
Collaborator

This one also has another path to unsatisfiability, even after removing the link between mouth and respiratory system. That’s because oral subdivision of organism

A homology note on mouth states that

Molecular and developmental cell lineage data suggest that the acoel mouth opening is homologous to the mouth of protostomes and deuterostomes and that the last common ancestor of the Bilateria (the 'urbilaterian') had only this single digestive opening. [well established]

which suggests to me mouth should not be part of some head, since the head only exists in Bilateria.

Member

@cmungall cmungall left a comment

This is great, I approve of both ontology changes and all pipeline changes.

I agree with @gouttegd's course of action on 1 vs 2. Keeping the original inference as a taxon GCI also makes it easy to review all of these en masse at some future date.

@gouttegd
Collaborator

Superseded by #3102, which has now been merged.

@gouttegd gouttegd closed this Nov 10, 2023
@anitacaron anitacaron deleted the anitacaron/issue2707 branch January 24, 2024 10:27
Development

Successfully merging this pull request may close these issues.

Taxon disjointness unsat not showing up in pipeline Add present_in_taxon into QC
4 participants