Using Rhea to populate logical definitions of reactions #14984
Comments
I looked into this some time ago and creating the logical defs is not as simple as it looks on the surface. However, I do think that we could assert participant relations. Along with the directional issues, there is also stoichiometry to consider to make necessary and sufficient definitions. We don't capture stoichiometry in GO. Most MF terms are defined as being bidirectional. I think our best bet would be to get the directionality from GO-CAM models. There is also an issue with how we have defined processes such as catabolism and biosynthesis. Both have inputs and outputs as differentia. This causes problems because the things being catabolized aren't the only inputs to the process and the things being made aren't the only outputs. See #11779 |
Re: stoichiometry. We actually do handle stoichiometry, in that it figures into the (currently textual) definition. I proposed a way of handling the stoichiometry in the OWL and axiomatizing RHEA at the last Bar Harbor meeting: The assumption then was that we would make the GO classes equivalent to the bidirectional form, but we can revisit that |
Dealing with charge states - RHEA uses the ChEBI instance that is predominant at pH 7.2, GO is indifferent - will also require some sort of mapping, but this is essentially the mapping already done to align ChEBI with GO, not a new one. |
And consistent with MetaCyc. |
pH 7.3 ;-) |
@amorgat, could I get a copy of that mapping? It's time to align Reactome better with RHEA and this would let us do the work efficiently and increase the odds that we get the chemicals right. |
Just a few precisions: |
@amorgat Talking about mapping to RHEA, we also need to map to it from the Complex Portal, on the to-do list for later this year (?). We have EC numbers where available, and of course UniProt ACs. Would that help with mapping? If you want to discuss off-ticket, email me on bmeldal @ ebi. ac. uk :) |
Hi Birgit,
To the extent that you’re importing information about complexes from us, we should co-ordinate this – it would be easy on our side to make clean-up of small molecules relevant to you a priority.
Peter
(In reply to Birgit Meldal's message of Wednesday, January 31, 2018, 9:56 AM.)
|
@deustp01 We'd have to map to RHEA when a complex is catalyst for a certain reaction. It's less about the small molecules than the proteins. But happy to co-ordinate if the mapping suits both! |
Should we try to make this a discussion at the GOC meeting in May? |
Sandra will be there for the CP (as well as UniProt, of course!), I have training responsibilities here that week. |
I’ll be at the GO meeting in NYC and would be happy to discuss on behalf of the Rhea team (and UniProt). (As an aside, we have used directional reactions for the construction and annotation of www.swisslipids.org). |
I've added it to the GOC meeting agenda on the wiki. |
Hi, yes happy to represent the Complex Portal (and also UniProt) in these discussions. IntAct is also looking at directionality and working with SIGNOR (https://signor.uniroma2.it/) on an export for this. |
We're going to discuss this on the @geneontology/ontology call on Monday |
Editor's discussion:
PLAN
|
If anyone is looking at this, it's worth noting that the expanded CHEBI import did not include the General class axioms for the new additional CHEBI terms. Hence, e.g., |
Isn't that part of the make_file? We can talk about it on Monday, but it makes sense to me to go ahead and run the ChEBI import with the additional terms we will need for the Rhea defs and if it all looks ok go ahead and merge that into master. It won't hurt anything to have the additional ChEBI classes I don't think. It's one more step we can do to get concrete progress along the way. |
Yes, I assume it is part of the make file and ought to work fine when built that way. I haven't got set up to do the complete build locally (todo list..) and thus merged the Robot-generated chebi extract manually into Protege - thus missing the generation of those axioms. It probably won't make a lot of difference to our discussion, just may end up missing a few more inferences.
(In reply to David Hill's comment of Fri, Jun 29, 2018, 12:02 PM.)
|
Here is one more incarnation to consider for the discussion on Monday. If we walk all the way back to definitions that look like this (for adenylate cyclase activity): Then things start looking interesting.
This does not take the concept of bidirectionality into account; it is just one of the possible directions, but it seems to behave mostly the way we want it to, is much easier to look at, and fits in better with the rest of the ontology. It may be worth considering taking these directional structures (as they are laid out by default in Rhea) as a starting point (that, though potentially incomplete, is not wrong) and then filling in additional classes for the other directions as they are needed. |
Summarizing status here (and adding some documentation of things happening off GitHub). We are still faced with the challenge of how best to add axioms to define the classes under Catalytic Activity. The principal challenge remains the bidirectional nature of most chemical reactions. This leads naturally to logical constructs that use an OR statement to join the left-to-right direction with the right-to-left direction. Unfortunately, though sound, this family of definitions does not work with any reasoner that can classify the whole GO. See thread on the ELK reasoner repo about this. For the problem of inferring the class hierarchy, we have two representations on the table, both of which generate some subclass inferences people aren't sure about. See the spreadsheet where these are laid out for the two formulations. The two formulations are
For the problem of inferring the classifications for instances, @cmungall suggested a pattern that works nicely using General Concept Inclusion (GCI) axioms. (This does not, however, influence the problem of class hierarchy inference.) As an example, consider the class ‘phosphoglycerate mutase activity’, which has the textual definition: “Catalysis of the reaction: 2-phospho-D-glycerate = 3-phospho-D-glycerate”. We add the following GCI axiom (and its reverse direction by switching inputs and outputs): 'catalytic activity'
Now, when ingesting, for example, the Gluconeogenesis pathway from Reactome, its component reaction ‘2-Phospho-D-glycerate <=> 3-Phospho-D-glycerate’ is correctly and automatically identified as an instance of the GO class ‘phosphoglycerate mutase activity’ based on its inputs and outputs. This example recapitulates a manually assigned GO term from Reactome.
Testing with all 11542 reactions imported from Reactome into GO-CAMs (May 2018), these axioms allow for the automatic classification of 2339 (20%). This is an increase from 794 when using the previous GOPlus without the new GCI axioms. 287 of the classifications are exact recapitulations of manual annotations; the remainder are potential new annotations that should be verified. Note that they may be sub- or superclasses of existing annotations; only exact matches are currently tested for. Examples of terms used in exact recapitulations:
Examples of new terms used for potential new annotations:
Note that the GCI definitions require the presence of an assertion to type Catalytic Activity. These are not present in the Reactome data. To produce the above statistics, I used the rule ‘if the reaction has inputs {A, B, ..} and outputs {C, D, ..}, and at least one A is a CHEBI term and one C is a CHEBI term and {A, B, ..} is not equal to {C, D, ..}, then add Catalytic Activity’. For more information about the impact of the GCI axioms on instance classifications from the Reactome import see: To see all the GCI axioms brought in for terms xrefed to RHEA, see GO_Just_GCI_test.ttl.zip (Also has new complete version of chebi_import merged) |
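The typing rule quoted above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: participants are represented as plain identifier strings, a CHEBI term is recognized by a `CHEBI:` prefix, and the function name and the placeholder identifiers are hypothetical, not taken from the actual import code.

```python
def is_chebi(term_id):
    """Assumption: CHEBI terms are identified by a 'CHEBI:' CURIE prefix."""
    return term_id.startswith("CHEBI:")


def should_add_catalytic_activity(inputs, outputs):
    """Apply the rule from the comment above.

    Add the Catalytic Activity type when at least one input is a CHEBI term,
    at least one output is a CHEBI term, and the input set differs from the
    output set.
    """
    input_set, output_set = set(inputs), set(outputs)
    has_chebi_input = any(is_chebi(t) for t in input_set)
    has_chebi_output = any(is_chebi(t) for t in output_set)
    return has_chebi_input and has_chebi_output and input_set != output_set


# Placeholder identifiers, for illustration only.
print(should_add_catalytic_activity(["CHEBI:A"], ["CHEBI:B"]))
print(should_add_catalytic_activity(["CHEBI:A"], ["CHEBI:A"]))
```

Comparing sets rather than lists means that participant order and duplicates do not affect the outcome, which seems to match the set notation used in the rule.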
@ukemi and all files needed for review, including the GCI inferences, have been added into worksheets in this google spreadsheet (same referenced above) |
Thanks @goodb. To recap yesterday's discussion with our plan of action. @deustp01 , @hdrabkin and @ukemi will begin with a sanity check of the spreadsheets as a first-pass of the new inferences. We will indicate on the spreadsheet which inferences are correct and which are questionable. Once we have done a pass through of the spreadsheet, we will look at the reasoning behind the questionable inferences and try to determine what is causing the questionable inferences. By the time you return we should be able to provide you with a report of which methods we think are best and whether or not any tweaking is needed. As we meet, we will add our findings to this issue. |
Next steps for Rhea - GO - Reactome roundtrip
To do this, can we build a table that, for each Reactome reaction (strictly, reactionlike event) that has a catalyst activity attribute, lists its identifier, the GO molecular function term extracted from the catalyst activity attribute, and the Rhea master reaction cross-referenced to that GO molecular function term? Are there legitimate reasons for any Reactome-to-GO or GO-to-Rhea mappings to be other than one to one?
With that table, we will be able to retrieve the lists of molecules associated with the Rhea and Reactome versions of the event and their stoichiometries, align them, and identify discrepancies. Will the tables already constructed by GO (GOCHE?) be useful here for making alignments when Rhea and Reactome disagree on charge state or stereochemistry?
Can the pathway context of each Reactome reaction be used to identify the direction of that reaction when the reaction is part of the corresponding biological process? |
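As a rough sketch of the alignment step described above, the comparison of stoichiometries could look like the following. It assumes each version of a reaction has already been reduced to a mapping from a shared molecule identifier to its coefficient (that is, charge-state and stereochemistry differences have already been reconciled); the function name and the identifiers are hypothetical.

```python
def stoichiometry_discrepancies(rhea, reactome):
    """Compare two {molecule_id: coefficient} mappings.

    Returns {molecule_id: (rhea_coefficient, reactome_coefficient)} for every
    molecule whose coefficients differ; None marks a molecule that is absent
    from one of the two versions.
    """
    discrepancies = {}
    for molecule in sorted(set(rhea) | set(reactome)):
        rhea_coeff = rhea.get(molecule)
        reactome_coeff = reactome.get(molecule)
        if rhea_coeff != reactome_coeff:
            discrepancies[molecule] = (rhea_coeff, reactome_coeff)
    return discrepancies


# Placeholder data: water is missing on the Reactome side and the
# coefficient of CHEBI:B differs.
rhea_side = {"CHEBI:A": 1, "CHEBI:B": 1, "CHEBI:water": 1}
reactome_side = {"CHEBI:A": 1, "CHEBI:B": 2}
print(stoichiometry_discrepancies(rhea_side, reactome_side))
```

Missing water or protons show up as entries with `None` on one side, which is the kind of discrepancy the roundtrip is meant to surface.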
Reactome-to-GO can be many to one, since each reactome ID represents an instance. However, it should not be one to many.
GO-to-RHEA should be one-to-one. We should implement a check for this. There may be some leaf GO MFs that lack a RHEA. We can request these. There will be some RHEAs that have no equivalent in GO. We can add these manually for now on an as-needed basis, but we will later implement a system where this is semi-automated.
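A check of this kind could be sketched as follows. The sketch assumes the mappings are available as (source, target) pairs, for example parsed from xrefs; the function names and identifiers are hypothetical placeholders.

```python
from collections import defaultdict


def one_to_many(pairs):
    """Return {source: sorted targets} for every source with more than one target."""
    targets = defaultdict(set)
    for source, target in pairs:
        targets[source].add(target)
    return {s: sorted(t) for s, t in targets.items() if len(t) > 1}


def one_to_one_violations(pairs):
    """Return (sources with several targets, targets with several sources)."""
    pairs = list(pairs)
    forward = one_to_many(pairs)
    backward = one_to_many((target, source) for source, target in pairs)
    return forward, backward


# Placeholder GO-to-RHEA pairs, for illustration only.
go_to_rhea = [
    ("GO:A", "RHEA:1"),
    ("GO:B", "RHEA:2"),
    ("GO:B", "RHEA:3"),
    ("GO:C", "RHEA:1"),
]
print(one_to_one_violations(go_to_rhea))
```

For Reactome-to-GO, where many to one is acceptable, only the first element of the returned pair (one Reactome ID mapped to several GO terms) would count as an error.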
…On Wed, Apr 24, 2019 at 8:47 AM deustp01 ***@***.***> wrote:
Next steps for Rhea - GO- Reactome roundtrip
Each Reactome reaction instance that has a catalyst (or transporter)
activity attribute maps to a single GO molecular function term, and from
that term to a set of the four Rhea reactions that represent the four
possible directions of the molecular transformation enabled by the GO
activity. Those mappings can be used to find discrepancies in
stoichiometry, participation of water and protons, and ionization states
and stereochemistries, between the Rhea and Reactome versions of reactions,
in a form that should allow the Reactome reactions to be edited to conform
to their Rhea counterparts with minimal manual intervention.
|
Resolution is to use the simpler construct that uses ObjectUnionOf (forward, backward) that inspired this request to the ELK team liveontologies/elk-reasoner#54 . @balhoff has a solution. |
Axioms to be added to a distinct file go-mf-defs.owl that will be imported into go-plus. |
@cmungall if we go with the union-based approach, is there any reason we need the intermediate "substance sets/bags" for inputs and outputs? Am I forgetting something? I think this would work:
|
I think this works
…On Wed, May 6, 2020 at 12:59 PM Jim Balhoff ***@***.***> wrote:
@cmungall <https://github.com/cmungall> if we go with the union-based
approach, is there any reason we need the intermediate "substance
sets/bags" for inputs and outputs? Am I forgetting something? I think this
would work:
(
  (catalytic_activity
    and (has_input some (CHEBI_1 and has_stoich value "2"))
    and (has_input some (CHEBI_2 and has_stoich value "1"))
    and (has_output some (CHEBI_3 and has_stoich value "2"))
    and (has_output some (CHEBI_4 and has_stoich value "1")))
  or
  (catalytic_activity
    and (has_output some (CHEBI_1 and has_stoich value "2"))
    and (has_output some (CHEBI_2 and has_stoich value "1"))
    and (has_input some (CHEBI_3 and has_stoich value "2"))
    and (has_input some (CHEBI_4 and has_stoich value "1")))
)
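For illustration, an expression of this shape could be generated from a reaction's two sides with a small helper. This is a sketch only: the property and class names (`catalytic_activity`, `has_input`, `has_output`, `has_stoich`) are copied from the expression above, while the function names and the representation of a side as (CHEBI id, coefficient) pairs are assumptions made here.

```python
def side(prop, participants):
    """Render one 'some' restriction per (chebi_id, coefficient) pair."""
    return [
        f'({prop} some ({chebi} and has_stoich value "{coeff}"))'
        for chebi, coeff in participants
    ]


def directed(left, right):
    """One direction: left participants are inputs, right participants are outputs."""
    parts = ["catalytic_activity"]
    parts += side("has_input", left)
    parts += side("has_output", right)
    return "(" + " and ".join(parts) + ")"


def undirected(left, right):
    """Union of both directions, as in the expression above."""
    return "(" + directed(left, right) + " or " + directed(right, left) + ")"


left = [("CHEBI_1", 2), ("CHEBI_2", 1)]
right = [("CHEBI_3", 2), ("CHEBI_4", 1)]
print(undirected(left, right))
```

The reverse branch lists its conjuncts in a different order from the hand-written version, which makes no logical difference since conjunction is order-independent.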
|
If I understand correctly, input/output is equivalent to substrate/product, right? i.e a directed reaction. |
@amorgat the goal of the definition above is to capture the meaning of the undirected reaction - the union groups both directions into one class. I believe the intent is to limit the mapping to the parent undirected reaction from rhea. |
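The effect of the union can be illustrated with a toy membership test: a directed reaction instance satisfies the undirected definition if it matches either orientation. The sketch below is a simplification and not how a reasoner works: it compares participant names as plain strings, ignores stoichiometry and the ChEBI hierarchy, and uses hypothetical function names. Required participants are tested as subsets because an existential restriction only demands that those participants are present.

```python
def matches_directed(required_inputs, required_outputs, inputs, outputs):
    """True if the instance has at least the required inputs and outputs."""
    return set(required_inputs) <= set(inputs) and set(required_outputs) <= set(outputs)


def matches_undirected(left, right, inputs, outputs):
    """True if the instance matches the definition in either direction."""
    return (
        matches_directed(left, right, inputs, outputs)
        or matches_directed(right, left, inputs, outputs)
    )


# The phosphoglycerate mutase example from earlier in the thread.
left = ["2-phospho-D-glycerate"]
right = ["3-phospho-D-glycerate"]

print(matches_undirected(left, right, ["2-phospho-D-glycerate"], ["3-phospho-D-glycerate"]))
print(matches_undirected(left, right, ["3-phospho-D-glycerate"], ["2-phospho-D-glycerate"]))
```

Both orientations of the instance fall under the same class, which is the sense in which the definition is undirected.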
Current status:
I think this is sufficient. Adding logical definitions for grouping reactions is outside what can be done in OWL |
Hello,
Discussing with @amorgat about how Rhea and GO represent biochemical reactions, here are a few points to consider:
I am putting this here for discussion - @ukemi I'll let you decide how soon we need to discuss this.
Thanks, Pascale