Can biological and technical replicates be used for environmental metagenome samples? #1693

Open
cmungall opened this issue May 4, 2023 · 11 comments

@cmungall
Contributor

cmungall commented May 4, 2023

We are looking into using replicate roles for samples derived from environments, e.g. a sample drawn from a lake with the intention of sequencing (or even a non-biological chemical analysis).

Looking at the labels used in the hierarchy it looks like this is wired for humans or at least whole organisms:

  • OBI:0000097 participant under investigation role - A role that is realized through the execution of a study design in which the bearer of the role participates and in which data about that bearer is collected.
    • OBI:0000220 reference subject role - a reference subject role which inheres in an organism or entity of organismal origin so that the characteristics or responses of the participant playing the reference participant role are used for comparison or reference
      • OBI:0000198 biological replicate role - a reference participant role realized by equivalent treatment of participants
      • OBI:0000249 technical replicate role - technical replicate role is realized when two portions from one evaluant are used in replicate runs of an assay

Although not explicitly forbidden, the language ("participant", "subject") seems exclusive of environmental samples.

However, if we look deeper at the example of usage in OBI:0000097, we see that in fact it is intended to be inclusive:

Lake example: a lake could realize this role in an investigation that assays pollution levels in samples of water taken from the lake.

It seems odd to call a lake a "participant"; not everyone speaks BFO.

But then the lake example is inconsistent with the definition on the direct parent (OBI:0000220), "a reference subject role which inheres in an organism or entity of organismal origin"

But then again I see SubClassOf axioms on all these classes that are somewhat more inclusive:

'inheres in' some 
    (organism or specimen)

This is inclusive enough for my use case (samples taken from lakes) but is still more restrictive than the example given for OBI:0000097 (unless entire lakes can be specimens?)
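
To make the mismatch concrete, here is a minimal sketch that treats each of the three sources of constraint as a set of admitted bearer kinds. The bearer-kind labels and the `admitted_by` helper are hypothetical stand-ins for illustration, not OBI classes, and the first set assumes the OBI:0000097 definition plus its lake example admit all three kinds.

```python
# Toy model of the three bearer restrictions discussed above.
# The kind labels are hypothetical stand-ins, not OBI classes.
RESTRICTIONS = {
    # Assumption: the OBI:0000097 definition plus its lake example admit all kinds.
    "OBI:0000097 definition and example": {"organism", "specimen", "environmental feature"},
    # "inheres in an organism or entity of organismal origin"
    "OBI:0000220 text definition": {"organism", "entity of organismal origin"},
    # 'inheres in' some (organism or specimen)
    "SubClassOf axiom": {"organism", "specimen"},
}


def admitted_by(bearer_kind):
    """Return the set of restriction names that admit this kind of bearer."""
    return {name for name, kinds in RESTRICTIONS.items() if bearer_kind in kinds}


# A water sample taken from a lake, modeled here as a specimen.
print(sorted(admitted_by("specimen")))
# The whole lake, modeled here as an environmental feature.
print(sorted(admitted_by("environmental feature")))
```

Under these assumptions the lake sample passes the axiom but not the text definition on OBI:0000220, and the whole lake passes only the OBI:0000097 reading.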

I propose one of two paths:

  1. Conservative: broaden the labels and definitions of parent and ancestor classes so they do not implicitly or explicitly exclude lakes etc.
  2. Move to a less abstract role hierarchy with more user-friendly groupings like "sample role".

As an aside (and feel free to ignore this part or move it to a separate issue), I find the level of abstraction in this hierarchy hard to mentally reason over:

  • [] OBI:0000097 ! participant under investigation role
    • [i] OBI:0002493 ! control role in case-control study
    • [i] OBI:0002492 ! case role in case-control study
    • [i] OBI:0000825 ! to be treated with placebo role
    • [i] OBI:0000813 ! to be treated with active ingredient role
    • [i] OBI:0000220 ! reference subject role
      • [i] OBI:0000249 ! technical replicate role
      • [i] OBI:0000198 ! biological replicate role
        • [i] OBI:0000252 ! cohort role
      • [i] OBI:0000161 ! crossover population role
      • [i] OBI:0000143 ! baseline participant role

Single is-a is an anti-pattern, and in this case I am having some issues thinking of cohort role as a subclass of biological replicate. I guess this is justified given the very broad definition of biological replicate in OBI. But I don't understand the use case for having such broad abstract inclusive concepts. What's the use case for grouping cohorts with replicates of an RNA-Seq sample? Is this a standard way of thinking about this?
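
As a rough illustration of what a user inherits when picking a term, the sketch below copies the is-a links from the listing above into a child-to-parent map and walks upward from cohort role. The `PARENT` and `LABEL` dictionaries and the `ancestors` helper are illustrative names only; nothing here is read from the OBI release itself.

```python
# Child -> parent links transcribed from the hierarchy listing above.
PARENT = {
    "OBI:0002493": "OBI:0000097",
    "OBI:0002492": "OBI:0000097",
    "OBI:0000825": "OBI:0000097",
    "OBI:0000813": "OBI:0000097",
    "OBI:0000220": "OBI:0000097",
    "OBI:0000249": "OBI:0000220",
    "OBI:0000198": "OBI:0000220",
    "OBI:0000252": "OBI:0000198",
    "OBI:0000161": "OBI:0000220",
    "OBI:0000143": "OBI:0000220",
}

# Labels for the terms on the path we print below.
LABEL = {
    "OBI:0000097": "participant under investigation role",
    "OBI:0000220": "reference subject role",
    "OBI:0000198": "biological replicate role",
    "OBI:0000252": "cohort role",
}


def ancestors(term):
    """Return the chain of is-a ancestors, nearest first."""
    chain = []
    while term in PARENT:
        term = PARENT[term]
        chain.append(term)
    return chain


for curie in ancestors("OBI:0000252"):
    print(curie, LABEL.get(curie, ""))
```

Every cohort is therefore also classified as a biological replicate and as a reference subject, which is the inference questioned above.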

I know it's probably annoying to have someone from outside the ontology come along and suggest wholesale rearrangements without knowing the whole history, but here is a rough sketch of the kind of grouping I mean:

  • role in an investigation
    • sample role
      • sample replicate role
        • OBI:0000249 ! technical replicate role
        • OBI:0000198 ! biological replicate role
      • ..
    • observational study role
      • OBI:0002493 ! control role in case-control study
      • OBI:0002492 ! case role in case-control study

cc @turbomam

@cmungall
Contributor Author

cmungall commented May 4, 2023

Also, if anyone has examples of using these classes in ABoxes, this would be very useful. I am also interested in axioms that help us with data modeling, e.g. should one sample be prohibited from having multiple biological replicate roles?
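
One way to picture the "multiple biological replicate roles" question on instance data is a simple check over role assertions. This is only a sketch: the `(sample, role instance, role class)` triple layout, the sample and role identifiers, and the function name are all hypothetical, and whether such a case should actually be prohibited is exactly the open question.

```python
from collections import defaultdict

BIOLOGICAL_REPLICATE_ROLE = "OBI:0000198"


def samples_with_multiple_roles(assertions, role_class=BIOLOGICAL_REPLICATE_ROLE):
    """Find samples bearing more than one role instance of the given class.

    `assertions` is an iterable of (sample_id, role_instance_id, role_class)
    triples; this layout is an assumption made for the example.
    """
    roles_by_sample = defaultdict(set)
    for sample_id, role_id, cls in assertions:
        if cls == role_class:
            roles_by_sample[sample_id].add(role_id)
    return {s: sorted(r) for s, r in roles_by_sample.items() if len(r) > 1}


# Hypothetical ABox-style data for two lake samples.
example = [
    ("sample:lake-A-1", "role:r1", "OBI:0000198"),
    ("sample:lake-A-1", "role:r2", "OBI:0000198"),
    ("sample:lake-A-2", "role:r3", "OBI:0000198"),
    ("sample:lake-A-2", "role:r4", "OBI:0000249"),
]
print(samples_with_multiple_roles(example))
```

A check like this flags candidates for review; expressing the same constraint in OWL would need a cardinality restriction, which is not attempted here.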

@bpeters42
Contributor

bpeters42 commented May 4, 2023 via email

@sebastianduesing
Contributor

Thank you, Chris. We discussed this on the OBI call today (05.08.23). We agree with your feedback about broadening OBI's terms beyond applications for humans/organisms. We are planning to use an OBI meeting in the future to discuss strategy and implementation for a fix of this issue. Can you and/or @turbomam attend a future OBI call about this?

See also: #1629

@sebastianduesing sebastianduesing self-assigned this May 8, 2023
@cmungall
Contributor Author

cmungall commented May 8, 2023 via email

@turbomam
Contributor

I'm in the OBI call right now!

@ddooley
Contributor

ddooley commented Nov 6, 2023

We're checking "reference subject role", which has two issues (axiom and label semantics) that would need resolution:
reference subject role
Term IRI: http://purl.obolibrary.org/obo/OBI_0000220
Definition: A reference subject role which inheres in an organism or entity of organismal origin so that the characteristics or responses of the participant playing the reference participant role are used for comparison or reference

  1. Definition needs some work.

A more general issue is whether OBI should shift language around "investigation participant" vs "investigation subject"

@ddooley
Contributor

ddooley commented Nov 6, 2023

For "reference subject role" I propose a tentative narrow rewrite:

Label: reference subject role
Definition: A participant under investigation role which inheres in a subject such that the subject's characteristics or related information are used for comparison or reference within an investigation.

That definition accommodates assays which are generating data about a subject's characteristics - material or organism - or are collecting other kinds of information, like surveys, or animal behaviours, with a choice of subject to consider as a "reference".

The reference subject role is likely borne by some entity which is representative of some material or population, but is not itself a target of an investigation's assay sampling, unless, say, the investigation is longitudinal and the reference subject is sampled over time. Right?

@sebastianduesing
Contributor

I support this rewrite, though with a (fairly minor) critique about the "such that the characteristics or related information are used for comparison or reference [...]" part of the definition: as you say, it's the data about the subject's characteristics that is used for comparison. Perhaps we could rephrase that portion to say
[...] such that data about the subject is used for comparison or reference [...]
I agree with you that it should be inclusive of other kinds of information, and to me, "characteristics or related information" sounds like the related information is related to characteristics—the upshot being I find it unclear exactly what sort of information qualifies. I lean towards leaving it open rather than enumerating options.
Altogether, then, it'd be:
A participant under investigation role which inheres in a subject such that data about the subject is used for comparison or reference within an investigation.
Thoughts?

@ddooley
Contributor

ddooley commented Nov 8, 2023

I like it! Right, it's a comparison of data about reference/subjects that we're after!

@cmungall
Contributor Author

cmungall commented Nov 8, 2023

Thanks for addressing this. I do think that if you eliminate some of the layers of abstraction here it might be faster to fix these issues, and easier for users to find and use the terms they need. I don't understand the use case for some of these abstract higher-level groupings; they don't seem to correspond to normal concepts in fields I'm familiar with. The more layers of abstraction, the more mental reasoning needs to be done to check consistency between levels, and the more room there is for subjective interpretations.

My advice would be to focus on concrete concepts, using standard definitions, and groupings that scientists are familiar with (e.g. "replicate" as a grouping for technical and biological replicates).

@bpeters42
Contributor

bpeters42 commented Nov 10, 2023 via email
