title | parent | layout | nav_order |
---|---|---|---|
Using the Modeling Language | Biolink Model Guidelines | default | 3 |
- Overview
- Inheritance Related Slots
- Identification, Descriptive and Indexing Related Slots
- Slots Relating to Class Composition
- Slots Relating to Constraints on Slot Composition
- Slots Relating Semantic Mappings and Anchoring to External Ontology
LinkML provides a variety of slots to define the semantics of your Biolink Model classes and slots.
This document explains how to use most of these slots in Biolink Model.
Please refer to the LinkML documentation for an exhaustive list of slots provided by the modeling language.
The is_a slot can be used to define a hierarchy for your Biolink Model class, mixin or slot, where a new class, mixin or slot is defined as a subclass of another defined class, mixin or slot.
gene:
  is_a: gene or gene product
Here we define that the entity class gene is a subclass of gene or gene product. Note that is_a has the characteristic of homeomorphicity: is_a SHOULD only connect (1) two mixins, (2) two classes, or (3) two slots.
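The way a tool might follow is_a links can be sketched in a few lines of Python. This is a minimal illustration, not part of LinkML or any Biolink toolkit: the `IS_A` dictionary is a hand-written excerpt built from is_a declarations quoted in this document, and the `ancestors` helper is a hypothetical name.

```python
# Hand-copied excerpt of is_a declarations that appear in this document.
# A real tool would read these from the Biolink Model YAML instead.
IS_A = {
    "genetically interacts with": "interacts with",
    "interacts with": "related to",
    "gene to gene homology association": "gene to gene association",
    "gene to gene association": "association",
}

def ancestors(name, is_a=IS_A):
    """Return the is_a ancestors of `name`, nearest parent first."""
    chain = []
    while name in is_a:
        name = is_a[name]
        chain.append(name)
    return chain

print(ancestors("genetically interacts with"))
# → ['interacts with', 'related to']
```

Because is_a only connects elements of the same kind, a slot's chain contains only slots and a class's chain contains only classes.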
A model class (or slot) may be declared abstract by setting its abstract slot to the boolean value true. This has a meaning comparable to that in object-oriented programming: another class (or slot) can use the abstract class (or slot) as part of its inheritance hierarchy, but the abstract class itself cannot be directly instantiated.
cell line to thing association:
  is_a: association
  defining_slots:
    - subject
  abstract: true
  description: >-
    A relationship between a cell line and another entity
  slot_usage:
    subject:
      range: cell line
Here we define that the association class cell line to thing association is an abstract class. In this case, the class simply constrains its subclasses to have a subject range of biolink:CellLine.
The mixin: true designation is used to extend the properties (or slots) of a class without changing its position in the class hierarchy. Mixins can be extremely helpful in a number of ways: 1) to generalize a set of attributes that can apply to classes in different parts of the class hierarchy, 2) to reduce duplication of shared attributes between classes that do not inherit from one another, and 3) to prevent the sometimes confusing nature of multiple inheritance noted in the [diamond problem](https://tinyurl.com/4zdw9tsb).
In general, mixin slots and classes should not be directly instantiated or used directly as a slot in a class, but knowledge graphs (KGs) can use them as a substitute for multiple inheritance. For example, a KG might need to determine the parents of a certain class. In that case, the KG should navigate a mixin used in the domain or range of a class or slot just as it would an "is_a" relationship.
thing with taxon:
  mixin: true
  description: >-
    A mixin that can be used on any entity with a taxon
  slots:
    - in taxon
Here we define the class thing with taxon as a mixin class with a slot in taxon.
molecular entity:
  is_a: biological entity
  mixins:
    - thing with taxon
    - physical essence
    - ontology class
  aliases: ['bioentity']
In the class molecular entity, we use the thing with taxon mixin to add the 'in taxon' attribute (slot) to the molecular entity class. The alternatives would be to duplicate the 'in taxon' attribute manually in every class, or to introduce a parent class that holds the in taxon slot (but that parent would have to be a sister class to 'named thing', since not all named things are taxon based). Mixins simplify modeling and should be used when necessary.
regulates:
  is_a: affects
  comments:
    - This is a grouping for process-process and entity-entity relations
  mixin: true
Here we define the slot regulates as a mixin slot. This slot can be used as a mixin by other slots.
Mixins can also be hierarchical. For example:
frequency qualifier mixin:
  mixin: true
  description: >-
    Qualifier for frequency type associations
  slots:
    - frequency qualifier
Here we define the mixin frequency qualifier mixin to hold the parent slot, frequency qualifier. The slot frequency qualifier is then inherited by every class in the subsequent is_a hierarchy of entity to feature or disease qualifiers mixin. The frequency quantifier mixin was created with similar intentions (favoring consistency in modeling similar domains), though its reuse is not as evident in the model yet.
Mixins provide the means of reusing semantics in different classes or slots, generally by including specific property slots or other semantic constraints, without the need to tie slots to the hierarchy of the class itself.
positively regulates:
  comments:
    - This is a grouping for positive process-process and entity-entity regulation.
  is_a: regulates
  inverse: positively regulated by
  mixin: true
  close_mappings:
    # This RTX contributed term is tagged as an inverse of this Biolink predicate
    - RO:0002336
  exact_mappings:
    - RO:0002213
  narrow_mappings:
    - CHEMBL.MECHANISM:activator
    - DGIdb:activator
    - RO:0004032
    - RO:0004034
    - RO:0002629
    - SEMMEDDB:augments
positively regulates is another example of a mixin. In this case, a mixin is used to store metadata about a predicate or relationship between two entities at a general level. Its children inherit these definitions and attributes, whether or not the parent mixin has any slots.
The mixins slot can be used to specify a list of mixins whose added semantics a class (or slot) should include.
The mixins are separate from the is_a hierarchy, and the mixin classes do not contribute to a class's inheritance hierarchy.
individual organism:
  is_a: organismal entity
  mixins:
    - thing with taxon
Here we define an entity class individual organism that uses the mixin class thing with taxon.
By virtue of the mixin, the class individual organism will have an in taxon slot in addition to all its own slots, its parent slots, and its ancestor slots.
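How a class ends up with slots from both its mixins and its is_a ancestors can be sketched as follows. This is only an illustration under stated assumptions: the `SCHEMA` dictionary is a hand-written, simplified excerpt (placing organismal entity directly under named thing is an assumption made for brevity), and `effective_slots` is a hypothetical helper, not a LinkML function.

```python
# Simplified, hand-written excerpt. Placing "organismal entity" directly
# under "named thing" is an assumption made to keep the example short.
SCHEMA = {
    "named thing": {"slots": ["id", "name", "category"]},
    "thing with taxon": {"mixin": True, "slots": ["in taxon"]},
    "organismal entity": {"is_a": "named thing"},
    "individual organism": {
        "is_a": "organismal entity",
        "mixins": ["thing with taxon"],
    },
}

def effective_slots(name, schema=SCHEMA):
    """Collect slots from a class, its mixins, and its is_a ancestors."""
    slots = []
    while name is not None:
        cls = schema[name]
        # Look at the class itself, then at each mixin it lists.
        sources = [cls] + [schema[m] for m in cls.get("mixins", [])]
        for source in sources:
            for slot in source.get("slots", []):
                if slot not in slots:
                    slots.append(slot)
        name = cls.get("is_a")  # climb the is_a hierarchy only
    return slots

print(effective_slots("individual organism"))
# → ['in taxon', 'id', 'name', 'category']
```

Note that the loop climbs only through is_a. The mixin contributes its slots but never becomes part of the chain of parents, which mirrors the separation described above. For simplicity, the sketch does not follow a mixin's own ancestors.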
The aliases slot can be used to define a list of aliases for a Biolink Model class (or slot). This is useful for adding synonymous names to your class (or slot).
gene:
  is_a: gene or gene product
  aliases:
    - locus
Here we define that the entity class gene has an alias locus.
The description slot can be used to provide a human-readable description of a class (or slot).
genetically interacts with:
  is_a: interacts with
  description: >-
    holds between two genes whose phenotypic effects are dependent on each other in some way - such that their combined phenotypic effects are the result of some interaction between the activity of their gene products. Examples include epistasis and synthetic lethality.
  domain: gene
  range: gene
Here we define a human-readable description that describes the predicate slot genetically interacts with and its purpose.
The slot_uri slot can be used to define a canonical URI that is the true representation for that particular slot. That is, the value of slot_uri can be used interchangeably with the slot being defined.
name:
  is_a: node property
  aliases: ['label', 'display name']
  domain: named thing
  range: label type
  slot_uri: rdfs:label
Here we define rdfs:label as the canonical URI for the property slot name. When serializing a graph into RDF, the name of an instance of entity class named thing will be represented using rdfs:label instead of biolink:name.
This ensures that certain core RDF predicates are used as is.
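The effect of slot_uri on serialization can be pictured with a small lookup. This is a sketch under assumptions: `SLOT_URIS` is a hand-written table holding only the declaration shown above, `predicate_curie` is a hypothetical helper, and the fallback of prefixing `biolink:` and replacing spaces with underscores simply follows the CURIE style used elsewhere in this document (for example biolink:related_to).

```python
# Only the slot_uri declaration quoted above; other slots have none here.
SLOT_URIS = {"name": "rdfs:label"}

def predicate_curie(slot_name, slot_uris=SLOT_URIS):
    """Return the slot_uri if one is declared, else a default biolink: CURIE."""
    default = "biolink:" + slot_name.replace(" ", "_")
    return slot_uris.get(slot_name, default)

print(predicate_curie("name"))      # → rdfs:label
print(predicate_curie("in taxon"))  # → biolink:in_taxon
```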
The in_subset slot can be used to tag your class (or slot) as belonging to a pre-defined subset.
The actual subset names are defined as part of the schema definition.
genetically interacts with:
  is_a: interacts with
  domain: gene
  range: gene
  in_subset:
    - translator_minimal
Here we define the predicate slot genetically interacts with as part of the translator_minimal subset.
The id_prefixes slot can be used to define a list of valid ID prefixes that instances of this class ought to have as part of their CURIE.
The order of the list matters: it is a prioritized list, with the highest-priority ID prefix at the top.
gene:
  is_a: gene or gene product
  aliases: ['locus']
  slots:
    - id
    - name
    - symbol
    - description
    - synonym
    - xref
  mappings:
    - SO:0000704
    - SIO:010035
    - WIKIDATA:Q7187
  id_prefixes:
    - NCBIGene
    - ENSEMBL
    - HGNC
    - UniProtKB
    - MGI
    - ZFIN
    - dictyBase
    - WB
    - WormBase
    - FlyBase
    - FB
    - RGD
    - SGD
    - PomBase
Here we define the entity class gene to have a list of ID prefixes, with NCBIGene having the highest priority.
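One way a consumer might use this priority order is to pick a preferred identifier among several equivalent CURIEs. The sketch below is illustrative only: `GENE_ID_PREFIXES` copies just the first five prefixes from the list above, `preferred_curie` is a hypothetical helper, and the identifiers in the usage line are example values.

```python
# First five prefixes from the gene id_prefixes list, highest priority first.
GENE_ID_PREFIXES = ["NCBIGene", "ENSEMBL", "HGNC", "UniProtKB", "MGI"]

def preferred_curie(curies, id_prefixes=GENE_ID_PREFIXES):
    """Return the CURIE with the highest-priority prefix, or None if none match."""
    unlisted = len(id_prefixes)  # rank given to prefixes not in the list

    def rank(curie):
        prefix = curie.split(":", 1)[0]
        return id_prefixes.index(prefix) if prefix in id_prefixes else unlisted

    best = min(curies, key=rank, default=None)
    if best is None or rank(best) == unlisted:
        return None
    return best

# Example identifiers, used here purely for illustration.
print(preferred_curie(["HGNC:1100", "NCBIGene:672", "ENSEMBL:ENSG00000012048"]))
# → NCBIGene:672
```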
The slots property list enumerates the names of slots that a given class, mixin or its subclasses are generally permitted to have. A slot in this list is not mandatory for every instance of the class, mixin or inheriting subclass unless it is designated as one of the defining_slots (see below) or slot_usage (see below) marks it as required: true (see below).
The defining_slots slot can be used to specify which slots of an instance are necessary for defining that instance as a member of a class.
gene to gene association:
  is_a: association
  defining_slots:
    - subject
    - object
Here we specify that an association can be determined to be an instance of class gene to gene association based on the semantics of two of its slots: subject and object.
That is, one can infer an association to be an instance of gene to gene association if both its subject and its object are instances of class gene.
Listing a slot as one of the defining_slots effectively makes it required: true (see below).
The slot_usage slot can be used to specify how a particular slot ought to be used in a class.
This is useful for documenting what a particular slot means for instances of a particular class.
gene to gene association:
  aliases: ['molecular or genetic interaction']
  is_a: association
  defining_slots:
    - subject
    - object
  description: >-
    abstract parent class for different kinds of gene-gene or gene product to gene product relationships.
    Includes homology and interaction.
  slot_usage:
    subject:
      range: gene or gene product
      description: >-
        the subject gene in the association. If the relation is symmetric, subject vs object is arbitrary.
        We allow a gene product to stand as proxy for the gene or vice versa
    object:
      range: gene or gene product
      description: >-
        the object gene in the association. If the relation is symmetric, subject vs object is arbitrary.
        We allow a gene product to stand as proxy for the gene or vice versa
Here we document the association class gene to gene association with information on how the slots subject and object ought to be used to represent this association properly.
In the slot_usage section we define the range and provide a description for each of the slots subject and object.
The required slot can be used to define whether a slot is required.
When a slot is declared as required, every instance of a class that uses that slot must have a value for it.
id:
  is_a: node property
  required: true
  domain: named thing
  mappings:
    - alliancegenome:primaryId
    - gff3:ID
    - gpi:DB_Object_ID
Here we define the property slot id as a required field for all instances of the entity class named thing.
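A validator could enforce required slots along these lines. This is a minimal sketch: `REQUIRED_SLOTS` lists only the id slot shown above, and `missing_required` and the example node dictionaries are hypothetical.

```python
# Only the required slot declared in the example above.
REQUIRED_SLOTS = {"id"}

def missing_required(node, required=REQUIRED_SLOTS):
    """Return the required slots that are absent or empty on a node record."""
    return sorted(slot for slot in required if node.get(slot) in (None, ""))

print(missing_required({"id": "NCBIGene:672", "name": "BRCA1"}))  # → []
print(missing_required({"name": "BRCA1"}))                        # → ['id']
```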
The domain slot mimics the idea of rdfs:domain, where you constrain the type of classes that a given Biolink Model slot can be a part of.
genetically interacts with:
  is_a: interacts with
  domain: gene
Here we define that the subject (source node) of the predicate slot genetically interacts with must be an instance of class gene.
The range slot mimics the idea of rdfs:range, where you constrain the type of classes (or data types) that a given Biolink Model slot can have as its value.
genetically interacts with:
  is_a: interacts with
  domain: gene
  range: gene
Here we define that both the subject (source node) and object (target node) of the predicate slot genetically interacts with must be instances of class gene.
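Checking an edge against a predicate's domain and range might look like the sketch below. It is deliberately simplified: `SLOT` holds only the domain and range from the example above, `edge_violations` is a hypothetical helper, and categories are compared by exact name, whereas a fuller check would presumably also accept subclasses of the declared class.

```python
# Domain and range of "genetically interacts with", as declared above.
SLOT = {"domain": "gene", "range": "gene"}

def edge_violations(subject_category, object_category, slot=SLOT):
    """List domain/range violations for one edge (exact category match only)."""
    problems = []
    if subject_category != slot["domain"]:
        problems.append(f"subject must be a {slot['domain']}, got {subject_category}")
    if object_category != slot["range"]:
        problems.append(f"object must be a {slot['range']}, got {object_category}")
    return problems

print(edge_violations("gene", "gene"))     # → []
print(edge_violations("gene", "disease"))  # → ['object must be a gene, got disease']
```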
The symmetric slot can be used to specify whether a Biolink Model predicate slot is symmetric in its semantics.
That is, if A -[r]-> B and r is symmetric, then one can infer B -[r]-> A.
genetically interacts with:
  is_a: interacts with
  domain: gene
  range: gene
  in_subset:
    - translator_minimal
  symmetric: true
Here we define that the predicate slot genetically interacts with is symmetric.
Note: This property is not inherited by descendants of this predicate slot. You must explicitly mark every predicate slot that should be considered symmetric.
The symmetric slot can be used to specify whether a given predicate slot is symmetric.
interacts with:
  domain: named thing
  range: named thing
  description: >-
    holds between any two entities that directly or indirectly interact with each other
  is_a: related to
  in_subset:
    - translator_minimal
  symmetric: true
Note: The symmetric nature of the predicate is not inherited by descendants of the predicate.
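The inference rule for symmetric predicates can be sketched as follows. This is an illustration, not Biolink tooling: `SYMMETRIC` lists the two predicates marked symmetric: true in the examples above, `with_symmetric_edges` is a hypothetical helper, and the `EX:` identifiers are placeholders. Because symmetry is not inherited, the set names each symmetric predicate explicitly instead of deriving membership from is_a.

```python
# Predicates explicitly marked "symmetric: true" in the examples above.
SYMMETRIC = {"genetically interacts with", "interacts with"}

def with_symmetric_edges(edges, symmetric=SYMMETRIC):
    """Return the edges plus the reverse of every symmetric-predicate edge."""
    inferred = set(edges)
    for subject, predicate, obj in edges:
        if predicate in symmetric:
            inferred.add((obj, predicate, subject))
    return inferred

# Placeholder identifiers, for illustration only.
edges = {
    ("EX:geneA", "genetically interacts with", "EX:geneB"),
    ("EX:geneA", "affects", "EX:geneC"),
}
inferred = with_symmetric_edges(edges)
print(("EX:geneB", "genetically interacts with", "EX:geneA") in inferred)  # → True
print(("EX:geneC", "affects", "EX:geneA") in inferred)                     # → False
```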
The inverse slot can be used to specify the inverse predicate of a given predicate slot relationship.
affects:
  is_a: related to
  description: >-
    describes an entity that has a direct effect on the state or quality
    of another existing entity. Use of the 'affects' predicate implies that
    the affected entity already exists, unlike predicates such as
    'affects risk for' and 'prevents', where the outcome is something
    that may or may not come to be.
  inverse: affected by
  in_subset:
    - translator_minimal
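An inverse declaration lets a consumer restate an edge in the opposite direction. The sketch below is a hedged illustration: `INVERSE` holds the two inverse declarations quoted in this document, `inverse_edge` is a hypothetical helper, the `EX:` identifiers are placeholders, and treating a declaration as usable in both directions is an assumption of this sketch.

```python
# Inverse declarations quoted in this document.
INVERSE = {
    "affects": "affected by",
    "positively regulates": "positively regulated by",
}

def inverse_edge(edge, inverse=INVERSE):
    """Return the edge restated with the inverse predicate, or None if unknown."""
    subject, predicate, obj = edge
    # Assumption: a declared inverse can be read in either direction.
    both_ways = {**inverse, **{v: k for k, v in inverse.items()}}
    if predicate not in both_ways:
        return None
    return (obj, both_ways[predicate], subject)

# Placeholder identifiers, for illustration only.
print(inverse_edge(("EX:chemX", "affects", "EX:geneY")))
# → ('EX:geneY', 'affected by', 'EX:chemX')
```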
The exact_mappings slot can be used to define external concepts, predicates, or properties which are considered to be exact mappings to the class (or slot) being defined.
same as:
  is_a: exact match
  description: >-
    holds between two entities that are considered equivalent to each other
  in_subset:
    - translator_minimal
  exact_mappings:
    - owl:sameAs
    - skos:exactMatch
    - WIKIDATA_PROPERTY:P2888
    - CHEMBL.MECHANISM:equivalent_to
    - MONDO:equivalentTo
Here we define a list of 5 predicates that are semantically equivalent to the Biolink Model predicate slot same as.
The close_mappings slot can be used to define external concepts, predicates, or properties which are considered to be close mappings to the class (or slot) being defined.
same as:
  is_a: exact match
  description: >-
    holds between two entities that are considered equivalent to each other
  in_subset:
    - translator_minimal
  exact_mappings:
    - owl:sameAs
    - skos:exactMatch
    - WIKIDATA_PROPERTY:P2888
    - CHEMBL.MECHANISM:equivalent_to
    - MONDO:equivalentTo
  close_mappings:
    - owl:equivalentClass
Here we define owl:equivalentClass as being a close match to the Biolink Model predicate slot same as.
The narrow_mappings slot can be used to define external concepts, predicates, or properties which are considered to be narrow mappings to the class (or slot) being defined.
same as:
  is_a: exact match
  description: >-
    holds between two entities that are considered equivalent to each other
  in_subset:
    - translator_minimal
  close_mappings:
    - owl:equivalentClass
  exact_mappings:
    - owl:sameAs
    - skos:exactMatch
    - WIKIDATA_PROPERTY:P2888
    - CHEMBL.MECHANISM:equivalent_to
    - MONDO:equivalentTo
  narrow_mappings:
    - DRUGBANK:external-identifier
Here we define DRUGBANK:external-identifier as being a narrow match to the predicate slot same as.
By narrow we mean that the scope of DRUGBANK:external-identifier is narrower and more restrictive than that of same as.
If we were to create a new predicate slot as a proxy for DRUGBANK:external-identifier, then that new slot would be a child of same as.
The broad_mappings slot can be used to define external concepts, predicates, or properties which are considered to be broad mappings to the class (or slot) being defined.
in complex with:
  description: >-
    holds between two genes or gene products that are part of (or code for products that are part of) the same macromolecular complex
  is_a: coexists with
  domain: gene or gene product
  range: gene or gene product
  in_subset:
    - translator_minimal
  broad_mappings:
    - SIO:010285
Here we define SIO:010285 (molecular complex formation) as a broad mapping to the predicate slot in complex with.
By broad we mean that the scope of SIO:010285 is broader and more relaxed than that of in complex with.
If we were to create a new predicate slot as a proxy for SIO:010285, then that new slot would be a parent of in complex with.
The related_mappings slot can be used to define external concepts, predicates, or properties which are considered to be related mappings to the class (or slot) being defined.
in complex with:
  description: >-
    holds between two genes or gene products that are part of (or code for products that are part of) the same macromolecular complex
  is_a: coexists with
  domain: gene or gene product
  range: gene or gene product
  in_subset:
    - translator_minimal
  broad_mappings:
    - SIO:010285
  related_mappings:
    - SIO:010497
Here we define SIO:010497 (protein complex) as a related mapping to the predicate slot in complex with.
By related we mean that SIO:010497 is related in scope to the predicate slot in complex with, but it is difficult to say whether it is narrower, broader, or close.
The subproperty_of slot can be used (typically under slot_usage) to anchor the values of a Biolink predicate slot of an association to a particular predicate (and its subproperties) other than the top-most predicate, biolink:related_to.
gene to gene homology association:
  is_a: gene to gene association
  slot_usage:
    predicate:
      subproperty_of: homologous to
Here, the predicate of biolink:GeneToGeneHomologyAssociation is constrained to biolink:homologous_to or one of the predicates in its subtree.
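The constraint amounts to a subtree membership test on the predicate hierarchy, which can be sketched as below. This is an assumption-laden illustration: `PREDICATE_IS_A` is a hand-written excerpt in which the placement of orthologous to and paralogous to under homologous to is assumed for the example and is not quoted in this document, and `satisfies_subproperty_of` is a hypothetical helper.

```python
# Hand-written excerpt of the predicate hierarchy. The two entries under
# "homologous to" are assumed for illustration, not quoted in this document.
PREDICATE_IS_A = {
    "orthologous to": "homologous to",
    "paralogous to": "homologous to",
    "genetically interacts with": "interacts with",
    "interacts with": "related to",
}

def satisfies_subproperty_of(predicate, anchor, is_a=PREDICATE_IS_A):
    """True if `predicate` is `anchor` itself or one of its is_a descendants."""
    while predicate is not None:
        if predicate == anchor:
            return True
        predicate = is_a.get(predicate)
    return False

print(satisfies_subproperty_of("orthologous to", "homologous to"))  # → True
print(satisfies_subproperty_of("interacts with", "homologous to"))  # → False
```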
The subclass_of slot can be used to anchor the semantics of a Biolink class to a particular term in an external third-party ontology.
named thing:
  description: "a databased entity or concept/class"
  slots:
    - id
    - name
    - category
  subclass_of: BFO:0000001
Here, biolink:NamedThing is anchored to the ontology term BFO:0000001 (Entity) of the Basic Formal Ontology, which implies that all subclasses of biolink:NamedThing specified via is_a are also subclasses of BFO:0000001.