Harmonisation with the (Translator) Biolink Model ontology standards
The goal is to harmonize the RKB concept type and predicate (edge) labels with the (Translator) Biolink Model ontology standards.
The semantics of concepts in this database were tagged using only the UMLS MetaMap "Semantic Group" tags, not the fine-grained UMLS semantic types.
For legacy reasons, the semantic group tagging of the Concept nodes in the database is stored in two fields with identical semantic group values: type and semanticGroup. Aside from normalizing and renaming these fields to the single "category" field (as per the new TKG data model), we will substitute the Biolink Model concept type for each Semantic Group, as per the following mapping. The current RKB lacks the original fine-grained UMLS semantic types. Were they present, they could perhaps be captured by the proposed secondary "TYPE" edge to a Concept Type annotation.
The original RKB used the simple labels from the UMLS MetaMap Semantic Groups. A direct string replacement (using Cypher) is applied to convert these to the Biolink Model concept types. The mapping applied is as follows:
Code | Description | Biolink Term |
---|---|---|
OBJC | Objects | named thing |
ACTI | Activities & Behaviors | activity and behaviour |
ANAT | Anatomy | anatomical entity |
CHEM | Chemicals & Drugs | chemical substance |
CONC | Concepts & Ideas | information content entity |
DEVI | Devices | device |
DISO | Disorders | disease |
GENE | Genes & Molecular Sequences | genomic entity |
GEOG | Geographic Areas | geographic location |
LIVB | Living Beings | organismal entity |
OCCU(*) | Occupations | named thing |
ORGA | Organizations | administrative entity |
PHEN | Phenomena | phenomenon |
PHYS | Physiology | physiology |
PROC | Procedure | procedure |
(*) not directly tracked in Biolink - concepts & related statements may be removed?
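As a minimal sketch of how the mapping above could drive the string replacement, the Python snippet below holds the table as a dictionary and builds one parameterized Cypher statement per Semantic Group code. The Cypher template is hypothetical: it assumes the nodes carry the label `Concept`, that the `semanticGroup` field stores the four-letter code, and that the target field is named `category`. The function names are illustrative and not part of the RKB code base.

```python
# Semantic Group code -> Biolink Model term, copied from the table above.
SEMANTIC_GROUP_TO_BIOLINK = {
    "OBJC": "named thing",
    "ACTI": "activity and behaviour",
    "ANAT": "anatomical entity",
    "CHEM": "chemical substance",
    "CONC": "information content entity",
    "DEVI": "device",
    "DISO": "disease",
    "GENE": "genomic entity",
    "GEOG": "geographic location",
    "LIVB": "organismal entity",
    "OCCU": "named thing",  # (*) not directly tracked in Biolink
    "ORGA": "administrative entity",
    "PHEN": "phenomenon",
    "PHYS": "physiology",
    "PROC": "procedure",
}

# Hypothetical Cypher template (assumed node label and property names).
CYPHER_TEMPLATE = (
    "MATCH (c:Concept) WHERE c.semanticGroup = $code "
    "SET c.category = $term "
    "REMOVE c.type, c.semanticGroup"
)


def biolink_category(code):
    """Return the Biolink term for a Semantic Group code, or None if unknown."""
    return SEMANTIC_GROUP_TO_BIOLINK.get(code.strip().upper())


def build_category_updates(mapping=SEMANTIC_GROUP_TO_BIOLINK):
    """Return one (query, parameters) pair per Semantic Group code."""
    return [
        (CYPHER_TEMPLATE, {"code": code, "term": term})
        for code, term in sorted(mapping.items())
    ]
```

Each returned pair could then be passed to whatever Neo4j client is in use. Keeping the values as query parameters rather than interpolating them into the query text avoids quoting problems with terms such as "activity and behaviour".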
The RKB Predicate nodes are tagged with Wikidata P# property identifiers (base URI "https://www.wikidata.org/wiki/Property:") in their 'accessionId' property field. How these predicate values were originally mapped onto the Semantic Medline Database records is unclear. They need to be rewritten to the corresponding Translator Biolink Model predicate terms, as in the table below:
Property Id | Name | Biolink Term |
---|---|---|
wd:P3356 | positive diagnostic predictor | |
wd:P129 | physically interacts with (in molecular biology) | molecularly interacts with |
wd:P279 | subclass of | subclass of |
wd:P276 | location | |
wd:P1557 | manifestation of | biological process |
wd:P361 | part of | part of |
wd:P156 | followed by | |
wd:P1056 | product | |
wd:P2888 | exact match | |
wd:P2175 | medical condition treated | |
wd:P2283 | uses | |
wd:P1542 | cause of | |
kb:P2176 | drug used for treatment | |
wd:P703 | found in taxon | |
wd:P688 | encodes | |
wd:P684 | ortholog | molecularly interacts with |
wd:P682 | biological process | biological process |
wd:P681 | cell component | cellular component |
wd:P680 | molecular function | |
wd:P3433 | biological variant of | sequence variant(?) |
wd:P2293 | genetic association | gene to gene association |
wd:P1552 | has quality | |
wd:P128 | regulates (molecular biology) | regulates, entity to entity |
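The predicate table can be sketched as a lookup in the same way. The Python snippet below is illustrative only: it assumes that rows with an empty Biolink Term cell have no mapping yet (so the lookup returns `None` for them), and that an `accessionId` value may appear either as a bare `P#` identifier or with a prefix such as `wd:`. The function names are hypothetical.

```python
# Base URI for Wikidata properties, as given in the text above.
WIKIDATA_PROPERTY_BASE = "https://www.wikidata.org/wiki/Property:"

# Property id -> Biolink term, for the rows of the table that have a term.
WIKIDATA_TO_BIOLINK = {
    "P129": "molecularly interacts with",
    "P279": "subclass of",
    "P1557": "biological process",
    "P361": "part of",
    "P684": "molecularly interacts with",
    "P682": "biological process",
    "P681": "cellular component",
    "P3433": "sequence variant",
    "P2293": "gene to gene association",
    "P128": "regulates, entity to entity",
}

# Rows marked "(?)" in the table; these mappings are tentative.
TENTATIVE = {"P3433"}


def local_id(accession_id):
    """Strip an optional prefix such as 'wd:' and return the bare P# id."""
    return accession_id.split(":", 1)[-1].strip()


def property_uri(accession_id):
    """Expand an id to a full URI, assuming it is a Wikidata property."""
    return WIKIDATA_PROPERTY_BASE + local_id(accession_id)


def biolink_predicate(accession_id):
    """Return the Biolink term for a property id, or None if unmapped."""
    return WIKIDATA_TO_BIOLINK.get(local_id(accession_id))


def is_tentative(accession_id):
    """Report whether the mapping for this id is flagged as uncertain."""
    return local_id(accession_id) in TENTATIVE
```

Returning `None` for unmapped properties keeps the open rows of the table visible to the caller, so they can be listed and reviewed rather than silently given a default term.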