Entities

ISC-SDE edited this page Jan 30, 2020 · 4 revisions

iKnow's primary function is to identify phrase boundaries that define Entities, entirely based on the syntactic structure of the sentences, rather than relying on an upfront dictionary or pre-trained model. This makes iKnow well-suited for initial exploration of a new corpus.

iKnow Entities are not Named Entities in the NER sense. Rather, they are the word groups that need to be considered together, in their entirety, because each one represents a concept or relationship as phrased by the author of the text. The following examples show why this phrase level matters for fully capturing what the author meant:

| iKnow Entity | Meaning |
| --- | --- |
| Dopamine | small molecule |
| Dopamine receptor | drug target |
| Dopamine receptor antagonist | chemical drug |
| Dopamine receptor gene | gene, molecular sequence |
| Dopamine receptor gene mutation | physiological process |

iKnow labels every entity with a simple role: either Concept (usually corresponding to noun phrases in part-of-speech terms) or Relation (verbs, prepositions, ...). Typical stop words that carry little meaning of their own are categorized as PathRelevant (e.g. pronouns) or NonRelevant, depending on whether they play a role in the sentence structure or are just linguistic filler.

In the following sample sentence, we've highlighted Concepts, Relations and PathRelevants separately.

Belgian geuze is well-known across the continent for its delicate balance.
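
The role labels described above can be modelled with a small data structure. The following is a minimal sketch, not the iKnow engine's actual output format: the `Role` enum, the `Entity` class, and the helpers `by_role` and `render` are hypothetical names introduced here, and the role assigned to each word group in the sample sentence is hand-labelled for illustration (in particular, treating "the" as NonRelevant is an assumption).

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    """The four entity roles named in the text."""
    CONCEPT = "Concept"
    RELATION = "Relation"
    PATH_RELEVANT = "PathRelevant"
    NON_RELEVANT = "NonRelevant"


@dataclass(frozen=True)
class Entity:
    """A word group plus its role (hypothetical structure, not an iKnow API)."""
    text: str
    role: Role


# Hand-assigned labels for illustration only; this is NOT engine output.
SAMPLE = [
    Entity("Belgian geuze", Role.CONCEPT),
    Entity("is well-known across", Role.RELATION),
    Entity("the", Role.NON_RELEVANT),        # assumption
    Entity("continent", Role.CONCEPT),
    Entity("for", Role.RELATION),
    Entity("its", Role.PATH_RELEVANT),
    Entity("delicate balance", Role.CONCEPT),
]

# Plain-text markers standing in for visual highlighting.
MARKS = {
    Role.CONCEPT: "[{}]",
    Role.RELATION: "<{}>",
    Role.PATH_RELEVANT: "({})",
    Role.NON_RELEVANT: "{}",
}


def by_role(entities, role):
    """Return the text of every entity carrying the given role, in order."""
    return [e.text for e in entities if e.role is role]


def render(entities):
    """Join the entities into one string, wrapping each in its role marker."""
    return " ".join(MARKS[e.role].format(e.text) for e in entities)


if __name__ == "__main__":
    print(render(SAMPLE))
    print(by_role(SAMPLE, Role.CONCEPT))
```

Here `render` marks Concepts with square brackets, Relations with angle brackets, and PathRelevants with parentheses, while `by_role(SAMPLE, Role.CONCEPT)` collects only the Concept phrases.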