
The iKnow Approach

Masako Ohira edited this page Feb 9, 2020 · 3 revisions

The iKnow technology analyses written text sentence by sentence, identifies 'entities', and assigns a role (Concept, Relation, PathRelevant or NonRelevant) to each entity. It also detects semantic attributes (Negation, Time, ...).

In the first step, identifying entities, we consider Concepts to be the most important entities. They contain most of the semantic content of a text. From a grammatical point of view, they can be compared to noun phrases. In the NLP world, you can think of them as a type of 'chunk'.

What is the difference between iKnow and other parsers and chunkers?
The difference lies in our approach: instead of merging words into the most likely entities, we identify the non-Concept parts and get the Concepts 'for free'.

Input: The computed axial tomography of the head revealed no abnormalities.
Identify non-Concept elements (in brackets): [The] computed axial tomography [of the] head [revealed] no abnormalities.
Remaining Concepts: computed axial tomography, head, no abnormalities
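
The subtraction idea in the example above can be sketched in a few lines of Python. This is only a toy illustration, not the iKnow engine: the word list `NON_CONCEPT_WORDS` and the function `remaining_concepts` are hypothetical names, and the list is hand-picked for this one sentence.

```python
import re

# Hypothetical, hand-picked non-Concept word list for this one sentence only.
# The real iKnow language models are far richer than a flat set of words.
NON_CONCEPT_WORDS = {"the", "of", "revealed"}


def remaining_concepts(sentence):
    """Return the word runs left over once non-Concept words are removed."""
    words = re.findall(r"[A-Za-z']+", sentence)
    concepts, current = [], []
    for word in words:
        if word.lower() in NON_CONCEPT_WORDS:
            # A non-Concept word closes the Concept being collected, if any.
            if current:
                concepts.append(" ".join(current))
                current = []
        else:
            current.append(word)
    if current:
        concepts.append(" ".join(current))
    return concepts


sentence = "The computed axial tomography of the head revealed no abnormalities."
print(remaining_concepts(sentence))
# → ['computed axial tomography', 'head', 'no abnormalities']
```

Nothing in the sketch tries to recognise a Concept directly; the Concepts are simply whatever is left between the non-Concept words.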

Identifying the non-Concept parts is not always straightforward. The English word 'can' is most often an auxiliary verb, and thus a Relation, but it can be a noun (thus Concept) too. We use a combination of a word list and a set of rules to disambiguate possible non-Concepts. We also use rules to detect phrase boundaries in sequences of Concept words.

Phrase boundary (marked with |): They published the results | two weeks ago.
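
The combination of a word list and rules used to disambiguate words such as 'can' might look roughly like the sketch below. The single rule shown (a preceding determiner makes 'can' a noun) and the names `AMBIGUOUS`, `DETERMINERS`, `role_of_ambiguous` and `label_ambiguous` are assumptions made for illustration; they are not taken from the iKnow language models.

```python
# Hypothetical word lists, for illustration only.
AMBIGUOUS = {"can"}                          # words that may be Relation or Concept
DETERMINERS = {"a", "the", "this", "that"}   # a preceding determiner signals a noun


def role_of_ambiguous(tokens, i):
    """Guess the role of the ambiguous word at position i from the word before it."""
    previous = tokens[i - 1].lower() if i > 0 else ""
    # Assumed rule: after a determiner the word is a noun (Concept);
    # otherwise fall back to the most frequent reading (Relation).
    return "Concept" if previous in DETERMINERS else "Relation"


def label_ambiguous(sentence):
    """Return (word, role) pairs for every ambiguous word in the sentence."""
    tokens = sentence.rstrip(".").split()
    return [
        (token, role_of_ambiguous(tokens, i))
        for i, token in enumerate(tokens)
        if token.lower() in AMBIGUOUS
    ]


print(label_ambiguous("She can open the can."))
# → [('can', 'Relation'), ('can', 'Concept')]
```

The word list decides which words need a second look, and the rule uses local context to pick a role; a realistic rule set would need many more context patterns than this one.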

A sequence of Concepts, Relations and PathRelevant entities is called a "Path". Paths show the context of the entities, and they create the level on which the scope of semantic attributes is calculated.

Indexed sentence: Two months after the renal failure, the patient was no longer on dialysis.
Path with attribute scopes: [TIME: Two months after the renal failure], the [NEGATION: patient was no longer on dialysis].
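
One way to picture attribute scopes is as labelled spans over the words of a path. The sketch below only renders spans that are specified by hand; it does not compute them, and `AttributeScope` and `render` are hypothetical names rather than part of any iKnow API. Punctuation is left out to keep the word indices simple.

```python
from dataclasses import dataclass


@dataclass
class AttributeScope:
    label: str  # e.g. "TIME" or "NEGATION"
    start: int  # index of the first word in scope
    end: int    # index one past the last word in scope


def render(words, scopes):
    """Wrap each scoped word span in [LABEL: ...] brackets."""
    opens = {scope.start: scope.label for scope in scopes}
    closes = {scope.end - 1 for scope in scopes}
    out = []
    for i, word in enumerate(words):
        text = word
        if i in opens:
            text = f"[{opens[i]}: {text}"
        if i in closes:
            text = f"{text}]"
        out.append(text)
    return " ".join(out)


words = "Two months after the renal failure the patient was no longer on dialysis".split()
# Scopes copied by hand from the example above, not computed.
scopes = [AttributeScope("TIME", 0, 6), AttributeScope("NEGATION", 7, 13)]
print(render(words, scopes))
# → [TIME: Two months after the renal failure] the [NEGATION: patient was no longer on dialysis]
```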

Knowing this, you are ready to dive into the pages about entities and attributes.