
Engineering Details

Xuanyu Zhou edited this page Oct 28, 2018 · 1 revision

Engineering details

Structure

The package consists of:

  • A slightly modified ELMo source code, see bilm-tf
  • A main library zoe_utils.py
  • An executor main.py
  • A script helper script.py

zoe_utils.py

This is the main library file which contains the core logic.

It has five main component classes:

EsaProcessor

Supports all operations related to ESA and its data files.

A main entry point is EsaProcessor.get_candidates, which, given a sentence, returns the top EsaProcessor.RETURN_NUM candidate Wikipedia concepts.

ElmoProcessor

Supports all operations related to ELMo and its data files.

A main entry point is ElmoProcessor.rank_candidates, which, given a sentence and a list of candidates (generated by ESA), ranks them by the cosine similarity of their ELMo representations (see the paper).

It will return the top ElmoProcessor.RANKED_RETURN_NUM candidates.
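Ranking by cosine similarity and keeping the top few candidates can be illustrated with a minimal sketch. This is not the code in zoe_utils.py: the function names, the toy three-dimensional vectors, and the value of RANKED_RETURN_NUM below are all assumptions made for illustration, and real ELMo representations are much higher-dimensional.

```python
import math

# Hypothetical value; the real constant is ElmoProcessor.RANKED_RETURN_NUM.
RANKED_RETURN_NUM = 2


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_by_cosine(target_vec, candidate_vecs, top_k=RANKED_RETURN_NUM):
    """Return the names of the top_k candidates most similar to target_vec."""
    scored = [(name, cosine(target_vec, vec)) for name, vec in candidate_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in scored[:top_k]]


# Toy vectors standing in for ELMo representations (illustrative only).
mention = [1.0, 0.0, 1.0]
candidates = {
    "Barack_Obama": [0.9, 0.1, 1.1],
    "Chicago": [0.0, 1.0, 0.0],
    "Hawaii": [1.0, 1.0, 0.0],
}
top = rank_by_cosine(mention, candidates)
```

Here `top` holds the two candidate names whose vectors point most nearly in the same direction as the mention vector.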

InferenceProcessor

This is the core engine that does inference given outputs from the previous processors.

The logic behind it is as described in the paper and is rather complicated.

One main entry point is InferenceProcessor.inference, which receives a sentence and the outputs of the previously mentioned processors, and sets the inference results.
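The hand-off between the three processors can be sketched with stand-in functions. Everything below is hypothetical: the real methods live on EsaProcessor, ElmoProcessor and InferenceProcessor, and their actual signatures, return types, and the attribute that stores the results are not documented here, so this only shows the order in which data flows.

```python
class Sentence:
    """Hypothetical stand-in: holds tokens plus a slot for inference results."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.predicted_types = None  # hypothetical attribute name


def get_candidates(sentence):
    # Stand-in for EsaProcessor.get_candidates: returns candidate concepts.
    return ["Concept_A", "Concept_B", "Concept_C"]


def rank_candidates(sentence, candidates):
    # Stand-in for ElmoProcessor.rank_candidates: keeps the top-ranked candidates.
    return candidates[:2]


def inference(sentence, esa_candidates, elmo_candidates):
    # Stand-in for InferenceProcessor.inference: stores results on the sentence
    # (where the real method stores them is an assumption).
    sentence.predicted_types = set(elmo_candidates)


sentences = [Sentence(["Obama", "visited", "Chicago"])]
for s in sentences:
    esa = get_candidates(s)          # step 1: ESA candidate generation
    elmo = rank_candidates(s, esa)   # step 2: ELMo re-ranking
    inference(s, esa, elmo)          # step 3: inference over both outputs
```

The processed sentences are what would then be handed on for evaluation.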

Evaluator

Given a list of sentences processed by InferenceProcessor, this evaluates performance and prints the results.

DataReader

Initialize this with a data file path. It reads the standard JSON format (see examples) and transforms the data into a list of Sentence objects.