Tasks
At OSA-alpha, we aim to solve five important tasks in Conversational Intelligence. These tasks have been developed with both their business utility and their scientific contribution in mind. The tasks are the following:
1. A Deep Conversational Framework (Ovation-CI): Build a deep learning based Conversational Intelligence Framework that is actively trainable, stand-alone deployable, RESTful and easily extensible.
2. Core Components of Ovation-CI: Develop the building blocks of a Conversational Intelligence framework, which are Intent Classification, Entity Recognition, and Sentiment Analysis.
3. Business Use-cases: Develop demos for three business use-cases that will be presented at OSA-alpha using some existing open-source frameworks like Rasa.
4. Ovation Voice: Develop demos for the same use-cases (as mentioned above) using the Ovation Voice Interface.
The following sections will describe each of them in detail.
Every chatbot is built upon a Conversational Intelligence framework that is composed of many components. These components provide the chatbot with functionality such as (1) classifying the intent of what the user said, (2) extracting the entities in the user's statement, (3) extracting the sentiment of the user's statement, and (4) generating a response, or sampling one from a predefined set of responses, as a reply to the user's statement (query). A Conversational Intelligence framework is not limited to just these components. It can have any other component that helps make the conversation with a user more natural or extract information from what the user said.
This framework needs to have four important features, which are,
1. Active Learnability: A chatbot developer should be able to develop new chat scenarios and train a new model using Ovation-CI whenever he/she wants.
2. Stand-Alone Deployability: Ovation-CI should be deployable as a server and used stand-alone. Ideally, its components should be modular enough to give scope for heterogeneity.
3. RESTful: Ovation-CI should be accessible by RESTful APIs. E.g., you should be able to make an API call like http://<your-domain>/ovation?q="hello Ovation"
and get a response like
{
    "query": "hello Ovation",
    "sentiment": {"score": 0.5, "coarse": "neutral", "cause": []},
    "intents": [
        {
            "intent": "greetings",
            "intent_id": 1,
            "confidence": 0.85
        },
        {
            "intent": "welcome",
            "intent_id": 2,
            "confidence": 0.15
        }
    ],
    "entities": [
        {
            "start": 6,
            "end": 13,
            "value": "Ovation",
            "entity": "organisation",
            "confidence": 0.76
        }
    ]
}
4. Easily Extensible: Ovation-CI should be easily extensible by having scope for adding new components to its processing pipeline (described below).
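As a quick illustration of the RESTful requirement, a client could assemble such a request URL with Python's standard library. The host name below is a placeholder, not a real deployment:

```python
from urllib.parse import urlencode

def build_ovation_url(host, query):
    """Build a GET URL for the /ovation endpoint described above.

    `host` is a placeholder such as "your-domain.example"; the actual
    domain and port depend on where the server is deployed.
    """
    return "http://%s/ovation?%s" % (host, urlencode({"q": query}))

url = build_ovation_url("your-domain.example", "hello Ovation")
# urlencode percent-encodes the query string, e.g. the space becomes "+".
```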
The Ovation-CI architecture that we have in mind is shown in the image below. It is provided only for reference, not as a blueprint to be reproduced exactly. We want our participants to come up with innovative ideas, extend this architecture, and develop it.
The following are the components of the above architecture,
An endpoint is a REST API available in the Ovation-CI server. These APIs are the entry points for any communication with it. The following endpoints are shown in the architecture above.
1. /train can be called when a chatbot developer wants to develop his/her own model (e.g., an insurance enquiry bot). This endpoint receives data in the format
{
    "data": [
        {
            "text": "yes",
            "intent": "affirm",
            "entities": []
        },
        {
            "text": "yep",
            "intent": "affirm",
            "entities": []
        },
        {
            "text": "yeah",
            "intent": "affirm",
            "entities": []
        },
        {
            "text": "Techniker Krankenkasse offices in Kaiserslautern",
            "intent": "inquiry",
            "entities": [
                {
                    "start": 34,
                    "end": 48,
                    "value": "Kaiserslautern",
                    "entity": "location"
                },
                {
                    "start": 0,
                    "end": 22,
                    "value": "Techniker Krankenkasse",
                    "entity": "organisation"
                }
            ]
        }
    ]
}
The above example was taken from rasa-nlu-trainer, which can be used to generate data in this format. This data format is also only an example; we expect our participants to come up with their own innovative ways of structuring the data, too.
/train invokes the pipeline, which processes the data that /train receives and trains the individual Blocks of the pipeline.
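As a sketch of what /train could do before invoking the pipeline, here is a minimal structural check of the payload shown above. The function name and return convention are our own assumptions, not part of the specification:

```python
def validate_training_payload(payload):
    """Return a list of problems found in a /train payload; empty means OK.

    Expects the format shown above: a top-level "data" list whose items
    carry "text", "intent" and "entities" fields.
    """
    examples = payload.get("data")
    if not isinstance(examples, list):
        return ["top-level 'data' must be a list"]
    problems = []
    for i, example in enumerate(examples):
        for field in ("text", "intent", "entities"):
            if field not in example:
                problems.append("example %d is missing '%s'" % (i, field))
        for entity in example.get("entities", []):
            # Entity offsets index into the text, with "end" exclusive,
            # as in the Techniker Krankenkasse example above.
            span = example.get("text", "")[entity["start"]:entity["end"]]
            if span != entity["value"]:
                problems.append("example %d: span %r does not match value %r"
                                % (i, span, entity["value"]))
    return problems
```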
2. /ovation can be called when a user's query needs to be processed. Given a query like http://<your-domain>:<port>/ovation?q="hello ovation"
this endpoint should return a JSON response like
{
    "query": "hello Ovation",
    "intents": [
        {
            "intent": "greetings",
            "intent_id": 1,
            "confidence": 0.85
        },
        {
            "intent": "welcome",
            "intent_id": 2,
            "confidence": 0.15
        }
    ],
    "entities": [
        {
            "start": 6,
            "end": 13,
            "value": "Ovation",
            "entity": "organisation",
            "confidence": 0.76
        }
    ]
}
This endpoint invokes the infer() method of all the Blocks in the Pipeline. More details on Blocks and Pipelines follow in the next sections.
A Pipeline is built up of Blocks. Blocks can be run in two modes: (1) Train and (2) Infer. In Train mode a Block is trained on some input data; in Infer mode an inference is made on the user's query, or some information is extracted from it. Blocks are ideally classes that implement the following methods:
- preinit(): Initialize or load all the files and data structures that will be required throughout the Block.
- train(): When called, should train a model (if required) which will later be used to make inferences on the user's query or extract information out of it.
- save(): Save the trained model (if any) to disk.
- cleanup(): Release resources (if any).
- infer(): When called, should make an inference on the user's query and return the inference.
In Train mode the Block receives some data and calls the train(), save(), and cleanup() methods one after the other. This makes sure that a new model has been trained and persisted for future use.
In Infer mode the infer() method of the Block is called and the output is collected before moving on to the next Block.
Every Block has three phases in its life cycle.
- initializing: preinit()
- is_training: train(), save(), cleanup()
- is_inferring: infer()

The methods that are called during each phase are listed next to it. These phases are invoked at specific phases of a Pipeline's life cycle (explained in the section below).
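A Block with this life cycle could be sketched as a plain Python base class. The method names follow the list above; the signatures are only an assumption:

```python
class Block:
    """Base class for a pipeline Block; subclasses override the hooks."""

    def preinit(self):
        """Initialize or load files and data structures the Block needs."""

    def train(self, data):
        """Train a model (if the Block needs one) on the given data."""

    def save(self):
        """Persist the trained model (if any) to disk."""

    def cleanup(self):
        """Release any resources held by the Block."""

    def infer(self, query):
        """Return this Block's inference for the user's query."""
        return {}

    def run_training(self, data):
        # is_training phase: call train(), save() and cleanup() in order.
        self.train(data)
        self.save()
        self.cleanup()
```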
A Pipeline is a sequence of Blocks. Like Blocks, a Pipeline also has a life cycle with three phases:
- initializing: the preinit() of every Block is called.
- is_training: all the Blocks are run in their Train mode.
- is_inferring: all the Blocks are run in their Infer mode.
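Under the same assumptions about the Block interface, the three Pipeline phases could be driven along these lines (a sketch, not a prescribed implementation):

```python
class Pipeline:
    """Runs a sequence of Blocks through the three life-cycle phases."""

    def __init__(self, blocks):
        self.blocks = blocks

    def initialize(self):
        # initializing: call preinit() on every Block.
        for block in self.blocks:
            block.preinit()

    def run_training(self, data):
        # is_training: every Block trains, saves and cleans up in turn.
        for block in self.blocks:
            block.train(data)
            block.save()
            block.cleanup()

    def run_inference(self, query):
        # is_inferring: merge each Block's infer() output in order.
        result = {"query": query}
        for block in self.blocks:
            result.update(block.infer(query))
        return result
```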
In the architecture diagram, the numbers in yellow circles show the sequence in which the methods will be called. This should help you connect the dots and understand the idea of Pipelines and Blocks better.
Keep in mind that the Ovation-CI framework needs to be easily extensible. So Pipelines should ideally be defined in a config file and all the Blocks should be loaded at runtime. E.g., if you decide to create a config file called config.json, you can use a list to keep the class names of all the Blocks. This list could be used in your code to load the individual Blocks.
{
    "pipeline": ["IntentClassifier", "EntityRecognizer", "SentimentAnalyzer", "ResponseBuilder"]
}
Note that we have not defined the method signatures, data structures, or the data flow. These are subject to change, as they will depend a lot on how you wish to implement Ovation-CI. The other modules that we have defined can be changed too if you have an innovative new idea for implementing them. Keep in mind that all of this is meant to convey the idea, not to force you to implement exactly the same thing.
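One way to load the Blocks named in config.json at runtime is via importlib. The module name below is hypothetical; where your Block classes actually live is up to you:

```python
import importlib
import json

def load_pipeline(config_path, module_name):
    """Instantiate the Block classes listed in a config file.

    `module_name` is the module that defines the Block classes, e.g. a
    hypothetical "ovation.blocks". Each name in the config's "pipeline"
    list is looked up in that module and instantiated.
    """
    with open(config_path) as f:
        config = json.load(f)
    module = importlib.import_module(module_name)
    return [getattr(module, name)() for name in config["pipeline"]]
```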
Intent classification is usually the first component of any Conversational Intelligence framework. The second task that we are going to work on at OSA-alpha is Intent Classification. We do not differentiate between the tasks of Semantic Text Similarity (STS) and Intent Classification. In Semantic Text Similarity, one generates a similarity score for two given texts. This idea can be extended, and a new model can be trained to perform Intent Classification. For this task, the following is expected of the participants:
- Create a set of intents using the rasa-nlu-trainer. Keep in mind that our goal is to train our models with as little data as possible.
- Train an Intent Classifier using the intents that you created.
- Deploy your model as a server.
If you want to refer to some template STS models that you can use out of the box, then you can find them here. If you want to know how to use existing templates then you can refer to the Using Templates page.
The Intent Classification Server that you will deploy should have the following endpoints
- /preinit: This should contain all the boilerplate code like loading the model, building the data structures, etc. It should send {"status": 1} on success and {"status": 0} on failure as a response.
- /train: This should contain the code to train a new model given input data in the following format:
{
    "data": [
        {
            "text": "yes",
            "intent": "affirm",
            "entities": []
        },
        {
            "text": "yep",
            "intent": "affirm",
            "entities": []
        },
        {
            "text": "yeah",
            "intent": "affirm",
            "entities": []
        },
        {
            "text": "Techniker Krankenkasse offices in Kaiserslautern",
            "intent": "inquiry",
            "entities": [
                {
                    "start": 34,
                    "end": 48,
                    "value": "Kaiserslautern",
                    "entity": "location"
                },
                {
                    "start": 0,
                    "end": 22,
                    "value": "Techniker Krankenkasse",
                    "entity": "organisation"
                }
            ]
        }
    ]
}
This endpoint should send {"status": 1} as a response when the training is over, and {"status": 0} if some error occurred or the model could not be trained for some reason. You should ideally add a status message as well. We leave everything else to your creativity.
- /save: This endpoint should save the current model that is being used. Similar to /train, it should also send {"status": 1} on success and {"status": 0} on failure as a response message.
- /cleanup: This endpoint should release resources: freeing memory, deleting temporary files, persisting data structures, etc. It should send {"status": 1} on success and {"status": 0} on failure as a response message.
- /infer: This should implement the model inference code. E.g., given input data like {"data": "hello Ovation"}, it should return all possible intents with their confidence scores. E.g.,
{
    "query": "hello Ovation",
    "intents": [
        {
            "intent": "greetings",
            "intent_id": 1,
            "confidence": 0.85
        },
        {
            "intent": "welcome",
            "intent_id": 2,
            "confidence": 0.15
        }
    ],
    "status": 1,
    "message": "success"
}
On failure, this should return {"status": 0}.
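Whatever web framework serves these endpoints, the /infer handler ultimately assembles a response body like the one above. A framework-agnostic sketch of that step (treating intent_id as a 1-based rank is our assumption, not part of the specification):

```python
def build_infer_response(query, scored_intents):
    """Assemble the intent server's /infer response body.

    `scored_intents` is assumed to be a list of (intent_name, confidence)
    pairs from the classifier, sorted by confidence, highest first.
    """
    return {
        "query": query,
        "intents": [
            {"intent": name, "intent_id": rank + 1, "confidence": conf}
            for rank, (name, conf) in enumerate(scored_intents)
        ],
        "status": 1,
        "message": "success",
    }
```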
Given a user's query, we want to extract the Named Entities from it. This task has the following requirements:
- Train a Named Entity Recognition (NER) model using an NER dataset (English/German). You will find more information about the datasets supported by the Ovation framework here.
- Deploy your model as a server.
If you want to refer to some template NER models that you can use out of the box, then you can find them here. If you want to know how to use existing templates then you can refer to the Using Templates page.
The Entity Recognition Server that you will deploy should have the following endpoints,
- /preinit: This should contain all the boilerplate code like loading the model, building the data structures, etc. It should send {"status": 1} on success and {"status": 0} on failure as a response.
- /train: This should contain the code to train a new model given input data in the following format:
{
    "data": [
        {
            "text": "Techniker Krankenkasse offices in Kaiserslautern",
            "intent": "inquiry",
            "entities": [
                {
                    "start": 34,
                    "end": 48,
                    "value": "Kaiserslautern",
                    "entity": "location"
                },
                {
                    "start": 0,
                    "end": 22,
                    "value": "Techniker Krankenkasse",
                    "entity": "organisation"
                }
            ]
        }
    ]
}
This endpoint should send {"status": 1} on success and {"status": 0} on failure as a response. You should ideally add a status message as well. We leave everything else to your creativity.
- /save: This endpoint should save the current model that is being used. Similar to /train, it should also send {"status": 1} on success and {"status": 0} on failure as a response message.
- /cleanup: This endpoint should release resources: freeing memory, deleting temporary files, persisting data structures, etc. It should send {"status": 1} on success and {"status": 0} on failure as a response message.
- /infer: This should implement the model inference code. E.g., given input data like {"data": "hello Ovation"}, it should return all possible entities, their locations (start and end indices) in the query, and their confidence scores. E.g.,
{
    "query": "hello Ovation",
    "entities": [
        {
            "start": 6,
            "end": 13,
            "value": "Ovation",
            "entity": "organisation",
            "confidence": 0.76
        }
    ]
}
On failure, this should return {"status": 0}.
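The start and end fields in these responses are character offsets into the query, with end exclusive, as in the training examples above. A small helper showing how such a span can be located (a naive first-occurrence search, purely for illustration):

```python
def entity_span(text, value):
    """Return (start, end) of the first occurrence of `value` in `text`.

    `end` is exclusive, so text[start:end] == value. Returns None if the
    value does not occur in the text.
    """
    start = text.find(value)
    if start == -1:
        return None
    return (start, start + len(value))
```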
The third core component that we will focus on at OSA-alpha is Sentiment Analysis. For this task, we do not just want to generate a sentiment score for an input query. We are also interested in giving a cause, or reason, for why the model produced this score. In this task, you have the following requirements:
- Train a Sentiment Analysis model using a Sentiment Analysis dataset in English/German. You will find more information about the datasets supported by the Ovation framework here.
- Deploy your model as a server.
If you want to refer to some template Sentiment Analysis models that you can use out of the box, then you can find them here. If you want to know how to use existing templates then you can refer to the Using Templates page.
The Sentiment Analysis Server that you will deploy should have the following endpoints,
- /preinit: This should contain all the boilerplate code like loading the model, building the data structures, etc. It should send {"status": 1} on success and {"status": 0} on failure as a response.
- /train: This should contain the code to train a new model given input data in the following format:
{
    "data": [
        {
            "text": "Thanks a lot. Good Job Ovation",
            "sentiment": 1
        }
    ]
}
The sentiment parameter should be a value between 0 and 1.
This endpoint should send {"status": 1} on success and {"status": 0} on failure as a response. You should ideally add a status message as well. We leave everything else to your creativity.
- /save: This endpoint should save the current model that is being used. Similar to /train, it should also send {"status": 1} on success and {"status": 0} on failure as a response message.
- /cleanup: This endpoint should release resources: freeing memory, deleting temporary files, persisting data structures, etc. It should send {"status": 1} on success and {"status": 0} on failure as a response message.
- /infer: This should implement the model inference code. E.g., given input data like {"data": "I do not like your suggestion"}, it should return a sentiment score between 0 and 1, a coarse sentiment value ("neutral", "positive", "negative"), the cause/reason for generating this score ("not like" in the example), and the locations of the causes in the text (if applicable). E.g.,
{
    "query": "I do not like your suggestion",
    "sentiment": {
        "score": 0.1,
        "coarse": "negative",
        "cause": [
            {
                "text": "do not like your suggestion",
                "start": 2,
                "end": 29
            }
        ]
    }
}
On failure, this should return {"status": 0}.
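How the numeric score maps to the coarse value is up to you; one simple scheme uses fixed thresholds. The cut-offs below (0.4 and 0.6) are an illustrative assumption, not part of the task:

```python
def coarse_sentiment(score):
    """Map a sentiment score in [0, 1] to a coarse label.

    The thresholds 0.4 and 0.6 are an arbitrary illustrative choice.
    """
    if score < 0.4:
        return "negative"
    if score > 0.6:
        return "positive"
    return "neutral"
```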
Three business use cases will be presented at OSA-alpha, and the task will be to develop a demo for each of them. These demos can be developed using existing open-source frameworks like Rasa. The goal is to integrate Ovation-CI with the demos in the end. Keep in mind that if you integrate with Rasa, you might want a module that makes it easy to integrate with Ovation-CI as well.
Implement the above three use cases as demos using the Ovation Voice API. These demos should be purely voice-based. The details of the use cases will be shared at OSA-alpha.