Ovation by Example

Introduction

We realised that when working on NLU tasks, most of our time is invested in data preprocessing: creating a vocabulary, initializing an embedding matrix with pre-trained word embeddings, and so on. It takes a long time to reach the point where you can just build your cool architecture and train a new model. With existing meta-frameworks for Deep Learning, like Keras and TFLearn, writing the code for a model architecture sometimes takes only a few minutes; it is the data preprocessing that is a nightmare. To make your life easier and help you solve problems in Conversational Intelligence (CI), we developed a framework that not only helps you build Deep Learning models for CI faster, but also provides templates that you can use out of the box to build and train your own model architectures. This page walks you through two examples and shows how you can use the Ovation-CI framework to develop models from scratch. We start with a very basic Tweet Emotion Classifier, and then train a BLSTM Regression Model that generates a sentiment score for a review in German.

Requirements

It is recommended to complete the basic requirements mentioned in the How to prepare for OSA section on the Home page.

1. A Tweet Emotion Classifier

Let us build a Tweet Emotion Classifier from scratch.

Imports

Import the required packages.

from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
from keras.layers import Activation
from keras.layers import Embedding
from keras.layers import LSTM
from keras.layers import Conv1D
from keras.layers import MaxPooling1D
from datasets import TwitterEmotion

Instantiate a Dataset Object

Instantiate a Dataset object and get the preloaded word2vec matrix corresponding to its vocabulary. Then open the train, validation, and test splits.

# Instantiate a TwitterEmotion dataset object
te = TwitterEmotion()

# get the preloaded word2vec matrix for its vocabulary
w2v = te.w2v

# open the dataset for train, validation and test
te.train.open(fold=0)
te.validation.open(fold=0)
te.test.open(fold=0)
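
Before setting the hyper-parameters, it can help to sanity-check the dataset object. The following sketch only uses attributes that appear later in this tutorial (te.vocab_size, te.n_classes, and the w2v matrix):

# Quick sanity check of the dataset object
print('Vocabulary size: {}'.format(te.vocab_size))
print('Number of emotion classes: {}'.format(te.n_classes))
print('word2vec matrix shape: {}'.format(w2v.shape))  # (vocab_size, embedding_dim)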

Set the Hyper-Parameters

Set all the hyper-parameters required for training your model.

# Set some hyper-parameters

vocab_size = te.vocab_size
maxlen = 30
embedding_size = w2v.shape[-1]

# Convolution
kernel_size = 5
filters = 64
pool_size = 4

# LSTM
lstm_output_size = 70

# Training
batch_size = 30
epochs = 2

Build The Model

Build your model using the Keras Sequential API. Note that the Embedding layer is initialized with the preloaded word2vec matrix from the Dataset object.

# Build the model using Keras Sequential API
print('Building the Model...')
model = Sequential()
model.add(Embedding(vocab_size, embedding_size, input_length=maxlen, 
                    weights=[w2v]))
model.add(Dropout(0.25))
model.add(Conv1D(filters,
                 kernel_size,
                 padding='valid',
                 activation='relu',
                 strides=1))
model.add(MaxPooling1D(pool_size=pool_size))
model.add(LSTM(lstm_output_size))
model.add(Dense(te.n_classes))
model.add(Activation('softmax'))  # softmax, since the targets are one-hot and the loss below is categorical crossentropy

# Compile the model for the classification task
model.compile(loss='categorical_crossentropy',
              optimizer='adam', metrics=['accuracy'])
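
Before training, you can verify the architecture with model.summary(), which is part of the standard Keras API and prints the output shape and parameter count of each layer:

# Inspect the architecture: layer output shapes and parameter counts
model.summary()

# The Embedding layer above requires the weight matrix to have exactly this shape
assert w2v.shape == (vocab_size, embedding_size)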

Start Training

# Boilerplate to keep track of the minimum validation loss and the previous epoch
min_val_loss = float("inf")
prev_epoch = 0

# Train until the maximum number of epochs has been reached
while te.train.epochs_completed < epochs:
    print('Training epoch {}'.format(te.train.epochs_completed))

    # Sample a training batch from the dataset object with text padded to a max length
    # Each instance in the batch will have all the Named Entities marked. E.g., in USA -> in BOE USA EOE,
    # where BOE is a token for Begin of Entity and EOE is a token for End of Entity. The target classes will be 
    # one-hot encoded
    train_batch = te.train.next_batch(batch_size=batch_size, pad=maxlen, 
                                      one_hot=True, mark_entities=True)

    # Train the batch
    [loss, accuracy] = model.train_on_batch(train_batch.text,  
                                            train_batch.emotion)

Keep Track of the Validation Loss

It is good practice to check the validation loss from time to time (once per epoch in this case). To do so, we will extend the previous section with the following code.

# Boilerplate to keep track of the minimum validation loss and the previous epoch
min_val_loss = float("inf")
prev_epoch = 0

# Train until the maximum number of epochs has been reached
while te.train.epochs_completed < epochs:
    print('Training epoch {}'.format(te.train.epochs_completed))

    # Sample a training batch from the dataset object with text padded to a max length
    # Each instance in the batch will have all the Named Entities marked. E.g., in USA -> in BOE USA EOE,
    # where BOE is a token for Begin of Entity and EOE is a token for End of Entity. The target classes will be 
    # one-hot encoded
    train_batch = te.train.next_batch(batch_size=batch_size, pad=maxlen, 
                                      one_hot=True, mark_entities=True)

    # Train the batch
    [loss, accuracy] = model.train_on_batch(train_batch.text,  
                                            train_batch.emotion)
    
    # On epoch complete, validate once on the validation set
    if prev_epoch != te.train.epochs_completed:
        prev_epoch = te.train.epochs_completed
        
        print('validating')
        total_val_loss, n_val_iterations = 0.0, 0

        # Reset the epoch counter, then validate on one full epoch of the
        # validation set (resetting inside the loop would skip later passes)
        te.validation._epochs_completed = 0
        while te.validation.epochs_completed < 1:
            
            # Sample a validation batch with the same params as train batch
            val_batch = te.validation.next_batch(batch_size=batch_size, 
                                  pad=maxlen, one_hot=True, mark_entities=True)
            [val_loss, val_accuracy] = model.test_on_batch(val_batch.text, 
                                                                val_batch.emotion)

            # Keep track of the loss
            total_val_loss += val_loss
            n_val_iterations += 1
        
        # Calculate the average validation loss
        avg_val_loss = total_val_loss/n_val_iterations
        print("Average Validation Loss: {}".format(avg_val_loss))
        if avg_val_loss < min_val_loss:
            print('saving model as the validation loss improved. '
                  'Previous val loss: {}\t current val loss: {}'.format(
                    min_val_loss, avg_val_loss))
            model.save('model_{}.h5'.format(te.train.epochs_completed))
            # Change the minimum validation loss to the current validation loss
            min_val_loss = avg_val_loss

Finally, Test your Model

Once you have finished training the model, you need to evaluate it on the test data. The following code iterates over the test set and reports the average test loss and accuracy.

# Test the model on the test set
print('Testing')

# Some boilerplate for testing
total_test_loss, total_test_acc, n_test_iterations = 0.0, 0.0, 0
# Reset the epoch counter, then test on one full epoch of the test set
te.test._epochs_completed = 0
while te.test.epochs_completed < 1:
    # Sample a test batch
    test_batch = te.test.next_batch(batch_size=batch_size, 
                          pad=maxlen, one_hot=True, mark_entities=True)
    [test_loss, test_accuracy] = model.test_on_batch(test_batch.text, 
                                                        test_batch.emotion)
    total_test_loss += test_loss
    total_test_acc += test_accuracy
    n_test_iterations += 1

avg_test_loss = total_test_loss/n_test_iterations
avg_test_acc = total_test_acc/n_test_iterations
print("Avg Test Accuracy: {}\nAverage Test Loss: {}".format(avg_test_acc, 
                                                            avg_test_loss))
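
Since the training loop saves a checkpoint whenever the validation loss improves, you can reload the best model later for inference. A minimal sketch, assuming the best checkpoint was saved at epoch 2 as model_2.h5 (the actual filename depends on when the validation loss last improved):

from keras.models import load_model

# Reload the best checkpoint saved during training
best_model = load_model('model_2.h5')

# Predict emotion probabilities for a batch of tweets
batch = te.test.next_batch(batch_size=batch_size, pad=maxlen,
                           one_hot=True, mark_entities=True)
predictions = best_model.predict(batch.text)  # shape: (batch_size, n_classes)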

2. A Sentiment Analysis Model

To train a Sentiment Analysis model, we will use a template provided in OSA-CI. You can refer to the source code of the template here, and of the model here.

python templates/sentiment_analysis_regression.py --dataset=amazon_de --rnn_layers=3 --n_filters=300 --sequence_length=120 --batch_size=128 --gpu_fraction=0.4

This script will train a CNN-BLSTM sentiment analysis model on the German Amazon Reviews dataset. You can easily extend this template by adding your own functions, or keep the template and replace the model it uses with your own. To understand how to write your own model, refer to the Build Your Own Model page. If you want to know how to use the existing templates, refer to the Using Templates page.
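
If you want to see which other hyper-parameters the template exposes, and assuming it parses its flags with a standard library such as argparse or tf.flags (an assumption, not something this page confirms), you can list them with:

python templates/sentiment_analysis_regression.py --help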