A simple, experimental chatbot built with deep learning in Python.
Current functionalities include:
Intent Classification [Using a bidirectional LSTM-based RNN - Keras with TensorFlow backend]
Entity Recognition [Using the spaCy NLP library for Python]
Dependency Tree Display [Using the spaCy NLP library for Python]
Response Generation [Random response generation using a stacked LSTM-based RNN - Keras with TensorFlow backend]
You will need a working installation of Python 3.5 (preferably on Anaconda - Installation Guide).
The other requirements (Python libraries) can be found here.
retrain_ner.py - Official sample script for retraining the NER component of an existing spaCy model
dependency_tree.py - Contains functions for parsing and printing dependency trees using spaCy and NLTK
preprocessor.py - Contains functions for loading datasets for intent classification, padding sentence vectors and creating one hot vectors for class labels
intent_train.py - Trains a model to classify queries into the intent classes provided.
intent_predict.py - Predicts intent of test queries, recognizes entities and displays the dependency tree
corpuscleaner.py - When beginning to work with a new corpus, run the text file through corpuscleaner.py. This will remove all numbers, the words "chapter" and "book", and any additional strings specified by the user in the file.
response_train.py - Trains a model to generate random text in the style of the data provided.
response_generate.py - Generates a random response from a user-supplied seed.
backup/ - Generated models are saved to their respective folder here.
data/text_generation/ - Contains a number of possible corpuses as plain text files.
data/intent_classes/ - Contains text files, each with numerous sentences corresponding to a particular class of intent.
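The cleaning pass described for corpuscleaner.py above could look roughly like the sketch below. This is not the project's actual script: the function name `clean_corpus` and the `extra_strings` parameter are hypothetical, and matching "chapter" and "book" as whole words, case-insensitively, is an assumption.

```python
import re


def clean_corpus(text, extra_strings=()):
    """Remove digits, the words "chapter" and "book", and any extra strings.

    Hypothetical helper for illustration; the real corpuscleaner.py may differ.
    """
    # Drop user-specified strings first, exactly as written.
    for unwanted in extra_strings:
        text = text.replace(unwanted, "")
    # Drop every run of digits.
    text = re.sub(r"\d+", "", text)
    # Drop the whole words "chapter" and "book", ignoring case (assumption).
    text = re.sub(r"\b(?:chapter|book)\b", "", text, flags=re.IGNORECASE)
    # Collapse the spaces and tabs left behind, keeping line breaks.
    text = re.sub(r"[ \t]+", " ", text)
    return "\n".join(line.strip() for line in text.splitlines())
```

For example, `clean_corpus(raw_text, extra_strings=["***"])` would also strip a literal `***` separator from the corpus before training.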
Provide your training datasets in the data/ folder.
Format:
- For intent classification:
Each text file contains one sentence per line, and every sentence in a file belongs to one particular class of intent. The text file must be named after the intent that its sentences represent. The names of these text files are then used as the intent classes for training and prediction. The classification model uses a single bidirectional LSTM cell, followed by a dropout layer, with an Adam optimizer. spaCy's default pretrained English vectors are currently used for word embedding. Future commits will implement this model for more accurate results.
The dataset currently provided in data/intent_classification/ is small, but more data can be added or new training datasets can be used. The preprocessor.py file can be adjusted accordingly to read training input.
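As a rough illustration of the dataset format described above, the following sketch reads one file per intent, derives class names from the file names, and builds one-hot labels and fixed-length sequences. It is not the code in preprocessor.py: the helper names `load_intent_data`, `one_hot`, and `pad_to_length` are hypothetical, and the `.txt` extension is an assumption.

```python
import os


def load_intent_data(data_dir):
    """Read one text file per intent; return sentences, label ids, class names."""
    # Class names come from the file names (assumes a .txt extension).
    classes = sorted(
        os.path.splitext(name)[0]
        for name in os.listdir(data_dir)
        if name.endswith(".txt")
    )
    sentences, labels = [], []
    for index, intent in enumerate(classes):
        path = os.path.join(data_dir, intent + ".txt")
        with open(path, encoding="utf-8") as handle:
            for line in handle:
                line = line.strip()
                if line:  # skip blank lines
                    sentences.append(line)
                    labels.append(index)
    return sentences, labels, classes


def one_hot(labels, num_classes):
    """Turn integer label ids into one-hot lists."""
    return [[1 if i == label else 0 for i in range(num_classes)]
            for label in labels]


def pad_to_length(sequences, max_len, pad_value=0):
    """Truncate or right-pad each sequence to exactly max_len items."""
    padded = []
    for seq in sequences:
        seq = list(seq[:max_len])
        padded.append(seq + [pad_value] * (max_len - len(seq)))
    return padded
```

With this layout, adding a new intent only requires dropping another text file into the data directory; the class list is rebuilt from the file names on the next run.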
- For text generation:
Provide as much dialogue data as possible in a text file, as a sample of the type of text you want to generate. The files currently provided cover a variety of training data, and each will produce a wildly different model with respect to the style of the generated responses. The current implementation of the response generation phase is a very basic word sequence generator that uses 2 stacked LSTM cells, each followed by a dropout layer.
Much larger, domain-relevant training corpora are required in order to train a robust, domain-specific chatbot. Training is best done on a GPU, because the process is computationally intensive. (Note: If you are using a GPU, use the Theano backend with Keras by changing the 'backend' attribute in your .keras/keras.json file to theano. You may also need to install the package mkl-service on Linux systems.)
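The backend switch mentioned in the note can be done by hand in a text editor, or with a small script such as the sketch below. The helper name `set_keras_backend` is hypothetical, and the default path assumes the config file lives in `.keras/keras.json` under your home directory.

```python
import json
import os


def set_keras_backend(backend, config_path=None):
    """Set the 'backend' key in keras.json, preserving the other settings."""
    if config_path is None:
        # Assumed default location: ~/.keras/keras.json
        config_path = os.path.join(
            os.path.expanduser("~"), ".keras", "keras.json")
    config = {}
    if os.path.exists(config_path):
        with open(config_path, encoding="utf-8") as handle:
            config = json.load(handle)
    config["backend"] = backend
    # Create the parent directory if it does not exist yet.
    os.makedirs(os.path.dirname(config_path) or ".", exist_ok=True)
    with open(config_path, "w", encoding="utf-8") as handle:
        json.dump(config, handle, indent=4)
    return config
```

Calling `set_keras_backend("theano")` rewrites only the `backend` entry and leaves any other keys in the file untouched.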
The current implementation is a word-level RNN. A sequence-level model would probably be more appropriate and will be incorporated in future commits.
Further commits will incorporate a response selector pipeline to make the responses more relevant to the user's query.
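To make "word-level" concrete: a model of this kind is typically trained on fixed-length windows of words, each paired with the word that follows it. The sketch below shows that data preparation step only. It is an illustration under assumptions, not the code in response_train.py: the function name `make_training_windows` is hypothetical, and splitting on whitespace is a simplification.

```python
def make_training_windows(text, seq_len):
    """Pair each seq_len-word window with the word that follows it."""
    words = text.split()  # naive whitespace tokenization (assumption)
    vocab = sorted(set(words))
    word_to_id = {word: i for i, word in enumerate(vocab)}
    ids = [word_to_id[word] for word in words]

    inputs, targets = [], []
    for start in range(len(ids) - seq_len):
        inputs.append(ids[start:start + seq_len])  # model input window
        targets.append(ids[start + seq_len])       # next word to predict
    return inputs, targets, vocab
```

At generation time, the seed text is encoded the same way, the model predicts the next word id, and the window slides forward by one word.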
python intent_train.py
python intent_predict.py
python response_train.py
python response_generate.py
response_train.py - Based on this Keras example
corpuscleaner.py - From Maia McCormick's project
retrain_ner.py - Official sample script from the spaCy examples
The code for this project has not been updated since September 2017. If you have any queries on this project, you can contact me at: [email protected].