Implementing neural machine translation models for the COMP 598 project
The project is divided into two main phases:
Phase 1: Developing Word Embedding Models
Phase 2: Developing Neural Machine Translation Models
Download the English text corpus from: http://mattmahoney.net/dc/textdata
Copy text8.txt to "dataset/" and run the "word2vec.ipynb" Python notebook.
Training logs for vector sizes of 100, 200, and 300 are included in the project.
Each of these models was validated on the word "eight".
The trained models are saved in "save/".
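As a rough illustration of what validating on a single word can look like, here is a minimal sketch that lists the nearest neighbors of "eight" by cosine similarity. It assumes validation means inspecting the closest words in embedding space, which the notebook may or may not do in exactly this way. The function names and the toy 3-dimensional vectors are hypothetical and are not taken from "word2vec.ipynb" or from the models in "save/".

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_words(query, embeddings, top_k=3):
    """Return the top_k words most similar to `query`, excluding the query itself."""
    query_vec = embeddings[query]
    scored = [
        (word, cosine_similarity(query_vec, vec))
        for word, vec in embeddings.items()
        if word != query
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [word for word, _ in scored[:top_k]]


# Hand-made toy vectors (hypothetical); the trained models use 100 to 300 dimensions.
toy_embeddings = {
    "eight": [0.9, 0.1, 0.0],
    "seven": [0.8, 0.2, 0.1],
    "nine": [0.85, 0.15, 0.05],
    "apple": [0.0, 0.1, 0.9],
}

print(nearest_words("eight", toy_embeddings, top_k=2))
```

With a well-trained model, the neighbors of a number word such as "eight" would be expected to be mostly other number words, which makes it a convenient sanity check.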
Develop neural machine translation models
Develop an encoder-decoder LSTM
Develop a Transformer model with attention
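The attention step at the core of a Transformer can be sketched as scaled dot-product attention, softmax(Q K^T / sqrt(d_k)) V. The sketch below is the standard single-head formulation without masking, written in plain Python for readability. It is an assumption about the direction of Phase 2 and not this project's implementation; the function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Compute softmax(Q K^T / sqrt(d_k)) V for lists of row vectors."""
    d_k = len(keys[0])
    value_dim = len(values[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Output is the attention-weighted sum of the value vectors.
        outputs.append([
            sum(w * v[i] for w, v in zip(weights, values))
            for i in range(value_dim)
        ])
    return outputs


# Two identical keys receive equal weight, so the output is the mean of the values.
print(scaled_dot_product_attention([[1.0, 0.0]], [[1.0, 0.0], [1.0, 0.0]], [[1.0, 2.0], [3.0, 4.0]]))
```

A real model would use a tensor library, multiple heads, and masking in the decoder; the loop form here is only meant to make the arithmetic visible.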