# NMT

Neural machine translation models implemented for the COMP 598 project.

The project is divided into two main phases:

- Phase 1: developing word embedding models
- Phase 2: developing neural machine translation models

## Word Embedding

### Word2Vec

Download the English text corpus (text8) from http://mattmahoney.net/dc/textdata, copy `text8.txt` into `dataset/`, and run the `word2vec.ipynb` notebook.
Training logs for embedding sizes 100, 200, and 300 are included in the project.
Each model was validated by inspecting the nearest neighbours of the word "eight".
The trained models are stored under `save/`.
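The notebook itself is not shown here, but the skip-gram training it performs can be sketched in plain NumPy. This is a minimal, hedged illustration (one negative sample per positive pair, plain SGD, no subsampling); the function names and hyperparameters are illustrative, not taken from the notebook:

```python
import numpy as np

def build_vocab(tokens):
    """Map each unique token to an integer id."""
    return {w: i for i, w in enumerate(sorted(set(tokens)))}

def train_skipgram(tokens, dim=50, window=2, epochs=5, lr=0.05, seed=0):
    """Train skip-gram embeddings with one negative sample per pair."""
    rng = np.random.default_rng(seed)
    w2i = build_vocab(tokens)
    V = len(w2i)
    W_in = rng.normal(0, 0.1, (V, dim))   # target-word embeddings
    W_out = rng.normal(0, 0.1, (V, dim))  # context-word embeddings
    ids = [w2i[t] for t in tokens]
    for _ in range(epochs):
        for pos, center in enumerate(ids):
            for off in range(-window, window + 1):
                ctx_pos = pos + off
                if off == 0 or ctx_pos < 0 or ctx_pos >= len(ids):
                    continue
                # One observed context word and one random negative.
                for out, label in ((ids[ctx_pos], 1.0), (rng.integers(V), 0.0)):
                    score = 1.0 / (1.0 + np.exp(-W_in[center] @ W_out[out]))
                    grad = score - label  # gradient of logistic loss
                    g_out = grad * W_in[center]
                    W_in[center] -= lr * grad * W_out[out]
                    W_out[out] -= lr * g_out
    return w2i, W_in

def most_similar(word, w2i, W, topn=3):
    """Nearest neighbours by cosine similarity (the 'eight' check)."""
    i2w = {i: w for w, i in w2i.items()}
    v = W[w2i[word]]
    sims = W @ v / (np.linalg.norm(W, axis=1) * np.linalg.norm(v) + 1e-9)
    return [i2w[i] for i in np.argsort(-sims) if i2w[i] != word][:topn]
```

The validation step mentioned above then amounts to calling `most_similar("eight", w2i, W)` and checking that the neighbours are other number words.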

## Neural Machine Translation

Develop neural machine translation models.

### Part 1

Develop an encoder-decoder LSTM model.
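The shape of such a model can be sketched in PyTorch. This is an assumed minimal architecture (single-layer LSTMs, teacher forcing, illustrative dimensions), not the project's actual implementation:

```python
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    """Minimal encoder-decoder LSTM for translation."""

    def __init__(self, src_vocab, tgt_vocab, emb_dim=32, hid_dim=64):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb_dim)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb_dim)
        self.encoder = nn.LSTM(emb_dim, hid_dim, batch_first=True)
        self.decoder = nn.LSTM(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, tgt_vocab)

    def forward(self, src, tgt):
        # Encode the source sentence; keep only the final (h, c) state.
        _, state = self.encoder(self.src_emb(src))
        # Decode conditioned on that state, feeding the gold target
        # tokens back in (teacher forcing).
        dec_out, _ = self.decoder(self.tgt_emb(tgt), state)
        return self.out(dec_out)  # (batch, tgt_len, tgt_vocab) logits
```

The design choice worth noting: the entire source sentence is compressed into one fixed-size state vector, which is the bottleneck that attention (Part 2) removes.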

### Part 2

Develop a Transformer model with attention.
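The core operation of the Transformer is scaled dot-product attention, softmax(QKᵀ/√d_k)V. A minimal NumPy sketch of that single building block (not the full model from the project):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Compute softmax(Q K^T / sqrt(d_k)) V.

    Q: (..., q_len, d_k), K: (..., k_len, d_k), V: (..., k_len, d_v).
    Returns the attended values and the attention weights.
    """
    d_k = Q.shape[-1]
    scores = Q @ K.swapaxes(-1, -2) / np.sqrt(d_k)
    # Numerically stable softmax over the key dimension.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights
```

Each row of `weights` sums to 1, so the output is a convex combination of the value vectors, weighted by query-key similarity.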