For this activity, follow the tutorial "Neural machine translation with a Transformer and Keras."
Submit the completed Colab notebook showing the generated output.
The data used in this tutorial is a Portuguese-English translation dataset from TensorFlow Datasets. It has about 52,000 training examples, 1,200 validation examples, and 1,800 test examples. The program tokenizes the text, which means it breaks each sentence into tokens (words or word pieces) and gives each token a numerical ID. It also detokenizes, which converts the token IDs back into human-readable text. Lastly, it checks the data set to ensure that the data has been processed correctly.
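The round trip from text to token IDs and back can be sketched in plain Python. This is only a toy word-level tokenizer written for illustration: the function names `build_vocab`, `tokenize`, and `detokenize` and the two example sentences are my own assumptions, and the tutorial itself uses a pretrained subword tokenizer rather than anything like this.

```python
# Toy word-level tokenizer (illustrative assumption, not the tutorial's tokenizer).


def build_vocab(sentences):
    """Assign each distinct lowercase word an integer ID, in first-seen order."""
    vocab = {}
    for sentence in sentences:
        for word in sentence.lower().split():
            if word not in vocab:
                vocab[word] = len(vocab)
    return vocab


def tokenize(sentence, vocab):
    """Convert text into a list of token IDs."""
    return [vocab[word] for word in sentence.lower().split()]


def detokenize(token_ids, vocab):
    """Convert token IDs back into human-readable text."""
    id_to_word = {idx: word for word, idx in vocab.items()}
    return " ".join(id_to_word[idx] for idx in token_ids)


# Hypothetical two-sentence corpus, one Portuguese and one English sentence.
corpus = ["ele gosta de livros", "he likes books"]
vocab = build_vocab(corpus)

ids = tokenize("he likes books", vocab)
text = detokenize(ids, vocab)
print(ids)   # token IDs for the English sentence
print(text)  # the sentence rebuilt from those IDs
```

A word-level vocabulary like this fails on any word it has not seen before, which is one reason real systems use subword tokens instead.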
This program trains a transformer model on the data. A lot goes into a transformer model, so I have broken it up into categories:
The embedding layer takes in both the Portuguese and English input tokens and converts them to vectors.
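Converting tokens to vectors can be sketched as a table lookup. As far as I understand the tutorial, it also scales each vector by the square root of the model width and adds a sinusoidal positional encoding so the model knows the order of the tokens. The sketch below follows that idea in plain Python; the names `positional_encoding` and `embed`, the tiny sizes, and the random table (a stand-in for learned weights) are assumptions for illustration only.

```python
import math
import random


def positional_encoding(length, depth):
    """Sinusoidal position vectors: sines in the first half, cosines in the second.

    Assumes depth is even.
    """
    half = depth // 2
    table = []
    for pos in range(length):
        angles = [pos / (10000 ** (i / half)) for i in range(half)]
        table.append([math.sin(a) for a in angles] + [math.cos(a) for a in angles])
    return table


def embed(token_ids, embedding_table, d_model):
    """Look up each token's vector, scale it, and add its position vector."""
    positions = positional_encoding(len(token_ids), d_model)
    scale = math.sqrt(d_model)
    return [
        [scale * e + p for e, p in zip(embedding_table[tok], positions[i])]
        for i, tok in enumerate(token_ids)
    ]


# Hypothetical sizes; a real model learns the table during training.
random.seed(0)
vocab_size, d_model = 10, 4
embedding_table = [
    [random.uniform(-0.05, 0.05) for _ in range(d_model)]
    for _ in range(vocab_size)
]

vectors = embed([4, 5, 6], embedding_table, d_model)
print(len(vectors), len(vectors[0]))  # one d_model-sized vector per token
```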
The add and normalize part of the transformer makes training more efficient. It also ensures that vectors are updated by the attention layers rather than replaced.
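The idea of updating a vector instead of replacing it can be sketched as a residual connection followed by layer normalization, which is how I understand the tutorial's "Add & Norm" blocks. The plain-Python version below is a simplification: the names `layer_norm` and `add_and_norm`, the `eps` value, and the example numbers are assumptions, and it leaves out the learned scale and offset that a real normalization layer has.

```python
import math


def layer_norm(vector, eps=1e-6):
    """Rescale a vector to roughly zero mean and unit variance."""
    mean = sum(vector) / len(vector)
    variance = sum((v - mean) ** 2 for v in vector) / len(vector)
    return [(v - mean) / math.sqrt(variance + eps) for v in vector]


def add_and_norm(x, sublayer_output):
    """Add the sublayer's output to its input (residual), then normalize."""
    summed = [a + b for a, b in zip(x, sublayer_output)]
    return layer_norm(summed)


x = [1.0, 2.0, 3.0, 4.0]                   # vector entering the attention layer
attention_output = [0.5, -0.5, 0.5, -0.5]  # hypothetical attention result

# The original x is still part of the result: it is updated, not replaced.
y = add_and_norm(x, attention_output)
print(y)
```

Because the input is added back in, the gradient has a direct path around the sublayer, and the normalization keeps the outputs at a reasonable scale.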
The attention layers are broken into four categories: base attention layer, cross attention layer, global self attention layer, and causal self attention layer. The cross attention layer connects the encoder and the decoder. The global self attention layer processes the context sequence and propagates information along it. The causal self attention layer is similar to the global self attention layer, but it works on the output sequence, where each position can only attend to earlier positions.
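The difference between these attention layers mostly comes down to where the queries, keys, and values come from and whether a causal mask is applied. The sketch below is a single-head scaled dot-product attention in plain Python, not the multi-head Keras layer the tutorial uses; the function names and the small example vectors are assumptions for illustration.

```python
import math


def softmax(scores):
    """Turn a list of scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values, causal=False):
    """Single-head scaled dot-product attention over lists of vectors.

    With causal=True, position i may only attend to positions j <= i,
    which only makes sense when queries and keys are the same sequence.
    """
    scale = math.sqrt(len(keys[0]))
    outputs, all_weights = [], []
    for i, q in enumerate(queries):
        scores = []
        for j, k in enumerate(keys):
            if causal and j > i:
                scores.append(float("-inf"))  # masked: weight becomes 0
            else:
                scores.append(sum(a * b for a, b in zip(q, k)) / scale)
        weights = softmax(scores)
        width = len(values[0])
        outputs.append(
            [sum(w * v[d] for w, v in zip(weights, values)) for d in range(width)]
        )
        all_weights.append(weights)
    return outputs, all_weights


# Hypothetical sequences of 2-dimensional vectors.
context = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # encoder side
target = [[0.5, 0.5]]                           # decoder side

# Global self attention: the context attends to all of itself.
global_out, global_w = attention(context, context, context)

# Causal self attention: each position sees only itself and earlier positions.
causal_out, causal_w = attention(context, context, context, causal=True)

# Cross attention: decoder queries, encoder keys and values.
cross_out, cross_w = attention(target, context[:2], context[:2])
```

In the causal case the first position can only look at itself, so its attention weights are all on the first token.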
After the encoder and the decoder have been constructed, the program gives the transformer specific parameters and uses them to build the model, train it, test it, and then examine the results.