Minimal-transformer

Dependency-free re-implementation of the original Transformer [1]. The main goal here was to understand the inner workings of the architecture rather than to build a production-ready, efficient system. As such, I focus on simplicity and ease of understanding, which will hopefully prove useful to others who want to see how information flows through the architecture without having to spend much effort on data processing.

Two modes of usage are available (controlled by trainer.py):

  • Classification mode (task = classification). Here only the encoder is used; its per-token representations are averaged and passed to a classification head. I consider the task of classifying whether the first element in the sequence is identical to the last one (see the sketch after this list).
  • Seq2seq mode (task = seq2seq). This is the setting considered in the original paper, and it is considerably more involved than the classification one. I consider a sequence reversal task.
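
To make the two setups concrete, below is a minimal, self-contained sketch of how such toy data can be generated and how the encoder-only classification path (mean-pooling the per-token representations before a linear head) might look. This is not the repository's actual code: it uses torch.nn.TransformerEncoder as a stand-in for the repo's own encoder, and names such as make_classification_batch and MeanPoolClassifier are illustrative assumptions rather than the API exposed by trainer.py.

```python
import torch
import torch.nn as nn

# Hypothetical toy-data generators mirroring the two tasks described above;
# the actual repo builds its data inside trainer.py, so these names are illustrative.

def make_classification_batch(batch_size=32, seq_len=10, vocab_size=8):
    """Random token sequences; label is 1 iff the first token equals the last."""
    x = torch.randint(0, vocab_size, (batch_size, seq_len))
    y = (x[:, 0] == x[:, -1]).long()
    return x, y

def make_reversal_batch(batch_size=32, seq_len=10, vocab_size=8):
    """Seq2seq target is simply the input sequence reversed."""
    x = torch.randint(0, vocab_size, (batch_size, seq_len))
    y = torch.flip(x, dims=[1])
    return x, y

class MeanPoolClassifier(nn.Module):
    """Encoder-only path: embed, encode, average over time, project to 2 classes."""

    def __init__(self, vocab_size=8, d_model=32, n_heads=4, n_layers=2, max_len=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        # Learned positional embedding (the paper uses sinusoidal encodings;
        # a learned table keeps this sketch short).
        self.pos = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, 2)

    def forward(self, x):
        positions = torch.arange(x.size(1), device=x.device)
        h = self.encoder(self.embed(x) + self.pos(positions))  # (batch, seq_len, d_model)
        return self.head(h.mean(dim=1))                        # average per-token representations

if __name__ == "__main__":
    model = MeanPoolClassifier()
    x, y = make_classification_batch()
    logits = model(x)
    loss = nn.functional.cross_entropy(logits, y)
    print(logits.shape, loss.item())
```

The seq2seq mode additionally uses the decoder with a causal mask and teacher forcing, as in the original paper; only the encoder-only classification path is sketched here.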

Some TODOs:

  • Properly test the model (especially the sequence-decoder module).
  • Add dropout support.
  • Add support for output sequences that differ in feature dimensionality and length from the input sequences.
  • Add weight multiplication to the Linear layer.

[1] Vaswani, Ashish, et al. "Attention is all you need." Advances in neural information processing systems 30 (2017).
