A simple, clean, readable, and shape-annotated implementation of Attention is All You Need in PyTorch. A sample ONNX file can be found in `assets/transformer.onnx` for visualization purposes.

It was tested on synthetic data; try to use the attention plots to figure out the transformation used to create the data!
- Positional embeddings are not included, similar to `nn.Transformer`, but you can find an implementation in `usage.ipynb` (a minimal sketch is also given below).
- The parallel `MultiHeadAttention` significantly outperforms the for-loop implementation, as expected (see the illustration after this list).
- Assumes `batch_first=True` input by default; this cannot be changed.
- Uses `einsum` for the attention computation rather than `bmm`, for readability; this might impact performance (a sketch follows below).
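Since positional embeddings are left out of the model, here is a minimal sketch of the standard sinusoidal encoding from the paper, assuming `batch_first=True` inputs of shape `(batch, seq_len, d_model)`. It is illustrative only and not necessarily identical to the version in `usage.ipynb`.

```python
import math
import torch

def sinusoidal_positional_encoding(seq_len: int, d_model: int) -> torch.Tensor:
    """Sinusoidal encoding from the paper, shape (seq_len, d_model)."""
    position = torch.arange(seq_len).unsqueeze(1)  # (seq_len, 1)
    div_term = torch.exp(torch.arange(0, d_model, 2) * (-math.log(10000.0) / d_model))
    pe = torch.zeros(seq_len, d_model)
    pe[:, 0::2] = torch.sin(position * div_term)   # even dimensions
    pe[:, 1::2] = torch.cos(position * div_term)   # odd dimensions
    return pe

# Usage: add to token embeddings of shape (batch, seq_len, d_model)
x = torch.randn(2, 10, 512)
x = x + sinusoidal_positional_encoding(x.size(1), x.size(2)).unsqueeze(0)
```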
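As a rough illustration of why the parallel multi-head attention wins over the per-head loop, the sketch below contrasts a sequential per-head projection with a single fused projection that is reshaped into heads. Names such as `num_heads` and `head_dim` are assumptions for the example, not the repo's exact API.

```python
import torch
import torch.nn as nn

batch, seq_len, d_model, num_heads = 2, 10, 512, 8
head_dim = d_model // num_heads
x = torch.randn(batch, seq_len, d_model)

# For-loop version: one small projection per head, applied sequentially.
per_head_q = nn.ModuleList([nn.Linear(d_model, head_dim) for _ in range(num_heads)])
q_loop = torch.stack([proj(x) for proj in per_head_q], dim=1)  # (batch, heads, seq, head_dim)

# Parallel version: a single large projection, then a reshape into heads.
q_proj = nn.Linear(d_model, d_model)
q_par = q_proj(x).view(batch, seq_len, num_heads, head_dim).transpose(1, 2)  # (batch, heads, seq, head_dim)
```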
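For the `einsum` point, here is a hedged sketch of scaled dot-product attention written with `einsum` instead of `bmm`. The tensor layout `(batch, heads, seq_len, head_dim)` and the function name are illustrative assumptions, not necessarily the exact code in this repo.

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention_einsum(q, k, v):
    """q, k, v: (batch, heads, seq_len, head_dim), batch-first layout."""
    d_k = q.size(-1)
    # Attention scores (batch, heads, q_len, k_len) via einsum rather than bmm/matmul
    scores = torch.einsum("bhqd,bhkd->bhqk", q, k) / d_k ** 0.5
    attn = F.softmax(scores, dim=-1)
    # Weighted sum over values: (batch, heads, q_len, head_dim)
    return torch.einsum("bhqk,bhkd->bhqd", attn, v), attn

# A bmm-based version would need to flatten batch and heads first, e.g.
# torch.bmm(q.flatten(0, 1), k.flatten(0, 1).transpose(-2, -1)), which is
# less readable but may be faster on some backends.
```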