Welcome to my Natural Language Processing (NLP) diary ^_^.
Transformers
- Transformers are a type of neural network architecture that allows for parallelization across the sequence: the network can process all of the tokens in a sequence at the same time, rather than having to process them one after another. This is a huge advantage over RNNs, which must process tokens sequentially (a minimal sketch of the difference follows this list).
- The architecture was introduced in the 2017 paper "Attention Is All You Need" by Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser and Illia Polosukhin, published in the Proceedings of the 31st International Conference on Neural Information Processing Systems (NIPS 2017).
- Below is a diagram of the Transformer architecture:
- Sebastian Raschka sums it up well here
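Here is a minimal sketch of that parallel-vs-sequential difference (PyTorch is my assumption for the framework; the layer sizes are arbitrary). A self-attention layer consumes the whole sequence in one call, while an RNN cell has to be stepped through the sequence one token at a time:

```python
import torch
import torch.nn as nn

batch, seq_len, d_model = 2, 10, 64
x = torch.randn(batch, seq_len, d_model)  # a batch of token embeddings

# Transformer-style: self-attention sees the entire sequence in a single call,
# so every position is processed in parallel.
encoder_layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=8, batch_first=True)
out_parallel = encoder_layer(x)                # shape: (batch, seq_len, d_model)

# RNN-style: the hidden state is updated one token at a time,
# so position t cannot be computed before position t-1.
rnn_cell = nn.GRUCell(d_model, d_model)
h = torch.zeros(batch, d_model)
outputs = []
for t in range(seq_len):                       # explicit sequential loop
    h = rnn_cell(x[:, t, :], h)
    outputs.append(h)
out_sequential = torch.stack(outputs, dim=1)   # shape: (batch, seq_len, d_model)
```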
Fave paper so far:
- This paper presents a compelling case that purported emergent abilities in LLMs depend heavily on the metrics employed, challenging the community to reassess its understanding of how LLM capabilities evolve with scale.
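To make the metric point concrete, here is a toy illustration of my own (not code from the paper): if per-token accuracy improves smoothly with scale, a continuous metric shows a smooth curve, while a discontinuous metric like exact match over a multi-token answer stays near zero and then appears to "jump".

```python
import numpy as np

seq_len = 5
# Assume per-token accuracy improves smoothly as models scale up.
per_token_acc = np.linspace(0.5, 0.99, num=8)

# Continuous metric: average per-token accuracy -> smooth improvement.
token_level = per_token_acc

# Discontinuous metric: the whole 5-token answer must be exactly right.
# Treating tokens as independent, exact match is per-token accuracy ** seq_len,
# which hugs zero for a while and then rises sharply.
exact_match = per_token_acc ** seq_len

for acc, em in zip(token_level, exact_match):
    print(f"per-token acc = {acc:.2f}  ->  exact-match rate = {em:.2f}")
```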
First favourites in research:
- Can large language models (LLMs) train themselves? Credits: Cameron Wolfe, found through this Twitter thread
- Alignment
- AI Safety (particularly interested in red-teaming)
- "Hallucination" problem
- Interpretability