This repo contains notebooks that explore different LLMs from scratch.
My articles below cover the transformer architecture in detail:
https://medium.com/towards-data-science/deep-dive-into-transformers-by-hand-%EF%B8%8E-68b8be4bd813
https://medium.com/towards-data-science/deep-dive-into-self-attention-by-hand-%EF%B8%8E-f02876e49857
Work in progress!