From a7d5a70fe1fd18cbba1b4f3b476bd7683922908e Mon Sep 17 00:00:00 2001
From: akki2825
Date: Wed, 30 Oct 2024 14:16:47 +0100
Subject: [PATCH] edit week 4 slide

---
 2024/weeks/week04/slides.qmd | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/2024/weeks/week04/slides.qmd b/2024/weeks/week04/slides.qmd
index 84d3407..8b0a0ee 100644
--- a/2024/weeks/week04/slides.qmd
+++ b/2024/weeks/week04/slides.qmd
@@ -89,7 +89,9 @@ format:
 
 ## Positional Encoding {.smaller}
 
 - Positional encoding is added to the input embeddings to give the model information about the position of each word in the sequence.
-- The positional encoding in the current transformer is implemented using sine function of different frequencies.
+- Unlike humans, who naturally read from left to right, the transformer processes all words in parallel, so it needs an explicit signal that "word 1 comes before word 2."
+- The positional encoding in the original transformer is implemented using sine and cosine functions of different frequencies.
+- Using sine waves makes it easier for the transformer to capture both nearby and far-apart relationships between words. (It's similar to how music uses different frequencies to create distinct sounds.)
 - The positional encoding vectors have the same dimensions as the embedding vectors and are added element-wise to create the input representation for each character.
 - This allows the model to differentiate between words based on their position in the sequence.
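A minimal sketch of the sinusoidal positional encoding scheme the added bullets describe, assuming NumPy; the function name and the `seq_len`/`d_model` parameters are illustrative choices, not part of the course material:

```python
# Illustrative sketch of sinusoidal positional encoding (Vaswani et al., 2017);
# names and library choice are assumptions, not taken from the slides.
import numpy as np

def sinusoidal_positional_encoding(seq_len: int, d_model: int) -> np.ndarray:
    """Return a (seq_len, d_model) matrix of positional encodings."""
    positions = np.arange(seq_len)[:, np.newaxis]   # (seq_len, 1)
    dims = np.arange(d_model)[np.newaxis, :]        # (1, d_model)
    # Each pair of dimensions gets its own frequency: low dims oscillate fast,
    # high dims oscillate slowly.
    angle_rates = 1.0 / np.power(10000.0, (2 * (dims // 2)) / d_model)
    angles = positions * angle_rates                # (seq_len, d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles[:, 0::2])           # even dimensions: sine
    pe[:, 1::2] = np.cos(angles[:, 1::2])           # odd dimensions: cosine
    return pe

# The encodings have the same dimension as the token embeddings,
# so they can be added element-wise:
#   inputs = token_embeddings + sinusoidal_positional_encoding(seq_len, d_model)
```

Because each dimension pair oscillates at a different frequency, nearby positions receive similar encodings while distant positions diverge, which is the "nearby and far-apart relationships" intuition from the slide.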