fix reference
florian-huber committed Jun 24, 2024
1 parent 6c0300b commit 6ecace9
Showing 1 changed file with 1 addition and 1 deletion.
notebooks/24_NLP_4_ngrams_word_vectors.ipynb
@@ -2649,7 +2649,7 @@
 "\n",
 "A more fundamental limitation of Word2Vec and similar algorithms lies in the underlying bag-of-words approach, which removes information related to the order of words. Even constructs like n-grams can only compensate for extremely local patterns, such as differentiating \"do not like\" from \"do like\".\n",
 "\n",
-"In contrast, deep learning techniques like recurrent neural networks and, more powerfully, **transformers**, can learn patterns across many more words. Transformers, in particular, can learn patterns across entire pages of text {cite}`vaswani2017attention`, enabling models like ChatGPT and other large language models to use natural language with unprecedented subtlety. Models such as BERT (Bidirectional Encoder Representations from Transformers) {cite}`devlin2018bert` and GPT (Generative Pretrained Transformer) {cite}`radford2018improvin`g produce contextualized representations of words within a given context, taking the entire sentence or paragraph into account rather than generating static word embeddings.\n",
+"In contrast, deep learning techniques like recurrent neural networks and, more powerfully, **transformers**, can learn patterns across many more words. Transformers, in particular, can learn patterns across entire pages of text {cite}`vaswani2017attention`, enabling models like ChatGPT and other large language models to use natural language with unprecedented subtlety. Models such as BERT (Bidirectional Encoder Representations from Transformers) {cite}`devlin2018bert` and GPT (Generative Pretrained Transformer) {cite}`radford2018improving` produce contextualized representations of words within a given context, taking the entire sentence or paragraph into account rather than generating static word embeddings.\n",
 "\n",
 "In conclusion, while TF-IDF and n-grams offer a solid start, word embeddings like those produced by Word2Vec and contextualized representations from transformers provide more advanced methods for working with text by considering context and semantic meaning."
 ]
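
The changed cell contrasts static word embeddings with contextualized ones. A minimal sketch of that distinction, assuming the Hugging Face `transformers` package, PyTorch, and the `bert-base-uncased` checkpoint (the helper `embed_word` is illustrative, not code from the notebook): the same word is embedded in two contexts and the resulting vectors compared.

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def embed_word(sentence: str, word: str) -> torch.Tensor:
    """Return the contextualized vector of `word` within `sentence`."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        # Last-layer hidden states: one 768-dim vector per token.
        hidden = model(**inputs).last_hidden_state[0]
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
    return hidden[tokens.index(word)]

# The same surface word receives different vectors in different contexts,
# unlike a static Word2Vec embedding, which is identical everywhere.
v_river = embed_word("She sat on the river bank.", "bank")
v_money = embed_word("He deposited money at the bank.", "bank")
sim = torch.cosine_similarity(v_river, v_money, dim=0).item()
print(f"Cosine similarity between the two 'bank' vectors: {sim:.2f}")
```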
