diff --git a/Bibliographie.qmd b/Bibliographie.qmd
index b82de84..bcf5c65 100644
--- a/Bibliographie.qmd
+++ b/Bibliographie.qmd
@@ -49,6 +49,31 @@ title: Bibliographie
 - [Principled Instructions Are All You Need for Questioning LLaMA-1/2, GPT-3.5/4](https://arxiv.org/abs/2312.16171)
 - [Graph of Thoughts](https://arxiv.org/pdf/2308.09687)
 
+**Evaluation (metrics)**
+
+| Embedding-based | Fine-tuned-model-based | LLM-based |
+|--|--|--|
+| [BERTScore](https://arxiv.org/abs/1904.09675) | [UniEval](https://arxiv.org/abs/2210.07197) | [G-Eval](https://arxiv.org/abs/2303.16634) |
+| [MoverScore](https://arxiv.org/abs/1909.02622) | [Lynx](https://www.patronus.ai/blog/lynx-state-of-the-art-open-source-hallucination-detection-model) | [GPTScore](https://arxiv.org/abs/2302.04166) |
+| | [Prometheus-eval](https://github.com/prometheus-eval/prometheus-eval) | |
+
+**Evaluation (frameworks)**
+- [Ragas](https://github.com/explodinggradients/ragas) (specialized for RAG)
+- [Ares](https://github.com/stanford-futuredata/ARES) (specialized for RAG)
+- [Giskard](https://github.com/Giskard-AI/giskard)
+- [DeepEval](https://github.com/confident-ai/deepeval)
+
+**Evaluation (RAG)**
+- [Evaluation of Retrieval-Augmented Generation: A Survey](https://arxiv.org/abs/2405.07437)
+- [Automated Evaluation of Retrieval-Augmented Language Models with Task-Specific Exam Generation](https://arxiv.org/abs/2405.13622)
+
+
+**Evaluation (miscellaneous)**
+- [Prompting strategies for LLM-based metrics](https://arxiv.org/abs/2311.03754)
+- [LLM-based NLG Evaluation: Current Status and Challenges](https://arxiv.org/abs/2402.01383)
+- [Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena](https://arxiv.org/abs/2306.05685)
+
+
 
 
 ### Libraries and resources