Built site for gh-pages

etalab · Nov 7, 2024 · a24a50f
1 parent 8b46fa2

Showing 5 changed files with 59 additions and 29 deletions.
diff --git a/.nojekyll b/.nojekyll
@@ -1 +1 @@
-f1e7062a
+ec8b97c3
diff --git a/Bibliographie.html b/Bibliographie.html
@@ -339,6 +339,36 @@ <h3 class="anchored" data-anchor-id="articles-de-recherche-centraux">Articles de
 <li><a href="https://arxiv.org/abs/2312.16171">Principled Instructions Are All You Need for Questioning LLaMA-1/2, GPT-3.5/4</a></li>
 <li><a href="https://arxiv.org/pdf/2308.09687">Graph of Thoughts</a></li>
 </ul>
+<p><strong>Evaluation (metrics)</strong></p>
+<table class="caption-top table">
+<thead>
+<tr class="header">
+<th>Embedding-based</th>
+<th>Fine-tuned model-based</th>
+<th>LLM-based</th>
+</tr>
+</thead>
+<tbody>
+<tr class="odd">
+<td><a href="https://arxiv.org/abs/1904.09675">BERTScore</a></td>
+<td><a href="https://arxiv.org/abs/2210.07197">UniEval</a></td>
+<td><a href="https://arxiv.org/abs/2303.16634">G-Eval</a></td>
+</tr>
+<tr class="even">
+<td><a href="https://arxiv.org/abs/1909.02622">MoverScore</a></td>
+<td><a href="https://www.patronus.ai/blog/lynx-state-of-the-art-open-source-hallucination-detection-model">Lynx</a></td>
+<td><a href="https://arxiv.org/abs/2302.04166">GPTScore</a></td>
+</tr>
+<tr class="odd">
+<td></td>
+<td><a href="https://github.com/prometheus-eval/prometheus-eval">Prometheus-eval</a></td>
+<td></td>
+</tr>
+</tbody>
+</table>
+<p><strong>Evaluation (frameworks)</strong> - <a href="https://github.com/explodinggradients/ragas">Ragas</a> (specialized for RAG) - <a href="https://github.com/stanford-futuredata/ARES">Ares</a> (specialized for RAG) - <a href="https://github.com/Giskard-AI/giskard">Giskard</a> - <a href="https://github.com/confident-ai/deepeval">DeepEval</a></p>
+<p><strong>Evaluation (RAG)</strong> - <a href="https://arxiv.org/abs/2405.07437">Evaluation of Retrieval-Augmented Generation: A Survey</a> - <a href="https://arxiv.org/abs/2405.13622">Automated Evaluation of Retrieval-Augmented Language Models with Task-Specific Exam Generation</a></p>
+<p><strong>Evaluation (miscellaneous)</strong> - <a href="https://arxiv.org/abs/2311.03754">Prompting strategies for LLM-based metrics</a> - <a href="https://arxiv.org/abs/2402.01383">LLM-based NLG Evaluation: Current Status and Challenges</a> - <a href="https://arxiv.org/abs/2306.05685">Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena</a></p>
 </section>
 <section id="librairies-et-ressources" class="level3">
 <h3 class="anchored" data-anchor-id="librairies-et-ressources">Libraries and resources</h3>

diff --git a/IV-Exemples/2_Classification_accords_entreprise.html b/IV-Exemples/2_Classification_accords_entreprise.html
@@ -347,7 +347,7 @@ <h3 class="anchored" data-anchor-id="récupération-des-données">Récupération
 <p>The data are available on <a href="https://www.legifrance.gouv.fr/search/acco?tab_selection=acco&amp;searchField=ALL&amp;query=%2A&amp;searchType=ALL&amp;typePagination=DEFAULT&amp;sortValue=PERTINENCE&amp;pageSize=25&amp;page=1#acco">Légifrance</a>. The full stock of texts is also published on the <a href="https://echanges.dila.gouv.fr/OPENDATA/ACCO/">DILA FTP server</a>, and the declared topics can be found both in the XML metadata published alongside the texts and on Légifrance.</p>
 <p>For practical reasons, we will work with a <a href="https://minio.lab.sspcloud.fr/cthiounn2/Accords/10p_accords_publics_et_thematiques_240815.parquet">snapshot of the stock as of the first half of 2024</a> and with a <a href="https://minio.lab.sspcloud.fr/cthiounn2/Accords/10p_accords_publics_et_thematiques_240815_sample_of_1000.parquet">sample of 1,000 agreement texts</a>, converted to Parquet format.</p>
 <p>These data contain the agreement’s file number, which is a unique identifier, followed by the text and the declared topics, and finally each topic listed individually:</p>
-<div id="332edfb8" class="cell" data-execution_count="1">
+<div id="e73c8f91" class="cell" data-execution_count="1">
 <div class="cell-output cell-output-display">
 <div>
 
@@ -416,7 +416,7 @@ <h4 class="anchored" data-anchor-id="lecture-de-données">Lecture de données</h
 jupyter==1.1.1
 ipykernel==6.29.5</code></pre>
 <p>In this example, we extract 10 texts to keep the run fast:</p>
-<div id="e51c3c30" class="cell" data-execution_count="2">
+<div id="fa11ab03" class="cell" data-execution_count="2">
 <div class="sourceCode cell-code" id="cb5"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb5-1"><a href="#cb5-1" aria-hidden="true" tabindex="-1"></a><span class="im">import</span> json</span>
 <span id="cb5-2"><a href="#cb5-2" aria-hidden="true" tabindex="-1"></a><span class="im">import</span> numpy <span class="im">as</span> np</span>
 <span id="cb5-3"><a href="#cb5-3" aria-hidden="true" tabindex="-1"></a><span class="im">import</span> pandas <span class="im">as</span> pd</span>
@@ -445,7 +445,7 @@ <h4 class="anchored" data-anchor-id="lecture-de-données">Lecture de données</h
 <section id="vectoriser-nos-textes-avec-chromadb" class="level4">
 <h4 class="anchored" data-anchor-id="vectoriser-nos-textes-avec-chromadb">Vectorizing our texts with ChromaDB</h4>
 <p>To vectorize our texts, we use ChromaDB, which integrates with LangChain. We split each text at blank lines into chunks of about 3,000 characters, each of which corresponds to a paragraph. These chunks (here, paragraphs) are stored in a vector store, with the file number and the paragraph number as metadata.</p>
-<div id="5bab0767" class="cell" data-execution_count="3">
+<div id="1d2d3416" class="cell" data-execution_count="3">
 <div class="sourceCode cell-code" id="cb6"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb6-1"><a href="#cb6-1" aria-hidden="true" tabindex="-1"></a>text_splitter <span class="op">=</span> CharacterTextSplitter(</span>
 <span id="cb6-2"><a href="#cb6-2" aria-hidden="true" tabindex="-1"></a>    separator<span class="op">=</span><span class="st">"</span><span class="ch">\n\n</span><span class="st">"</span>,</span>
 <span id="cb6-3"><a href="#cb6-3" aria-hidden="true" tabindex="-1"></a>    chunk_size<span class="op">=</span><span class="dv">3000</span>,</span>
@@ -472,7 +472,7 @@ <h4 class="anchored" data-anchor-id="vectoriser-nos-textes-avec-chromadb">Vector
 <section id="interroger-un-llm-en-mode-api" class="level4">
 <h4 class="anchored" data-anchor-id="interroger-un-llm-en-mode-api">Querying an LLM through an API</h4>
 <p>To query the LLM, we build a class that generates the requests and processes the responses:</p>
-<div id="034395fa" class="cell" data-execution_count="4">
+<div id="475dfd2f" class="cell" data-execution_count="4">
 <div class="sourceCode cell-code" id="cb7"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb7-1"><a href="#cb7-1" aria-hidden="true" tabindex="-1"></a>MODEL<span class="op">=</span><span class="st">"llama3.1"</span></span>
 <span id="cb7-2"><a href="#cb7-2" aria-hidden="true" tabindex="-1"></a></span>
 <span id="cb7-3"><a href="#cb7-3" aria-hidden="true" tabindex="-1"></a></span>
@@ -493,7 +493,7 @@ <h4 class="anchored" data-anchor-id="interroger-un-llm-en-mode-api">Interroger u
 <span id="cb7-18"><a href="#cb7-18" aria-hidden="true" tabindex="-1"></a>    llm <span class="op">=</span> LocalOllamaLLM(api_url<span class="op">=</span><span class="st">"http://127.0.0.1:11434"</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
 <p>We also define a basic prompt, which can be improved later, and a LangChain chain linking the prompt to the LLM:</p>
-<div id="87686ce8" class="cell" data-execution_count="5">
+<div id="8750875f" class="cell" data-execution_count="5">
 <div class="sourceCode cell-code" id="cb8"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb8-1"><a href="#cb8-1" aria-hidden="true" tabindex="-1"></a>system_prompt <span class="op">=</span> (</span>
 <span id="cb8-2"><a href="#cb8-2" aria-hidden="true" tabindex="-1"></a>    <span class="st">" Répondez à la question posée "</span></span>
 <span id="cb8-3"><a href="#cb8-3" aria-hidden="true" tabindex="-1"></a>    <span class="st">" Utilisez le contexte (sélection des meilleurs paragraphes liés à la question) donné pour répondre à la question "</span></span>
@@ -510,7 +510,7 @@ <h4 class="anchored" data-anchor-id="interroger-un-llm-en-mode-api">Interroger u
 <span id="cb8-14"><a href="#cb8-14" aria-hidden="true" tabindex="-1"></a>question_answer_chain <span class="op">=</span> create_stuff_documents_chain(llm, prompt)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 </div>
 <p>We define a function that performs the RAG: it runs a similarity search against the question, then submits the augmented prompt to the LLM for an answer:</p>
-<div id="8d4c785f" class="cell" data-execution_count="6">
+<div id="088a55eb" class="cell" data-execution_count="6">
 <div class="sourceCode cell-code" id="cb9"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb9-1"><a href="#cb9-1" aria-hidden="true" tabindex="-1"></a><span class="kw">def</span> search_and_invoke_llm(vector_store,index,query,k<span class="op">=</span><span class="dv">5</span>):</span>
 <span id="cb9-2"><a href="#cb9-2" aria-hidden="true" tabindex="-1"></a>    <span class="cf">if</span> k<span class="op">==</span><span class="dv">0</span>:</span>
 <span id="cb9-3"><a href="#cb9-3" aria-hidden="true" tabindex="-1"></a>        <span class="bu">print</span>(<span class="ss">f"bug with </span><span class="sc">{</span>index<span class="sc">}</span><span class="ss">"</span>)</span>
@@ -535,7 +535,7 @@ <h4 class="anchored" data-anchor-id="interroger-un-llm-en-mode-api">Interroger u
 <section id="automatiser-la-classification-sur-lensemble-des-thématiques" class="level4">
 <h4 class="anchored" data-anchor-id="automatiser-la-classification-sur-lensemble-des-thématiques">Automating classification across all topics</h4>
 <p>Here we automate the classification as one binary classification per topic: we ask a “yes or no” question and infer yes if the answer starts with “oui” (yes), and no otherwise.</p>
-<div id="dae50f88" class="cell" data-execution_count="7">
+<div id="3dbb0a12" class="cell" data-execution_count="7">
 <div class="sourceCode cell-code" id="cb10"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb10-1"><a href="#cb10-1" aria-hidden="true" tabindex="-1"></a>THEMATIQUES<span class="op">=</span>{</span>
 <span id="cb10-2"><a href="#cb10-2" aria-hidden="true" tabindex="-1"></a>    <span class="st">"accord_methode_penibilite"</span>:<span class="st">"Accords de méthode (pénibilité)"</span>,</span>
 <span id="cb10-3"><a href="#cb10-3" aria-hidden="true" tabindex="-1"></a><span class="st">"accord_methode_pse"</span>:<span class="st">"Accords de méthode (PSE)"</span>,</span>
@@ -626,7 +626,7 @@ <h4 class="anchored" data-anchor-id="automatiser-la-classification-sur-lensemble
 <section id="evaluation" class="level3">
 <h3 class="anchored" data-anchor-id="evaluation">Evaluation</h3>
 <p>We evaluate the performance of this simple solution by displaying the confusion matrix and the various metrics for each topic:</p>
-<div id="a0b3e232" class="cell" data-execution_count="8">
+<div id="9ba770f5" class="cell" data-execution_count="8">
 <div class="sourceCode cell-code" id="cb11"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb11-1"><a href="#cb11-1" aria-hidden="true" tabindex="-1"></a><span class="im">import</span> numpy <span class="im">as</span> np</span>
 <span id="cb11-2"><a href="#cb11-2" aria-hidden="true" tabindex="-1"></a><span class="im">from</span> sklearn.metrics <span class="im">import</span> confusion_matrix</span>
 <span id="cb11-3"><a href="#cb11-3" aria-hidden="true" tabindex="-1"></a><span class="im">from</span> sklearn.metrics <span class="im">import</span> accuracy_score, precision_score, recall_score, f1_score, classification_report</span>

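Editor's note on 2_Classification_accords_entreprise.html: the page describes inferring a binary label from whether the LLM answer starts with "oui", then scoring each topic with a confusion matrix and the usual metrics. The sketch below is one possible reading of that rule in plain Python, not the code from the page (which uses scikit-learn). The function names `answer_to_label`, `confusion_counts` and `precision_recall_f1`, the whitespace and case normalization, and the sample answers are all assumptions made for illustration.

```python
def answer_to_label(answer: str) -> int:
    """Return 1 if the answer starts with "oui", else 0.

    Assumption: leading whitespace and letter case are ignored.
    """
    return int(answer.strip().lower().startswith("oui"))


def confusion_counts(y_true, y_pred):
    """Count (tn, fp, fn, tp) for two sequences of 0/1 labels."""
    tn = fp = fn = tp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth and pred:
            tp += 1
        elif truth and not pred:
            fn += 1
        elif pred:
            fp += 1
        else:
            tn += 1
    return tn, fp, fn, tp


def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall and F1, returning 0.0 when undefined."""
    _, fp, fn, tp = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical LLM answers for one topic, with made-up declared labels.
answers = ["Oui, l'accord traite de ce sujet.", "Non.", " oui", "Je ne sais pas"]
declared = [1, 0, 0, 1]
predicted = [answer_to_label(a) for a in answers]

print(predicted)
print(confusion_counts(declared, predicted))
print(precision_recall_f1(declared, predicted))
```

With this rule, any answer that does not begin with "oui" (including hedged or off-format answers) counts as a negative, which may depress recall; a stricter output format in the prompt would be one way to limit that.
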
diff --git a/search.json b/search.json
@@ -286,7 +286,7 @@
     "href": "Bibliographie.html#ii---développement",
     "title": "Bibliographie",
     "section": "II - Développement",
-    "text": "II - Développement\n\nPlateforme de partage de modèles\n\nHuggingFace\n\n\n\nArticles de recherche centraux\nTransformers\n\nPapier original ‘Attention Is All You Need’\nExplication illustrée et très détaillée\nLes différents types de modèles\nLes Mixture of Experts\n\nFine-tuning\n\nLoRA\nQLoRA\nDoRA\nIntroduction au RLHF\nDPO\nKTO\n\nBonnes pratiques du prompt engineering\n\nPrincipled Instructions Are All You Need for Questioning LLaMA-1/2, GPT-3.5/4\nGraph of Thoughts\n\n\n\nLibrairies et ressources\nLLM platform - Ollama\nPipelines et orchestration LLM - LangChain - LlamaIndex - Haystack\nRAG - Graph RAG\nEvaluation - SelfCheckGPT",
+    "text": "II - Développement\n\nPlateforme de partage de modèles\n\nHuggingFace\n\n\n\nArticles de recherche centraux\nTransformers\n\nPapier original ‘Attention Is All You Need’\nExplication illustrée et très détaillée\nLes différents types de modèles\nLes Mixture of Experts\n\nFine-tuning\n\nLoRA\nQLoRA\nDoRA\nIntroduction au RLHF\nDPO\nKTO\n\nBonnes pratiques du prompt engineering\n\nPrincipled Instructions Are All You Need for Questioning LLaMA-1/2, GPT-3.5/4\nGraph of Thoughts\n\nEvaluation (métriques)\n\n\n\nBasée sur embeddings\nBasée sur modèle fine-tuné\nBasé sur LLM\n\n\n\n\nBERTScore\nUniEval\nG-Eval\n\n\nMoverScore\nLynx\nGPTScore\n\n\n\nPrometheus-eval\n\n\n\n\nEvaluation (frameworks) - Ragas (spécialisé pour le RAG) - Ares (spécialisé pour le RAG) - Giskard - DeepEval\nEvaluation (RAG) - Evaluation of Retrieval-Augmented Generation: A Survey - Automated Evaluation of Retrieval-Augmented Language Models with Task-Specific Exam Generation\nEvaluation (divers) - Prompting strategies for LLM-based metrics - LLM-based NLG Evaluation: Current Status and Challenges - Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena\n\n\nLibrairies et ressources\nLLM platform - Ollama\nPipelines et orchestration LLM - LangChain - LlamaIndex - Haystack\nRAG - Graph RAG\nEvaluation - SelfCheckGPT",
     "crumbs": [
       "Bibliographie"
     ]

diff --git a/sitemap.xml b/sitemap.xml
@@ -2,78 +2,78 @@
 <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
   <url>
     <loc>https://etalab.github.io/programme10pourcent-kallm/IV-Exemples/2_Classification_accords_entreprise.html</loc>
-    <lastmod>2024-11-07T15:43:22.864Z</lastmod>
+    <lastmod>2024-11-07T15:51:20.512Z</lastmod>
   </url>
   <url>
     <loc>https://etalab.github.io/programme10pourcent-kallm/I-Accompagnement/4_Impacts.html</loc>
-    <lastmod>2024-11-07T15:43:22.864Z</lastmod>
+    <lastmod>2024-11-07T15:51:20.512Z</lastmod>
   </url>
   <url>
     <loc>https://etalab.github.io/programme10pourcent-kallm/I-Accompagnement/2_Deja_Fait_Admin.html</loc>
-    <lastmod>2024-11-07T15:43:22.864Z</lastmod>
+    <lastmod>2024-11-07T15:51:20.512Z</lastmod>
   </url>
   <url>
     <loc>https://etalab.github.io/programme10pourcent-kallm/II-Developpements/4_Evaluations.html</loc>
-    <lastmod>2024-11-07T15:43:22.864Z</lastmod>
+    <lastmod>2024-11-07T15:51:20.512Z</lastmod>
   </url>
   <url>
     <loc>https://etalab.github.io/programme10pourcent-kallm/II-Developpements/0_Introduction.html</loc>
-    <lastmod>2024-11-07T15:43:22.864Z</lastmod>
+    <lastmod>2024-11-07T15:51:20.512Z</lastmod>
   </url>
   <url>
     <loc>https://etalab.github.io/programme10pourcent-kallm/II-Developpements/3_RAG.html</loc>
-    <lastmod>2024-11-07T15:43:22.864Z</lastmod>
+    <lastmod>2024-11-07T15:51:20.512Z</lastmod>
   </url>
   <url>
     <loc>https://etalab.github.io/programme10pourcent-kallm/notebooks/10p_RAG_OLLAMA.html</loc>
-    <lastmod>2024-11-07T15:43:22.880Z</lastmod>
+    <lastmod>2024-11-07T15:51:20.528Z</lastmod>
   </url>
   <url>
     <loc>https://etalab.github.io/programme10pourcent-kallm/III-Deploiements/1_Socle_minimal.html</loc>
-    <lastmod>2024-11-07T15:43:22.864Z</lastmod>
+    <lastmod>2024-11-07T15:51:20.512Z</lastmod>
   </url>
   <url>
     <loc>https://etalab.github.io/programme10pourcent-kallm/III-Deploiements/3_Socle_Production.html</loc>
-    <lastmod>2024-11-07T15:43:22.864Z</lastmod>
+    <lastmod>2024-11-07T15:51:20.512Z</lastmod>
   </url>
   <url>
     <loc>https://etalab.github.io/programme10pourcent-kallm/Reste_a_faire.html</loc>
-    <lastmod>2024-11-07T15:43:22.868Z</lastmod>
+    <lastmod>2024-11-07T15:51:20.512Z</lastmod>
   </url>
   <url>
     <loc>https://etalab.github.io/programme10pourcent-kallm/III-Deploiements/2_Socle_avance.html</loc>
-    <lastmod>2024-11-07T15:43:22.864Z</lastmod>
+    <lastmod>2024-11-07T15:51:20.512Z</lastmod>
   </url>
   <url>
     <loc>https://etalab.github.io/programme10pourcent-kallm/III-Deploiements/4_Infras_administrations.html</loc>
-    <lastmod>2024-11-07T15:43:22.864Z</lastmod>
+    <lastmod>2024-11-07T15:51:20.512Z</lastmod>
   </url>
   <url>
     <loc>https://etalab.github.io/programme10pourcent-kallm/Bibliographie.html</loc>
-    <lastmod>2024-11-07T15:43:22.864Z</lastmod>
+    <lastmod>2024-11-07T15:51:20.512Z</lastmod>
   </url>
   <url>
     <loc>https://etalab.github.io/programme10pourcent-kallm/notebooks/autres/parse_llama31_results.html</loc>
-    <lastmod>2024-11-07T15:43:22.880Z</lastmod>
+    <lastmod>2024-11-07T15:51:20.528Z</lastmod>
   </url>
   <url>
     <loc>https://etalab.github.io/programme10pourcent-kallm/II-Developpements/1_Anatomie_LLM.html</loc>
-    <lastmod>2024-11-07T15:43:22.864Z</lastmod>
+    <lastmod>2024-11-07T15:51:20.512Z</lastmod>
   </url>
   <url>
     <loc>https://etalab.github.io/programme10pourcent-kallm/II-Developpements/2_Utilisation_LLM.html</loc>
-    <lastmod>2024-11-07T15:43:22.864Z</lastmod>
+    <lastmod>2024-11-07T15:51:20.512Z</lastmod>
   </url>
   <url>
     <loc>https://etalab.github.io/programme10pourcent-kallm/index.html</loc>
-    <lastmod>2024-11-07T15:43:22.880Z</lastmod>
+    <lastmod>2024-11-07T15:51:20.528Z</lastmod>
   </url>
   <url>
     <loc>https://etalab.github.io/programme10pourcent-kallm/I-Accompagnement/1_cas_usage.html</loc>
-    <lastmod>2024-11-07T15:43:22.864Z</lastmod>
+    <lastmod>2024-11-07T15:51:20.512Z</lastmod>
   </url>
   <url>
     <loc>https://etalab.github.io/programme10pourcent-kallm/I-Accompagnement/3_Acculturation.html</loc>
-    <lastmod>2024-11-07T15:43:22.864Z</lastmod>
+    <lastmod>2024-11-07T15:51:20.512Z</lastmod>
   </url>
 </urlset>