Commit

Add missing task links
Sypherd committed Nov 1, 2024
1 parent 4f8e479 commit 26c93b3
Showing 1 changed file with 2 additions and 1 deletion.
3 changes: 2 additions & 1 deletion lm_eval/tasks/README.md
@@ -73,7 +73,8 @@
| medqa | Multiple choice question answering based on the United States Medical License Exams. | |
| [mgsm](mgsm/README.md) | Benchmark of multilingual grade-school math problems. | Spanish, French, German, Russian, Chinese, Japanese, Thai, Swahili, Bengali, Telugu |
| [minerva_math](minerva_math/README.md) | Mathematics-focused tasks requiring numerical reasoning and problem-solving skills. | English |
-| mmlu | Massive Multitask Language Understanding benchmark for broad domain language evaluation. Several variants are supported. | English |
+| [mmlu](mmlu/README.md) | Massive Multitask Language Understanding benchmark for broad domain language evaluation. Several variants are supported. | English |
+| [mmlu_pro](mmlu_pro/README.md) | A refined set of MMLU, integrating more challenging, reasoning-focused questions and expanding the choice set from four to ten options. | English |
| [mmlusr](mmlusr/README.md) | Variation of MMLU designed to be more rigorous. | English |
| model_written_evals | Evaluation tasks auto-generated for evaluating a collection of AI Safety concerns. | |
| [mutual](mutual/README.md) | A retrieval-based dataset for multi-turn dialogue reasoning. | English |