Actions: EleutherAI/lm-evaluation-harness

Tasks Modified

2,935 workflow runs

Add new benchmark: Galician bench
Tasks Modified #3453: Pull request #2155 synchronize by zxcvuser
September 27, 2024 15:58 Action required zxcvuser:galician_bench
Add new benchmark: Basque bench
Tasks Modified #3452: Pull request #2153 synchronize by zxcvuser
September 27, 2024 15:52 Action required zxcvuser:basque_bench
Add new benchmark: Portuguese bench
Tasks Modified #3451: Pull request #2156 synchronize by zxcvuser
September 27, 2024 15:35 4m 13s zxcvuser:portuguese_bench
Add new benchmark: Galician bench
Tasks Modified #3445: Pull request #2155 synchronize by zxcvuser
September 27, 2024 10:13 Action required zxcvuser:galician_bench
Add new benchmark: Basque bench
Tasks Modified #3444: Pull request #2153 synchronize by zxcvuser
September 27, 2024 10:13 Action required zxcvuser:basque_bench
Add metabench task to LM Evaluation Harness
Tasks Modified #3442: Pull request #2357 synchronize by kozzy97
September 27, 2024 07:12 2m 4s kozzy97:metabench
openai: better error messages; fix greedy matching (#2327)
Tasks Modified #3441: Commit 1bc6c93 pushed by haileyschoelkopf
September 26, 2024 19:58 13s main
add mmlu readme (#2282)
Tasks Modified #3440: Commit 00f5537 pushed by haileyschoelkopf
September 26, 2024 19:54 1m 28s main
Added TurkishMMLU to LM Evaluation Harness (#2283)
Tasks Modified #3439: Commit deb4328 pushed by haileyschoelkopf
September 26, 2024 19:27 2m 17s main
Added TurkishMMLU to LM Evaluation Harness
Tasks Modified #3438: Pull request #2283 synchronize by haileyschoelkopf
September 26, 2024 19:22 1m 40s ArdaYueksel:turkish-mmlu
openai: better error messages; fix greedy matching
Tasks Modified #3437: Pull request #2327 synchronize by baberabb
September 26, 2024 19:22 10s openai
openai: better error messages; fix greedy matching
Tasks Modified #3436: Pull request #2327 synchronize by baberabb
September 26, 2024 19:19 16s openai
openai: better error messages; fix greedy matching
Tasks Modified #3435: Pull request #2327 synchronize by baberabb
September 26, 2024 19:18 15s openai
Added TurkishMMLU to LM Evaluation Harness
Tasks Modified #3434: Pull request #2283 synchronize by haileyschoelkopf
September 26, 2024 19:11 1m 45s ArdaYueksel:turkish-mmlu
mmlu-pro: add newlines to task descriptions (not leaderboard) (#2334)
Tasks Modified #3433: Commit 558d0d7 pushed by haileyschoelkopf
September 26, 2024 19:08 2m 16s main
mmlu-pro: add newlines to task descriptions (not leaderboard)
Tasks Modified #3432: Pull request #2334 synchronize by haileyschoelkopf
September 26, 2024 19:07 1m 56s mmlupro_
change glianorex to test split (#2332)
Tasks Modified #3431: Commit 7d24238 pushed by haileyschoelkopf
September 26, 2024 18:57 1m 32s main
change group to tags in task eus_exams task configs (#2320)
Tasks Modified #3430: Commit af92448 pushed by haileyschoelkopf
September 26, 2024 18:56 1m 28s main
fix cost_estimate script
Tasks Modified #3429: Pull request #2359 synchronize by baberabb
September 26, 2024 15:08 12s cost
fix cost_estimate script
Tasks Modified #3428: Pull request #2359 opened by baberabb
September 26, 2024 14:57 16s cost
Treat tags in python tasks the same as yaml tasks (#2288)
Tasks Modified #3427: Commit b2bf7bc pushed by haileyschoelkopf
September 26, 2024 14:03 1m 36s main
Add metabench task to LM Evaluation Harness
Tasks Modified #3426: Pull request #2357 opened by kozzy97
September 26, 2024 13:39 2m 1s kozzy97:metabench