Actions: EleutherAI/lm-evaluation-harness

Unit Tests

3,128 workflow runs

allow fewshots for multimodal tasks
Unit Tests #3631: Pull request #2450 synchronize by artemorloff
November 1, 2024 18:48 3m 40s artemorloff:feature/multimodal_sampler
Add missing task links (#2449)
Unit Tests #3629: Commit ade1cc4 pushed by baberabb
November 1, 2024 13:54 6m 42s main
Add missing task links
Unit Tests #3628: Pull request #2449 opened by Sypherd
November 1, 2024 11:44 6m 31s Sypherd:main
add Russian mmlu
Unit Tests #3627: Pull request #2378 synchronize by tatiana-iazykova
November 1, 2024 09:55 Action required tatiana-iazykova:main
Add Japanese Leaderboard
Unit Tests #3626: Pull request #2439 synchronize by sitfoxfly
November 1, 2024 05:36 6m 33s sitfoxfly:japanese_leaderboard
Add GPTQModel support for evaluating GPTQ models (#2217)
Unit Tests #3625: Commit 4f8e479 pushed by baberabb
October 31, 2024 16:15 6m 42s main
OpenAI ChatCompletions: switch max_tokens
Unit Tests #3623: Pull request #2443 synchronize by baberabb
October 31, 2024 09:14 6m 10s openaichat
Add Aggregation for Kobest Benchmark
Unit Tests #3622: Pull request #2446 synchronize by tryumanshow
October 31, 2024 04:46 6m 16s tryumanshow:kobest-agg
Add Aggregation for Kobest Benchmark
Unit Tests #3621: Pull request #2446 synchronize by tryumanshow
October 31, 2024 04:43 6m 6s tryumanshow:kobest-agg
Add Aggregation for Kobest Benchmark
Unit Tests #3620: Pull request #2446 synchronize by tryumanshow
October 31, 2024 04:40 6m 19s tryumanshow:kobest-agg
Add Aggregation for Kobest Benchmark
Unit Tests #3619: Pull request #2446 synchronize by tryumanshow
October 31, 2024 04:39 2m 32s tryumanshow:kobest-agg
Add Aggregation for Kobest Benchmark
Unit Tests #3618: Pull request #2446 synchronize by tryumanshow
October 31, 2024 04:35 6m 17s tryumanshow:kobest-agg
Add Aggregation for Kobest Benchmark
Unit Tests #3617: Pull request #2446 opened by tryumanshow
October 31, 2024 04:29 6m 27s tryumanshow:kobest-agg
mlx Model (loglikelihood & generate_until)
Unit Tests #3616: Pull request #1902 synchronize by chimezie
October 30, 2024 20:00 Action required chimezie:mlx
OpenAI ChatCompletions: switch max_tokens
Unit Tests #3615: Pull request #2443 synchronize by baberabb
October 30, 2024 15:11 5m 49s openaichat
Add verify_certificate argument to local-completion (#2440)
Unit Tests #3614: Commit 57272b6 pushed by baberabb
October 30, 2024 14:42 6m 37s main
Add xquad task (#2435)
Unit Tests #3613: Commit b40a20a pushed by baberabb
October 30, 2024 14:36 6m 18s main
Add xquad task
Unit Tests #3612: Pull request #2435 synchronize by zxcvuser
October 30, 2024 14:26 5m 49s zxcvuser:add_xquad_task
Add Japanese Leaderboard
Unit Tests #3611: Pull request #2439 synchronize by sitfoxfly
October 30, 2024 14:09 6m 2s sitfoxfly:japanese_leaderboard
OpenAI ChatCompletions: switch max_tokens
Unit Tests #3610: Pull request #2443 synchronize by baberabb
October 30, 2024 13:30 6m 58s openaichat
Add verify_certificate argument to local-completion
Unit Tests #3609: Pull request #2440 synchronize by sjmonson
October 30, 2024 13:20 6m 50s sjmonson:fix/custom_certs
Add Japanese Leaderboard
Unit Tests #3608: Pull request #2439 synchronize by sitfoxfly
October 30, 2024 13:03 6m 17s sitfoxfly:japanese_leaderboard
Add Japanese Leaderboard
Unit Tests #3607: Pull request #2439 synchronize by sitfoxfly
October 30, 2024 12:50 Action required sitfoxfly:japanese_leaderboard