ML.ENERGY Leaderboard

How much energy do LLMs consume?

This README focuses on explaining how to run the benchmark yourself. The actual leaderboard is here: https://ml.energy/leaderboard.

Colosseum

We instrumented Hugging Face TGI (Text Generation Inference) so that it measures and returns GPU energy consumption. Our controller server receives user prompts from the Gradio app, randomly selects two models, and streams their responses back along with their energy consumption.
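The random pairing step can be sketched as below. This is only an illustration and not the controller's actual code: `pick_two_models` and the example model list are hypothetical names introduced here, and the sketch assumes uniform sampling without replacement.

```python
import random
from typing import Optional, Sequence, Tuple


def pick_two_models(
    available: Sequence[str], rng: Optional[random.Random] = None
) -> Tuple[str, str]:
    """Pick two distinct models uniformly at random (hypothetical helper)."""
    unique = sorted(set(available))  # drop duplicates so the pair is always distinct
    if len(unique) < 2:
        raise ValueError("need at least two distinct models to compare")
    rng = rng or random.Random()
    first, second = rng.sample(unique, 2)  # sampling without replacement
    return first, second


# Model names reused from this README; the list itself is an assumption.
models = ["lmsys/vicuna-13B", "lmsys/vicuna-7B", "databricks/dolly-v2-12b"]
left, right = pick_two_models(models, rng=random.Random(0))
print(left, right)
```

Passing a seeded `random.Random` makes the selection reproducible, which is convenient when testing.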

Setup for benchmarking

Model weights

For models that are directly accessible on Hugging Face Hub, you don't need to do anything.
For other models, convert them to Hugging Face format and place them under /data/leaderboard/weights, for example at /data/leaderboard/weights/lmsys/vicuna-13B. The last two path components (e.g., lmsys/vicuna-13B) are taken as the name of the model.
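A minimal sketch of the naming rule, assuming POSIX-style paths. `model_name_from_path` is a hypothetical helper written for this example; the benchmark scripts may derive the name differently.

```python
from pathlib import PurePosixPath


def model_name_from_path(weights_path: str) -> str:
    """Return the last two path components joined by '/' (hypothetical helper)."""
    # Ignore the root component so that short absolute paths are handled sanely.
    parts = [p for p in PurePosixPath(weights_path).parts if p != "/"]
    if len(parts) < 2:
        raise ValueError(f"expected at least two path components: {weights_path!r}")
    return "/".join(parts[-2:])


print(model_name_from_path("/data/leaderboard/weights/lmsys/vicuna-13B"))
```

A Hugging Face Hub ID such as databricks/dolly-v2-12b already has this two-component form, so the same function leaves it unchanged.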

Docker container

Our pre-built Docker image is published with the tag mlenergy/leaderboard:latest (Dockerfile).

$ docker run -it \
    --name leaderboard0 \
    --gpus '"device=0"' \
    -v /path/to/your/data/dir:/data/leaderboard \
    -v $(pwd):/workspace/leaderboard \
    mlenergy/leaderboard:latest bash

The container expects weights under /data/leaderboard/weights (e.g., /data/leaderboard/weights/lmsys/vicuna-7B) and sets the Hugging Face cache directory to /data/leaderboard/hfcache. To override the copy of the repository inside the container, mount your checkout at /workspace/leaderboard.
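Because the host data directory is mounted at /data/leaderboard, weights placed under `weights/` on the host appear at the path the container expects. The sketch below illustrates that mapping; the helper names are hypothetical, and /path/to/your/data/dir is the placeholder from the `docker run` command above, not a real path.

```python
from pathlib import PurePosixPath

# Paths inside the container, as described in this README.
CONTAINER_DATA_ROOT = PurePosixPath("/data/leaderboard")
CONTAINER_WEIGHTS_DIR = CONTAINER_DATA_ROOT / "weights"
CONTAINER_HF_CACHE = CONTAINER_DATA_ROOT / "hfcache"


def host_weights_dir(host_data_dir: str, model_name: str) -> PurePosixPath:
    """Where to place converted weights on the host (hypothetical helper)."""
    return PurePosixPath(host_data_dir) / "weights" / model_name


def container_weights_dir(model_name: str) -> PurePosixPath:
    """Where the same weights show up inside the container (hypothetical helper)."""
    return CONTAINER_WEIGHTS_DIR / model_name


print(host_weights_dir("/path/to/your/data/dir", "lmsys/vicuna-7B"))
print(container_weights_dir("lmsys/vicuna-7B"))
```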

Running the benchmark

We run benchmarks across multiple nodes and GPUs with Pegasus. Take a look at pegasus/ for details.

You can still run benchmarks without Pegasus like this:

$ docker exec leaderboard0 python scripts/benchmark.py --model-path /data/leaderboard/weights/lmsys/vicuna-13B --input-file sharegpt/sg_90k_part1_html_cleaned_lang_first_sampled_sorted.json
$ docker exec leaderboard0 python scripts/benchmark.py --model-path databricks/dolly-v2-12b --input-file sharegpt/sg_90k_part1_html_cleaned_lang_first_sampled_sorted.json
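To benchmark several models one after another without Pegasus, the commands above can be generated in a loop. The following is a minimal sketch, assuming a running container named leaderboard0; `benchmark_command` is a hypothetical helper, and it uses only the `--model-path` and `--input-file` flags shown above.

```python
import shlex
from typing import List

INPUT_FILE = "sharegpt/sg_90k_part1_html_cleaned_lang_first_sampled_sorted.json"


def benchmark_command(
    container: str, model_path: str, input_file: str = INPUT_FILE
) -> List[str]:
    """Build the `docker exec` argument list for one benchmark run."""
    return [
        "docker", "exec", container,
        "python", "scripts/benchmark.py",
        "--model-path", model_path,
        "--input-file", input_file,
    ]


model_paths = [
    "/data/leaderboard/weights/lmsys/vicuna-13B",  # local converted weights
    "databricks/dolly-v2-12b",                     # Hugging Face Hub model ID
]

for model_path in model_paths:
    cmd = benchmark_command("leaderboard0", model_path)
    print(shlex.join(cmd))
    # To actually run it: import subprocess; subprocess.run(cmd, check=True)
```

The loop only prints the commands; uncommenting the `subprocess.run` call executes them sequentially and stops at the first failure.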
