Add more detailed metrics to the LLM benchmarks #431

Merged

nv-hwoo merged 15 commits into main from hwoo-llm-offline

Nov 7, 2023

Contributor

nv-hwoo commented Nov 3, 2023

This PR makes two changes:

Support both online & offline LLM benchmarks (default is online)
Add more detailed metrics for latencies and throughputs

Users can run the offline LLM benchmark either by passing the --offline flag or by setting the "STREAM" input tensor to false in the input JSON file.
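As a rough illustration of the second option, the sketch below writes an input JSON file whose "STREAM" tensor is false. The top-level "data" list follows Perf Analyzer's input-data convention; the "PROMPT" tensor name, the prompt text, the helper name build_offline_input, and the file name offline_input.json are assumptions made for this example and may not match what profile.py actually generates.

```python
import json


def build_offline_input(prompt: str) -> dict:
    """Build input data with streaming disabled (hypothetical helper)."""
    return {
        "data": [
            {
                "PROMPT": [prompt],  # assumed tensor name, for illustration only
                "STREAM": [False],   # false selects the offline (non-streaming) path
            }
        ]
    }


if __name__ == "__main__":
    # The output file name is an assumption, not one used by the PR.
    with open("offline_input.json", "w") as f:
        json.dump(build_offline_input("Hello, my name is"), f, indent=2)
```

With streaming disabled, the response is expected to arrive as a whole rather than token by token, so per-token timings would not be observable in this mode.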

The new metrics include:

Max/min first token latency
p50/p90/p95/p99 first token latency
Avg/max/min generation latency
p50/p90/p95/p99 generation latency
Avg token latency
Avg/max/min end-to-end latency
Avg/max/min token throughput
p50/p90/p95/p99 token throughput
etc
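To make the list above concrete, here is a minimal sketch of how such statistics could be derived from per-request timestamps. It is not the code in profile.py: the function names, the nearest-rank percentile method, and the metric definitions (first-token latency as time to the first response, generation latency as time from the first to the last response, throughput as tokens per second over the end-to-end time) are all assumptions for illustration.

```python
import math
from statistics import mean


def percentile(values, p):
    """Nearest-rank percentile (p in 0..100) of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(values, unit):
    """Return avg/min/max and p50/p90/p95/p99 for one metric, with its unit."""
    stats = {"avg": mean(values), "min": min(values), "max": max(values)}
    for p in (50, 90, 95, 99):
        stats[f"p{p}"] = percentile(values, p)
    stats["unit"] = unit
    return stats


def request_metrics(send_ms, token_ms):
    """Per-request metrics from a send time and token arrival times (ms).

    The definitions here are assumptions, not taken from the PR.
    """
    end_to_end = token_ms[-1] - send_ms
    metrics = {
        "first_token_latency": token_ms[0] - send_ms,
        "end_to_end_latency": end_to_end,
        # tokens per second over the whole request
        "token_throughput": len(token_ms) / (end_to_end / 1000.0),
    }
    # Generation latency needs at least two tokens to be meaningful.
    if len(token_ms) >= 2:
        metrics["generation_latency"] = token_ms[-1] - token_ms[0]
    return metrics
```

For example, `summarize([m["first_token_latency"] for m in per_request], "ms")` would produce one row of first-token statistics, where `per_request` is a hypothetical list of `request_metrics` results.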

nv-hwoo added 8 commits

October 29, 2023 11:35


          Enable offline benchmark

077ee5b


          Save profile results

277f16d


          Support TRT-LLM

f120771


          Change latencies to milliseconds

07a9c9c


          Add more metrics to benchmark

29b3cbb


          Update document

28b0dba


          Remove TRTLLM support

72c491e


          Add units to for each metric

35c965c

nv-hwoo requested review from debermudez, matthewkotila and tgerdesnv

November 3, 2023 05:08


          Separate out results csv

ca1b024

matthewkotila reviewed

src/c++/perf_analyzer/docs/examples/profile.py


          Remove csv output

f7fb9f1

github-advanced-security bot found potential problems


matthewkotila reviewed


Contributor

matthewkotila left a comment

Done with my review. Looks pretty good overall. Thanks for working on this, Hyunjae 🙏

nv-hwoo added 3 commits

November 3, 2023 17:44


          Address feedback

e9e4a71


          Add more metrics and extract metric calculation to separate function

3fd5836


          Avoid loading data multiple times

d3c9011

nv-hwoo requested a review from matthewkotila

November 6, 2023 22:41

github-advanced-security bot found potential problems


matthewkotila approved these changes


nv-hwoo added 2 commits

November 6, 2023 16:05


          Do not output generation metrics when max tokens < 2

dcd6df6


          Fix codeql

6b8f310

nv-hwoo requested a review from matthewkotila

November 7, 2023 00:44

matthewkotila approved these changes


nv-hwoo merged commit 7afa27a into main

3 checks passed

nv-hwoo deleted the hwoo-llm-offline branch

November 7, 2023 16:57

nv-hwoo mentioned this pull request

[Bug fix] Avoid resetting token-to-token latencies list #432

Merged
