Commit

Address feedback
nv-hwoo committed Oct 26, 2023
1 parent 968a4c7 commit 54a4a09
Showing 2 changed files with 4 additions and 4 deletions.
2 changes: 1 addition & 1 deletion src/c++/perf_analyzer/docs/examples/profile.py
@@ -48,7 +48,7 @@ def print_benchmark_summary(results):
f"Average first-token latency: {avg_first_token_latency:.4f} sec"
)
output += (
- f", Average token-token latency: {avg_token_to_token_latency:.4f} sec"
+ f", Average token-to-token latency: {avg_token_to_token_latency:.4f} sec"
if avg_token_to_token_latency
else ""
)
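
For context, here is a minimal sketch of how a summary line like the one in this hunk might be assembled. The function name `format_summary_line` and its parameters are hypothetical and not taken from profile.py; only the f-string fragments mirror the diff. It assumes the token-to-token part is appended only when a value is available, as the conditional expression in the hunk suggests.

```python
def format_summary_line(prompt_size, avg_first_token_latency,
                        avg_token_to_token_latency=None):
    """Build one benchmark summary line (hypothetical helper, not from profile.py)."""
    output = (
        f"Prompt size: {prompt_size}, "
        f"Average first-token latency: {avg_first_token_latency:.4f} sec"
    )
    # Append the token-to-token part only when a truthy value is present;
    # None (or 0) leaves the line with the first-token latency alone.
    output += (
        f", Average token-to-token latency: {avg_token_to_token_latency:.4f} sec"
        if avg_token_to_token_latency
        else ""
    )
    return output


if __name__ == "__main__":
    print(format_summary_line(100, 0.0388, 0.0066))
```

Because the condition tests truthiness, a measured latency of exactly 0 would also be omitted; comparing against `None` would be the alternative if that case matters.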
6 changes: 3 additions & 3 deletions src/c++/perf_analyzer/docs/llm.md
@@ -123,9 +123,9 @@ python profile.py -m vllm --prompt-size-range 100 500 200 --max-tokens 256 --ign
# Sample output
# [ Benchmark Summary ]
- # Prompt size: 100, Average first-token latency: 0.0388 sec, Average token-token latency: 0.0066 sec
- # Prompt size: 300, Average first-token latency: 0.0431 sec, Average token-token latency: 0.0071 sec
- # Prompt size: 500, Average first-token latency: 0.0400 sec, Average token-token latency: 0.0070 sec
+ # Prompt size: 100, Average first-token latency: 0.0388 sec, Average token-to-token latency: 0.0066 sec
+ # Prompt size: 300, Average first-token latency: 0.0431 sec, Average token-to-token latency: 0.0071 sec
+ # Prompt size: 500, Average first-token latency: 0.0400 sec, Average token-to-token latency: 0.0070 sec
```
## Benchmark 3: Profiling In-Flight Batching