Fix sample output
nv-hwoo committed Oct 26, 2023
1 parent f695177 commit bd3127b
Showing 2 changed files with 9 additions and 6 deletions.
5 changes: 4 additions & 1 deletion src/c++/perf_analyzer/docs/examples/profile.py
@@ -222,7 +222,10 @@ def summarize_profile_results(args, prompts):
     print_benchmark_summary(results)
 
     if args.periodic_concurrency_range:
-        print("Saved in-flight benchmark plots @ 'inflight_batching_benchmark-*.png'.")
+        print(
+            "Saved in-flight batching benchmark plots "
+            "@ 'inflight_batching_benchmark-*.png'."
+        )
 
 
 def profile(args, export_file):
10 changes: 5 additions & 5 deletions src/c++/perf_analyzer/docs/llm.md
@@ -123,9 +123,9 @@ python profile.py -m vllm --prompt-size-range 100 500 200 --max-tokens 256 --ign
 # Sample output
 # [ Benchmark Summary ]
-# Prompt size: 100, Average first-token latency: 0.0388 sec, Average token-to-token latency: 0.0066 sec
-# Prompt size: 300, Average first-token latency: 0.0431 sec, Average token-to-token latency: 0.0071 sec
-# Prompt size: 500, Average first-token latency: 0.0400 sec, Average token-to-token latency: 0.0070 sec
+# Prompt size: 100, Average first-token latency: 0.0388 sec, Average total token-to-token latency: 0.0066 sec
+# Prompt size: 300, Average first-token latency: 0.0431 sec, Average total token-to-token latency: 0.0071 sec
+# Prompt size: 500, Average first-token latency: 0.0400 sec, Average total token-to-token latency: 0.0070 sec
 ```
 ## Benchmark 3: Profiling In-Flight Batching
@@ -164,9 +164,9 @@ python profile.py -m vllm --prompt-size-range 10 10 1 --periodic-concurrency-ran

 # Sample output
 # [ BENCHMARK SUMMARY ]
-# Prompt size: 10, Average first-token latency: 0.0799 sec, Average total token-token latency: 0.0324 sec
+# Prompt size: 10, Average first-token latency: 0.0799 sec, Average total token-to-token latency: 0.0324 sec
 #
-# Saved in-flight benchmark plots @ 'inflight_batching_benchmark-*.png'.
+# Saved in-flight batching benchmark plots @ 'inflight_batching_benchmark-*.png'.
 ```
 
 The resulting plot will look like