Update README template
dyastremsky committed Dec 13, 2024
1 parent 7b53c52 commit b827aa4
Showing 1 changed file with 2 additions and 0 deletions.
2 changes: 2 additions & 0 deletions templates/genai-perf-templates/README_template
@@ -33,6 +33,7 @@ generative AI models as served through an inference server.
For large language models (LLMs), GenAI-Perf provides metrics such as
[output token throughput](#output_token_throughput_metric),
[time to first token](#time_to_first_token_metric),
[time to second token](#time_to_second_token_metric),
[inter token latency](#inter_token_latency_metric), and
[request throughput](#request_throughput_metric).
For a full list of metrics please see the [Metrics section](#metrics).
@@ -355,6 +356,7 @@ the inference server.
| Metric | Description | Aggregations |
| - | - | - |
| <span id="time_to_first_token_metric">Time to First Token</span> | Time between when a request is sent and when its first response is received, one value per request in benchmark | Avg, min, max, p99, p90, p75 |
| <span id="time_to_second_token_metric">Time to Second Token</span> | Time between when the first streaming response is received and when the second streaming response is received, one value per request in benchmark | Avg, min, max, p99, p90, p75 |
| <span id="inter_token_latency_metric">Inter Token Latency</span> | Time between intermediate responses for a single request divided by the number of generated tokens of the latter response, one value per response per request in benchmark | Avg, min, max, p99, p90, p75 |
| Request Latency | Time between when a request is sent and when its final response is received, one value per request in benchmark | Avg, min, max, p99, p90, p75 |
| Output Sequence Length | Total number of output tokens of a request, one value per request in benchmark | Avg, min, max, p99, p90, p75 |
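
The definitions in the table above can be illustrated with a small sketch. This is not GenAI-Perf's implementation: the `RequestRecord` type, its field names, and the helper functions are hypothetical, and the code only follows the wording of the table, assuming each streamed response carries a known number of output tokens and that timestamps are in seconds.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RequestRecord:
    """Hypothetical record of one streamed request (not a GenAI-Perf type)."""
    sent_at: float                   # when the request was sent (seconds)
    response_times: List[float]      # arrival time of each streamed response
    tokens_per_response: List[int]   # output tokens carried by each response


def time_to_first_token(r: RequestRecord) -> float:
    # Time between sending the request and receiving its first response.
    return r.response_times[0] - r.sent_at


def time_to_second_token(r: RequestRecord) -> float:
    # Time between the first and the second streaming response.
    return r.response_times[1] - r.response_times[0]


def inter_token_latencies(r: RequestRecord) -> List[float]:
    # One value per response after the first: the gap to the previous
    # response divided by the token count of the latter response.
    # Assumption: responses that carry zero tokens are skipped.
    values = []
    for i in range(1, len(r.response_times)):
        tokens = r.tokens_per_response[i]
        if tokens > 0:
            gap = r.response_times[i] - r.response_times[i - 1]
            values.append(gap / tokens)
    return values


def request_latency(r: RequestRecord) -> float:
    # Time between sending the request and receiving its final response.
    return r.response_times[-1] - r.sent_at


def output_sequence_length(r: RequestRecord) -> int:
    # Total number of output tokens of the request.
    return sum(r.tokens_per_response)


# Made-up example: three responses carrying 1, 1, and 2 tokens.
record = RequestRecord(
    sent_at=0.0,
    response_times=[0.5, 0.75, 1.25],
    tokens_per_response=[1, 1, 2],
)
```

With this record, time to first token covers the interval from send to the first response, time to second token covers only the gap between the first two responses, and inter token latency yields one value for each later response. The aggregations in the table (avg, min, max, percentiles) would then be taken over these per-request or per-response values across the benchmark.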