Add first-token latency

Co-authored-by: Matthew Kotila <[email protected]>
triton-inference-server · Nov 27, 2023 · 7623a36
1 parent 6f4b27e
Showing 1 changed file with 2 additions and 2 deletions.
diff --git a/src/c++/perf_analyzer/README.md b/src/c++/perf_analyzer/README.md
@@ -73,8 +73,8 @@ changes in performance as you experiment with different optimization strategies.
   [TorchServe](docs/benchmarking.md#benchmarking-torchserve) can be used as the
   inference server in addition to the default Triton server
 
-- [LLMs](docs/llm.md) can also be measured and charcterized with specific metrics
-  like token-to-token latency
+- [LLMs](docs/llm.md) can also be measured and characterized with specific metrics
+  like first-token latency and token-to-token latency
 
 <br>