triton-inference-server · dyastremsky · Jul 10, 2024
diff --git a/src/c++/perf_analyzer/docs/cli.md b/src/c++/perf_analyzer/docs/cli.md
@@ -157,6 +157,13 @@ will also be reported in the results.
 Default is `-1` indicating that the average latency is used to determine
 stability.
 
+#### `--request-count=<n>`
+
+Specifies a total number of requests to use for measurement.
+
+Default is `0`, which means that no request count is set and the measurement
+proceeds in windows until stabilization is detected.
+
 #### `-r <n>`
 #### `--max-trials=<n>`
 

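As a rough illustration of the `--request-count` option documented in the hunk above, the following sketch assembles a `perf_analyzer` command line. The helper `build_perf_analyzer_cmd` and the model name `my_model` are hypothetical and not part of the change; `-m` is assumed to be the usual model-name flag. The flag is omitted when the count is `0`, so that the default windowed measurement applies.

```python
import shlex


def build_perf_analyzer_cmd(model_name, request_count=0):
    """Return an argv list for perf_analyzer (hypothetical helper).

    A request_count of 0 mirrors the documented default: no fixed request
    count, so the flag is left out entirely.
    """
    if request_count < 0:
        raise ValueError("request_count must be non-negative")
    cmd = ["perf_analyzer", "-m", model_name]
    if request_count > 0:
        # Flag syntax taken from the documentation added in the diff above.
        cmd.append(f"--request-count={request_count}")
    return cmd


if __name__ == "__main__":
    # "my_model" is a placeholder model name.
    print(shlex.join(build_perf_analyzer_cmd("my_model", request_count=100)))
```

The resulting list can be passed to `subprocess.run` on a machine where `perf_analyzer` is installed and a model is being served.
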
diff --git a/src/c++/perf_analyzer/genai-perf/README.md b/src/c++/perf_analyzer/genai-perf/README.md
@@ -301,8 +301,8 @@ options:
 
 When the dataset is coming from a file, you can specify the following
 options:
-* `--input-file <path>`: The input file containing the single prompt to
-  use for benchmarking.
+* `--input-file <path>`: The input file containing the prompts, as JSON
+  objects, to use for benchmarking.
 
 For any dataset, you can specify the following options:
 * `--output-tokens-mean <int>`: The mean number of tokens in each output. Ensure