
Profiling model using genai-perf #849

Merged: 5 commits merged into use-llm-metrics-in-ma on Mar 27, 2024

Conversation

@nv-braf (Contributor) commented on Mar 26, 2024

Successfully profiled and checkpointed a vLLM run through MA using genai-perf. Here is the resulting table output:

Models (Inference):

| Model | Batch | Concurrency | Model Config Path | Instance Group | Max Batch Size | Satisfies Constraints | Throughput (infer/sec) | p99 Latency (ms) | p99 Inter Token Latency (ms) | p99 Time To First Token (ms) | Output Token Throughput (infer/sec) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| gpt2_vllm | 1 | 1 | gpt2_vllm_config_default | 1:MODEL | 0 | Yes | 8.9 | 117.0 | 11.3 | 47704.6 | 76192.3 |

Models (GPU Metrics):

| Model | GPU UUID | Batch | Concurrency | Model Config Path | Instance Group | Satisfies Constraints | GPU Memory Usage (MB) | GPU Utilization (%) | GPU Power Usage (W) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| gpt2_vllm | GPU-8557549f-9c89-4384-8bd6-1fd823c342e0 | 1 | 1 | gpt2_vllm_config_default | 1:MODEL | Yes | 12768.5 | 20.4 | 65.2 |

Server Only:

| Model | GPU UUID | GPU Memory Usage (MB) | GPU Utilization (%) | GPU Power Usage (W) |
| --- | --- | --- | --- | --- |
| triton-server | GPU-8557549f-9c89-4384-8bd6-1fd823c342e0 | 12768.0 | 0.5 | 42.1 |
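
The LLM-specific columns in the tables above (p99 Time To First Token, p99 Inter Token Latency, Output Token Throughput) are read from genai-perf's CSV output rather than from Perf Analyzer's standard output (see "Successfully reading from LLM CSV" in the commits below). As a rough illustration only, here is a minimal sketch of pulling p99 values out of such a CSV; the file name, column layout, and function name are assumptions for this example and are not the actual metrics_manager implementation:

```python
# Minimal sketch (hypothetical): read p99 LLM metrics from a genai-perf style
# CSV export. The assumed schema (a "Metric" column plus per-statistic columns
# such as "avg" and "p99") is illustrative and may not match the real
# genai-perf output or Model Analyzer's metrics_manager code.
import csv
from typing import Dict


def read_llm_p99_metrics(csv_path: str) -> Dict[str, float]:
    """Return {metric name: p99 value} for every row that has a numeric p99 cell."""
    metrics: Dict[str, float] = {}
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            name = row.get("Metric")
            p99 = row.get("p99")
            if not name or not p99:
                continue
            try:
                metrics[name] = float(p99)
            except ValueError:
                continue  # skip rows whose p99 cell is not numeric
    return metrics


if __name__ == "__main__":
    # Example usage with a hypothetical export file name.
    for name, value in read_llm_p99_metrics("profile_export_genai_perf.csv").items():
        print(f"{name}: {value}")
```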

@nv-braf requested a review from tgerdesnv on March 26, 2024 at 22:18
@tgerdesnv (Collaborator) left a comment

Great job! Awesome to see it able to run and print out results.

Resolved review comments on:
- model_analyzer/perf_analyzer/perf_analyzer.py
- model_analyzer/perf_analyzer/perf_config.py
- model_analyzer/record/metrics_manager.py (two threads)
@nv-braf requested a review from tgerdesnv on March 27, 2024 at 14:37
@nv-braf merged commit 965ad1b into use-llm-metrics-in-ma on Mar 27, 2024
3 checks passed
nv-braf added a commit that referenced this pull request Apr 8, 2024
* Initial changes to run genai-perf in MA

* Gating call to get LLM records

* Fixing captilization issue

* Removing debug

* Adding TODO

---------

Co-authored-by: root <[email protected]>
nv-braf added a commit that referenced this pull request Apr 8, 2024
* New Records for LLM metrics (#839)

* Adding new LLM metrics

* Adding base class for perf, inter_token, and time_to_first latency records

* Add --llm-mode option (#842)

* Adding CLI hook for LLM

* Changing to use --model-type

* Capture LLM metrics from genai-perf in MA (#844)

* Successfully reading from LLM CSV

* General cleanup

* All unit tests passing

* Fixing metric table typos

* Fixing typos

* Update constraints for LLMs (#845)

* Adding LLM values to list of possible constraints

* Fixing typo

* Adding new output fields for LLM (#846)

* Profiling model using genai-perf (#849)

* Initial changes to run genai-perf in MA

* Gating call to get LLM records

* Fixing captilization issue

* Removing debug

* Adding TODO

---------

Co-authored-by: root <[email protected]>

* Add genai_perf CLI options to MA (#854)

* Added support for genai_perf CLI

* Remove dead code

* Removing genai_perf collateral

* Fixing codeQL issue

* Adding streaming to genai_perf_config

---------

Co-authored-by: root <[email protected]>
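
The commit list above introduces both the --model-type option (#842) and a gate around collecting the LLM records (this PR's "Gating call to get LLM records"). Below is a minimal sketch of that gating idea, assuming hypothetical names (collect_metric_records, llm_csv_metrics) that are not taken from the actual Model Analyzer code:

```python
# Hypothetical sketch of the gating idea from the commits above: only runs
# profiled with --model-type LLM pick up the extra genai-perf metrics.
# Function and variable names are illustrative, not Model Analyzer's real API.
from typing import Dict, List


def collect_metric_records(model_type: str, llm_csv_metrics: Dict[str, float]) -> List[dict]:
    """Assemble one measurement's records, adding LLM metrics only when gated in."""
    records: List[dict] = []
    # The usual Perf Analyzer metrics (throughput, p99 latency, ...) would be appended here.
    if model_type == "LLM":
        # Gate: time-to-first-token / inter-token-latency style records exist only for LLM runs.
        records.extend(
            {"metric": name, "value": value} for name, value in llm_csv_metrics.items()
        )
    return records


# Example with values shaped like the genai-perf metrics in the table above.
print(collect_metric_records("LLM", {"Time To First Token (ms) p99": 47704.6}))
```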