Add MVP LLM support to MA #783

Merged: nv-braf merged 12 commits into main on Nov 3, 2023
Conversation

@nv-braf (Contributor) commented on Nov 1, 2023

This PR merges my LLM branch into main. Here is what the output of a vLLM run looks like:

17:54:06 [Model Analyzer] Exporting server only metrics to /swdev/foo/results/metrics-server-only.csv
17:54:06 [Model Analyzer] Exporting inference metrics to /swdev/foo/results/metrics-model-inference.csv
Models (Inference):
Model   Batch   Periodic Concurrency   Request Period   Text Input Length   Max Tokens   Model Config Path   Instance Group   Avg First Token latency (ms)   Avg Token-to-Token latency (ms)  
vllm    1       1:1:1                  32               10                  128          vllm_config_0       1:GPU            23.6                           5.2                              
vllm    1       1:2:1                  32               10                  128          vllm_config_0       1:GPU            19.2                           5.3                              
vllm    1       1:4:1                  32               10                  128          vllm_config_0       1:GPU            20.7                           5.8                              
vllm    1       1:8:1                  32               10                  128          vllm_config_0       1:GPU            19.8                           6.0                              
vllm    1       1:16:1                 32               10                  128          vllm_config_0       1:GPU            22.0                           6.0                              
vllm    1       1:32:1                 32               10                  128          vllm_config_0       1:GPU            18.3                           5.5                              
vllm    1       1:64:1                 32               10                  128          vllm_config_0       1:GPU            20.7                           6.2                              
vllm    1       1:128:1                32               10                  128          vllm_config_0       1:GPU            19.1                           5.7                              
vllm    1       2:2:1                  32               10                  128          vllm_config_0       1:GPU            23.7                           5.8                              
vllm    1       2:4:1                  32               10                  128          vllm_config_0       1:GPU            23.6                           5.1                              
vllm    1       2:8:1                  32               10                  128          vllm_config_0       1:GPU            22.3                           5.8                              
vllm    1       2:16:1                 32               10                  128          vllm_config_0       1:GPU            22.7                           6.2                              
vllm    1       2:32:1                 32               10                  128          vllm_config_0       1:GPU            19.8                           5.6   
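
For context on the two new columns, here is a minimal sketch of how the average first-token latency and average token-to-token latency could be derived from per-request token timestamps. The function name, the input layout, and the aggregation shown are assumptions for illustration, not the actual Model Analyzer implementation.

```python
from statistics import mean

def summarize_llm_latencies(token_timestamps_ms):
    """Hypothetical aggregation of the two new LLM metrics shown above.

    token_timestamps_ms: one list per request, where entry 0 is the send time
    and entries 1..n are the arrival times of each output token (all in ms).
    """
    first_token_latencies = []
    token_to_token_gaps = []
    for stamps in token_timestamps_ms:
        send_time, token_times = stamps[0], stamps[1:]
        if not token_times:
            continue  # request produced no output tokens
        first_token_latencies.append(token_times[0] - send_time)
        # gaps between consecutive output tokens within the same request
        token_to_token_gaps.extend(
            later - earlier for earlier, later in zip(token_times, token_times[1:])
        )
    if not first_token_latencies or not token_to_token_gaps:
        raise ValueError("no token timestamps to aggregate")
    return {
        "avg_first_token_latency_ms": mean(first_token_latencies),
        "avg_token_to_token_latency_ms": mean(token_to_token_gaps),
    }

# Example: two requests, each with a send time followed by three token arrival times
print(summarize_llm_latencies([[0.0, 23.6, 29.0, 34.5], [0.0, 19.2, 24.8, 30.1]]))
```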

nv-braf and others added 12 commits October 3, 2023 08:05
* Update README and versions for 23.09 branch (#761) (#767)

* Adding new options for LLM

* Fixing codeQL issues

* Fixing codeQL issue

---------

Co-authored-by: Misha Chornyi <[email protected]>
* Initial coding complete

* First unit test passing

* Adding test for prompt length

* Refactor PACG methods

* Further refactoring

* Ensure early exit isn't enabled for LLM models

* Fix type checking errors

* Attempt at fixing codeql issue

* Revert "Attempt at fixing codeql issue"

This reverts commit 2619b83.

* Attempt at codeQL fix

* Adding deepcopy back in

* Removing deepcopy in an attempt to fix codeQL errors

* Update model_analyzer/config/input/config_command_profile.py

Co-authored-by: Hyunjae Woo <[email protected]>

* Update model_analyzer/config/generate/perf_analyzer_config_generator.py

Co-authored-by: Hyunjae Woo <[email protected]>

* Update model_analyzer/config/generate/perf_analyzer_config_generator.py

Co-authored-by: Hyunjae Woo <[email protected]>

* Update model_analyzer/config/generate/perf_analyzer_config_generator.py

Co-authored-by: Hyunjae Woo <[email protected]>

* Moving location of method

* Changing parameter to inference load

* Changing parameter to inference load

* Changing prompt length to text input length

* Changing max_tokens to use request-parameter (see the flag sketch after this commit group)

* Fix input-data typo

* Changing non-parameter to parameter

---------

Co-authored-by: Hyunjae Woo <[email protected]>
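
The commits above wire the new LLM options (periodic concurrency, request period, text input length, and max_tokens via a request parameter) through the perf_analyzer config generator. Below is a rough sketch of the kind of perf_analyzer command those options could translate to for one row of the table above; the flag names come from perf_analyzer's LLM profiling options, but the exact mapping and the input file name are assumptions.

```python
# Assumed illustration only: the perf_analyzer arguments that the config
# generator might emit for one measurement (e.g. the 1:16:1 row above).
periodic_concurrency = "1:16:1"   # start:end:step
request_period = 32               # responses collected per concurrency step
max_tokens = 128                  # forwarded as a request parameter

perf_analyzer_args = [
    "perf_analyzer",
    "-m", "vllm",
    "--periodic-concurrency-range", periodic_concurrency,
    "--request-period", str(request_period),
    # per the commits, max_tokens is sent via --request-parameter rather than
    # a dedicated option; the syntax is name:value:type
    "--request-parameter", f"max_tokens:{max_tokens}:int",
    # the text input length option controls the prompts placed in the JSON
    # file handed to --input-data (file name here is made up)
    "--input-data", "input_data.json",
]
print(" ".join(perf_analyzer_args))
```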
* New measurement fields created.

* Fixing omission in llm_metric_table

* Changing name to be avg_token_to_token...
* Added new config options and modified existing options

* Refactoring model parameter setting

* Removing magic numbers
* Initial code for aggregation of new LLM metrics

* New measurement fields created.

* Fixing PA unit tests

* Adding hooks in metrics to capture new LLM fields

* Fixing codeQL errors

* Fixing type checking errors

* Changes needed post-merge from other branches

* Revert naming mistake (due to merge).

* Changes uncovered during live testing

* Fixes based on hwoo review

* Fixing typo

* Change to use lists and mean()

* Changes based on hwoo review
* Created a new class ConfigRangeNumeric and used it for periodic-concurrency (a generic range sketch follows this commit list)

* Fixes and defaults for periodic concurrency

* First unit test passing

* PACG changes complete. Unit tests updated and passing

* Removing unneeded class

* Fixing codeQL and hwoo's review suggestions

* Adding missing else
* Created a new class ConfigRangeNumeric and using it for periodic-concurrency

* Fixes and defaults for periodic concurrency

* First unit test passing

* PACG changes complete. Unit tests updated and passing

* Removing unneeded class

* Changes to fix live run

* Minor refactor and cleanup

* Removing json files

* Changing to use f-string

* More cleanup from hwoo CR

* Removing stale code for request period

* Fix nit
* Changes to get LLM summary reports working

* Addressing hwoo's CR
* Adding illegal LLM checks w/ unit testing + some minor cleanup

* Updated with TMA
* General cleanup

* Add ticket nums to todos
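
Several commits above revolve around representing periodic concurrency as a start:end:step range (the ConfigRangeNumeric class, later removed, and the 1:16:1-style values in the table). The following is a generic sketch of such a range value for illustration only; the class name NumericRange and its API are hypothetical and not the code that landed.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class NumericRange:
    """Hypothetical stand-in for a ConfigRangeNumeric-style option value."""
    start: int
    end: int
    step: int

    @classmethod
    def parse(cls, text: str) -> "NumericRange":
        """Parse a 'start:end:step' string such as '1:16:1'."""
        try:
            start, end, step = (int(part) for part in text.split(":"))
        except ValueError as err:
            raise ValueError(f"expected start:end:step, got {text!r}") from err
        if step <= 0 or end < start:
            raise ValueError(f"invalid range {text!r}")
        return cls(start, end, step)

    def values(self) -> list[int]:
        """Expand the range, inclusive of the end point."""
        return list(range(self.start, self.end + 1, self.step))

# Example: the periodic concurrency value from one row of the run above
print(NumericRange.parse("1:16:1").values())
```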
@debermudez (Contributor) left a comment:
Approving under the assumption that all of the previous work was reviewed in a side branch.

nv-braf merged commit f15427e into main on Nov 3, 2023
3 checks passed
nv-braf added a commit that referenced this pull request on Jan 23, 2024
* Revert "Add MVP LLM support to MA (#783)"

This reverts commit f15427e.

* Fixing merge conflicts