Add support for custom intervals #814
base: main
model_analyzer/config/generate/brute_plus_binary_search_run_config_generator.py
model_analyzer/config/generate/quick_plus_concurrency_sweep_run_config_generator.py
if self._cli_config.is_llm_model():
# The possible inference loads are concurrency, request rate, periodic concurrency, or custom (request-intervals)
# - If custom is specified, it is used
# - For LLM models, periodic concurrency is used
Rework this comment, since the LLM case has gone away.
model_analyzer/config/generate/perf_analyzer_config_generator.py
@@ -153,3 +157,9 @@ def _set_concurrency(self, run_config: RunConfig, concurrency: int) -> RunConfig
perf_config.update_config({"concurrency-range": concurrency})

return run_config
This method is duplicated (also in brute search). Maybe this should be a static method in ModelProfileSpec?
Yes, this is still duplicated. I didn't clean it up yet. Both classes implement ConfigGeneratorInterface. You could create a base class with common code if you want.
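A minimal sketch of what that shared base class could look like. Everything here is hypothetical: `RunConfigStub`, `SharedLoadGeneratorBase`, and the two subclasses are illustrative stand-ins, not the actual Model Analyzer classes, and the real `_set_concurrency` works on `RunConfig` objects rather than a flat dict.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class RunConfigStub:
    """Hypothetical stand-in for RunConfig; holds one flat dict of perf flags."""

    def __init__(self) -> None:
        self.perf_flags: Dict[str, int] = {}


class SharedLoadGeneratorBase(ABC):
    """Hypothetical base class for code common to both generators."""

    def _set_concurrency(
        self, run_config: RunConfigStub, concurrency: int
    ) -> RunConfigStub:
        # Single shared copy of the logic that is currently duplicated.
        run_config.perf_flags["concurrency-range"] = concurrency
        return run_config

    @abstractmethod
    def get_configs(self) -> List[RunConfigStub]:
        """Each generator still decides which configs to produce."""


class BruteSearchSketch(SharedLoadGeneratorBase):
    def get_configs(self) -> List[RunConfigStub]:
        return [self._set_concurrency(RunConfigStub(), c) for c in (1, 2, 4)]


class QuickSweepSketch(SharedLoadGeneratorBase):
    def get_configs(self) -> List[RunConfigStub]:
        return [self._set_concurrency(RunConfigStub(), 8)]
```

A static helper, as suggested in the first comment, would work equally well; the base class only pays off if more code than this one method ends up shared.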
@@ -511,9 +511,12 @@ def _get_next_perf_analyzer_config(

perf_analyzer_config.update_config_from_profile_config(model_name, self._config)

concurrency = self._calculate_concurrency(dimension_values)
perf_config_params = {"batch-size": 1}
I feel like it'd be cleaner if the PerfAnalyzerConfig() constructor initialized batch-size: 1. It seems like we always need to add this in the code.
This is done
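For reference, the idea of a constructor-level default can be sketched roughly as follows. `PerfConfigSketch` is a hypothetical stand-in, not the real PerfAnalyzerConfig, and its storage and accessors are assumptions made only for illustration.

```python
from typing import Any, Dict


class PerfConfigSketch:
    """Hypothetical stand-in for a perf config object with built-in defaults."""

    # Assumed default set; only batch-size comes from the discussion above.
    _DEFAULTS: Dict[str, Any] = {"batch-size": 1}

    def __init__(self) -> None:
        # Copy so that instances never share or mutate the class-level defaults.
        self._args: Dict[str, Any] = dict(self._DEFAULTS)

    def update_config(self, params: Dict[str, Any]) -> None:
        # Caller-supplied values override the defaults.
        self._args.update(params)

    def __getitem__(self, key: str) -> Any:
        return self._args[key]


config = PerfConfigSketch()
config.update_config({"concurrency-range": 4})
```

With the default in the constructor, call sites only pass the flags they actually change, and an explicit `batch-size` from the caller still wins.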
1b24043 to 38abbe8
fae1195 to 6b3a199
"concurrency-range": default_concurrency, | ||
}
default_perf_analyzer_config.update_config(perf_config_params)
if not "request-intervals" in model.perf_analyzer_flags():
Doing a self-review: I'm wondering if this should be if not model.is_load_specified(), just like line 515.
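A rough sketch of the difference between the two checks. The functions below are hypothetical free-standing helpers, and the set of flags that the real `is_load_specified()` inspects is an assumption here; only `request-intervals` and `concurrency-range` appear in this diff.

```python
from typing import Any, Dict

# Assumed set of load-defining flags; the real method may check a different set.
LOAD_FLAGS = ("concurrency-range", "request-rate-range", "request-intervals")


def is_load_specified(perf_analyzer_flags: Dict[str, Any]) -> bool:
    """Return True if the user already chose any inference load option."""
    return any(flag in perf_analyzer_flags for flag in LOAD_FLAGS)


def default_load_params(
    perf_analyzer_flags: Dict[str, Any], default_concurrency: int
) -> Dict[str, Any]:
    """Build default perf params, adding concurrency only when no load is set."""
    params: Dict[str, Any] = {"batch-size": 1}
    if not is_load_specified(perf_analyzer_flags):
        params["concurrency-range"] = default_concurrency
    return params
```

Checking only for `request-intervals` would still inject the default concurrency when the user supplied a request rate, which is the case the broader check is meant to cover.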
Fix support for Perf Analyzer's request-intervals
Fixes #808