Support synthetic image generation in GenAI-Perf #754
src/c++/perf_analyzer/genai-perf/genai_perf/llm_inputs/synthetic_image_generator.py
while True:
    n = int(self.rng.normal(mean, stddev))
    n = int(random.gauss(mean, stddev))
Can we do this using an offset or abs() instead of a loop?
Good point. Changed to using abs.
The `abs` corrupts the Gaussian distribution. The `while` loop truncates it at zero. The same solution is used in SyntheticPromptGenerator:
client/src/c++/perf_analyzer/genai-perf/genai_perf/llm_inputs/synthetic_prompt_generator.py
Lines 120 to 123 in 3e1dbb1
def _sample_random_positive_int(cls, mean: int, stddev: int) -> int:
    random_pos_int = -1
    while random_pos_int <= 0:
        random_pos_int = int(random.gauss(mean, stddev))
But I'm fine with abs+offset.
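To make the two options in this thread concrete, here is a minimal sketch of both approaches side by side. The function names, the explicit `rng` parameter, and the offset of 1 are assumptions for illustration only; they are not taken from the PR.

```python
import random


def sample_truncated(mean: float, stddev: float, rng: random.Random) -> int:
    """Rejection sampling: redraw until the value is positive.

    This truncates the distribution at zero, as the while loop does.
    """
    while True:
        n = int(rng.gauss(mean, stddev))
        if n > 0:
            return n


def sample_folded(mean: float, stddev: float, rng: random.Random) -> int:
    """Folding: reflect negative draws with abs().

    The +1 offset (hypothetical) keeps the result strictly positive,
    at the cost of shifting every sample up by one.
    """
    return abs(int(rng.gauss(mean, stddev))) + 1
```

The truncated version may loop several times when the mean is close to zero relative to the standard deviation, while the folded version always returns after a single draw but reshapes the lower tail.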
I didn't think of that when I suggested it.
We can revert if you prefer.
@mwawrzos I think this is still a valid distribution, just a folded normal instead of a truncated one. It will shift the statistics slightly, and I'm not sure what the consequences of that shift are. But I don't think it should be a huge concern, since sampling image resolutions near zero seems like an unlikely case.
@nv-braf do you have any thoughts?
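One rough way to gauge how often the folded and truncated variants would differ is to compute the probability that a raw draw is non-positive. The sketch below uses the standard normal CDF; the helper name and the example values are hypothetical, not taken from the PR.

```python
import math


def prob_non_positive(mean: float, stddev: float) -> float:
    """P(X <= 0) for a continuous draw X ~ Normal(mean, stddev).

    Simplification: the code under review applies int() first, so the
    region that is actually rejected or folded is slightly larger.
    """
    z = (0.0 - mean) / (stddev * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))
```

For example, with an assumed mean of 100 pixels and a standard deviation of 20, the mean sits five standard deviations above zero and `prob_non_positive(100, 20)` is on the order of 1e-7, so the two approaches would almost never disagree. With a mean equal to the standard deviation, the probability is about 0.16 and the choice matters much more.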
* POC LLaVA VLM support (#720)
* POC for LLaVA support
* non-streaming request in VLM tests
* image component sent in "image_url" field instead of HTML tag
* generate sample image instead of loading from docs
* add vision to endpoint mapping
* fixes for handling OutputFormat
* refactor - extract image preparation to a separate module
* fixes to the refactor
* replace match-case syntax with if-elseif-else
* Update image payload format and fix tests
* Few clean ups and tickets added for follow up tasks
* Fix and add tests for vision format
* Remove output format from profile data parser
* Revert irrelevant code change
* Revert changes
* Remove unused dependency
* Comment test_extra_inputs
---------
Co-authored-by: Hyunjae Woo <[email protected]>
* Support multi-modal input from file for OpenAI Chat Completions (#749)
* add synthetic image generator (#751)
* synthetic image generator
* format randomization
* images should be base64-encoded arbitrarly
* randomized image format
* randomized image shape
* prepare SyntheticImageGenerator to support different image sources
* read from files
* python 3.10 support fixes
* remove unused imports
* skip sampled image sizes with negative values
* formats type fix
* remove unused variable
* synthetic image generator encodes images to base64
* image format not randomized
* sample each dimension independently
Co-authored-by: Hyunjae Woo <[email protected]>
* apply code-review suggestsions
* update class name
* deterministic synthetic image generator
* add typing to SyntheticImageGenerator
* SyntheticImageGenerator doesn't load files
* SyntheticImageGenerator always encodes images to base64
* remove unused imports
* generate gaussian noise instead of blank images
---------
Co-authored-by: Hyunjae Woo <[email protected]>
* Add command line arguments for synthetic image generation (#753)
* Add CLI options for synthetic image generation
* read image format from file when --input-file is used
* move encode_image method to utils
* Lazy import some modules
* Support synthetic image generation in GenAI-Perf (#754)
* support synthetic image generation for VLM model
* add test
* integrate sythetic image generator into LlmInputs
* add source images for synthetic image data
* use abs to get positive int
---------
Co-authored-by: Marek Wawrzos <[email protected]>