Analyze: Print CSV reports #207

Merged
merged 4 commits into analyze_subcommand_phase1 from create_csv_output_for_analyze
Dec 9, 2024

Conversation

nv-braf
Contributor

@nv-braf nv-braf commented Dec 6, 2024

Analyze now writes CSV reports to the artifacts directory, the same as profile does. JSON and console output are skipped.
A new summary report, analyze_export_genai_perf.csv, is created in the current working directory.

Here is an example of its output:

[image: example output of the summary report]

The first table is: config name + stimulus + perf_metrics (p99)
The second table is: config name + gpu_metrics
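
The two-table layout described above can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: the function name `write_summary`, the column names, and the sample values are all hypothetical placeholders, and the blank line between the tables is an assumption about the layout.

```python
import csv
import io
from typing import Dict, List, TextIO

# Hypothetical column names; the real report's headers may differ.
PERF_COLUMNS = ["config_name", "concurrency", "request_latency_p99"]
GPU_COLUMNS = ["config_name", "gpu_power_usage", "gpu_utilization"]


def write_summary(out: TextIO, rows: List[Dict[str, object]]) -> None:
    """Write two tables to one CSV stream, separated by a blank line."""
    writer = csv.writer(out, lineterminator="\n")

    # Table 1: config name + stimulus + perf metrics (p99).
    writer.writerow(PERF_COLUMNS)
    for row in rows:
        writer.writerow([row[col] for col in PERF_COLUMNS])

    # Blank separator line between the tables (assumed layout).
    writer.writerow([])

    # Table 2: config name + GPU metrics.
    writer.writerow(GPU_COLUMNS)
    for row in rows:
        writer.writerow([row[col] for col in GPU_COLUMNS])


# Made-up placeholder values, for illustration only.
rows = [
    {"config_name": "run_config_0", "concurrency": 1,
     "request_latency_p99": 120.5, "gpu_power_usage": 250.0,
     "gpu_utilization": 80.0},
    {"config_name": "run_config_1", "concurrency": 2,
     "request_latency_p99": 180.25, "gpu_power_usage": 260.0,
     "gpu_utilization": 90.0},
]

buf = io.StringIO()
write_summary(buf, rows)
summary_text = buf.getvalue()
print(summary_text)
```

Writing to a file instead of a `StringIO` only requires opening the target path with `newline=""` and passing the file object as `out`.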

@nv-braf nv-braf changed the title Initial changes for printing report in analyze Analyze: Print CSV reports Dec 6, 2024
@nv-braf nv-braf marked this pull request as ready for review December 9, 2024 16:24
Contributor

@debermudez debermudez left a comment

A couple of comments; overall looks good.

@nv-braf nv-braf merged commit 829de72 into analyze_subcommand_phase1 Dec 9, 2024
6 of 7 checks passed
@nv-braf nv-braf deleted the create_csv_output_for_analyze branch December 9, 2024 21:49
nv-braf added a commit that referenced this pull request Dec 9, 2024
* Initial changes for printing report in analyze

* Refactoring

* Fixing codeql issues

* Removing csv/checkpoint
nv-braf added a commit that referenced this pull request Dec 10, 2024
* Initial changes for printing report in analyze

* Refactoring

* Fixing codeql issues

* Removing csv/checkpoint
nv-braf added a commit that referenced this pull request Dec 11, 2024
* Add SearchParameters class (#76)

* Initial code done. Some unit testing in place

* All unit tests passing + pre-commit changes

* Fixing codeQL issue

* Fixing pytest issue

* Adding TypeAlias

* Removing python 3.8

* Changes based on pre-review w/ Elias

* Fixing codeQL issue

* Removing type ignore

* Fixing comment

* Port Records and ModelConfigMeasurement classes (#78)

* Adding Records and MCM. Very basic unit tests passing.

* Fixes + all unit testing completed

* Adding missing record testing + missing record file

* Port Run Config Measurement (#91)

* Initial changes, basic unit tests passing

* Adding support for making the objective a telemetry metric

* Calculation logic + unit testing added

* Constraint logic in place. All unit tests passing

* Fix codeQL issues.

* Removing accidental negation

* Create Optuna Objective Generator Class (#96)

* Created type top-level file

* Added logic and testing for search space methods

* Added logic and unit testing for generating objectives

* Fixing codeQL issues

* Adding early termination logic

* Fixing logger and adding debug methods

* Adding end-to-end generator testing

* Create Sweep Objective Generator (#104)

* Creating sweep based objective generator

* Refactoring and cleaning up type aliases

* Fixing codeQL issues

* Fixing generator count test

* Changing get_list to assert versus return an empty list

* Create PA Config Class (#110)

* Initialization of class complete

* Refactoring set options method

* Added CLI string method

* Adding representation method

* Fixing codeQL issues

* Changing asserts to use ValueError

* Removing comment

* Fixes based on CR

* Removing try-except

* Add Analyze to Search Parameters (#117)

* Differentiating btw PA and GAP runtime parameters

* Adding GAP options to config command

* Adding GAP option to optimize

* Adding logic for analyze to search parameters

* Fixing codeQL issues

* Creating enum for subcommand

* Create GenAI-Perf Config Class (#119)

* Adding GAP option to optimize

* Fixing codeQL issues

* Adding config for genai-perf

* Fixing codeQL issues

* Create RunConfig class (#123)

* Adding GAP option to optimize

* Fixing codeQL issues

* Adding RunConfig class along w/ missing checkpoint support to config classes

* Create Results class (#132)

* Adding GAP option to optimize

* Fixing codeQL issues

* Results class initial coding w/ testing

* Minor refactor

* Fixing issue in RCM testing

* Fixing codeQL issues

* Create checkpoint class (#134)

* initial changes

* Adding GAP option to optimize

* Fixing codeQL issues

* Results class initial coding w/ testing

* Fixing issue in RCM testing

* Fixing codeQL issues

* Checkpoint class creation

* Fixing codeQL issues

* fixing codeql issue

* Removing checkpoint file

* Removing checkpoint file

* Fixing json to properly format checkpoint file

* Minor typing cleanup

* Adding records for ISL/OSL and testing this in checkpoint creation

* Changing method name

* Changing read/write checkpoint method names

* Turn statistics into GAP Records (#166)

* Changing record names to match GAP and adding some missing type checking

* Fixing other unit tests

* Updating time to first token records

* Updating inter token latency records

* Updating output token throughput record

* Adding output token throughput per request records

* Adding output sequence length (OSL) records

* Adding Input sequence length (ISL) records

* Removing non-GAP records

* Adding telemetry records

* Fixing unit testing

* Adding request goodput record

* Adding method to create records from statistics

* Added very basic unit testing

* Remove demo file (accidental commit)

* Fix codeql error

* Fixing merge issue

* Fixes/Changes needed during testing Analyze subcommand (#177)

* Fixes found during borecleaning

* Fixing codeql issues

* Add support for the current CLI to PA Config Generator (#182)

* Added support for the CLI to PA config generator

* Fixing codeQL issues

* Removing redundant extra_args check

* Add Analyze Subcommand (#186)

* Adding CLI options for analyze along with the subcommand. Updates to underlying classes to support using the CLI.

* Fixing codeQL issues

* Actually raise the exception

* Update help comment

* Refactoring subcommands

* Fixing codeql issues and other small changes from PR

* Refactoring run method to be common btw profile and analyze subcommands

* Fixing codeql issues

* Fixing codeql

* Adds support for skipping profiling if the Result is found in the checkpoint (#191)

* Add support for skipping profiling if the results are found in the checkpoint

* Fix codeql issue

* Removing mutable default in RCM

* Changes based on PR

* fixing codeql issue

* Capture Telemetry Records in checkpoint (#197)

* Updated GPU Records to match what TelemetryRecords is doing. Added method to convert TelemetryDict into Records

* Fix codeQL issues

* Fixing unit test failures

* Fixing file names to match tags

* Fixing merge conflict

* Analyze: Print CSV reports (#207)

* Initial changes for printing report in analyze

* Refactoring

* Fixing codeql issues

* Removing csv/checkpoint

* Fixing merge conflicts. Updating num_prompts -> num_dataset_entries

* Fixing codeql and mutable default issues.

* Fixing remaining codeql issue

* Adding new GAP options to ignore list in PA config generator