Port Records and ModelConfigMeasurement classes #78
Merged
nv-braf force-pushed the port_records_and_mcm branch from 99b80fb to b0a2244 on September 6, 2024 23:12
debermudez reviewed on Sep 11, 2024
debermudez approved these changes on Sep 16, 2024
nv-braf added a commit that referenced this pull request on Sep 23, 2024
* Adding Records and MCM. Very basic unit tests passing.
* Fixes + all unit testing completed
* Adding missing record testing + missing record file
nv-braf added further commits with the same message that referenced this pull request on Oct 1, Oct 31, Nov 7, Nov 18, Nov 25, Dec 9, and Dec 10, 2024
pvijayakrish pushed a commit that referenced this pull request on Oct 8, 2024
nv-braf added a commit that referenced this pull request on Dec 11, 2024
* Add SearchParameters class (#76)
  * Initial code done. Some unit testing in place
  * All unit tests passing + pre-commit changes
  * Fixing codeQL issue
  * Fixing pytest issue
  * Adding TypeAlias
  * Removing python 3.8
  * Changes based on pre-review w/ Elias
  * Fixing codeQL issue
  * Removing type ignore
  * Fixing comment
* Port Records and ModelConfigMeasurement classes (#78)
  * Adding Records and MCM. Very basic unit tests passing.
  * Fixes + all unit testing completed
  * Adding missing record testing + missing record file
* Port Run Config Measurement (#91)
  * Initial changes, basic unit tests passing
  * Adding support for making the objective a telemetry metric
  * Calculation logic + unit testing added
  * Constraint logic in place. All unit tests passing
  * Fix codeQL issues.
  * Removing accidental negation
* Create Optuna Objective Generator Class (#96)
  * Created type top-level file
  * Added logic and testing for search space methods
  * Added logic and unit testing for generating objectives
  * Fixing codeQL issues
  * Adding early termination logic
  * Fixing logger and adding debug methods
  * Adding end-to-end generator testing
* Create Sweep Objective Generator (#104)
  * Creating sweep based objective generator
  * Refactoring and cleaning up type aliases
  * Fixing codeQL issues
  * Fixing generator count test
  * Changing get_list to assert versus return an empty list
* Create PA Config Class (#110)
  * Initialization of class complete
  * Refactoring set options method
  * Added CLI string method
  * Adding representation method
  * Fixing codeQL issues
  * Changing asserts to use ValueError
  * Removing comment
  * Fixes based on CR
  * Removing try-except
* Add Analyze to Search Parameters (#117)
  * Differentiating btw PA and GAP runtime parameters
  * Adding GAP options to config command
  * Adding GAP option to optimize
  * Adding logic for analyze to search parameters
  * Fixing codeQL issues
  * Creating enum for subcommand
* Create GenAI-Perf Config Class (#119)
  * Adding GAP option to optimize
  * Fixing codeQL issues
  * Adding config for genai-perf
  * Fixing codeQL issues
* Create RunConfig class (#123)
  * Adding GAP option to optimize
  * Fixing codeQL issues
  * Adding RunConfig class along w/ missing checkpoint support to config classes
* Create Results class (#132)
  * Adding GAP option to optimize
  * Fixing codeQL issues
  * Results class initial coding w/ testing
  * Minor refactor
  * Fixing issue in RCM testing
  * Fixing codeQL issues
* Create checkpoint class (#134)
  * initial changes
  * Adding GAP option to optimize
  * Fixing codeQL issues
  * Results class initial coding w/ testing
  * Fixing issue in RCM testing
  * Fixing codeQL issues
  * Checkpoint class creation
  * Fixing codeQL issues
  * fixing codeql issue
  * Removing checkpoint file
  * Removing checkpoint file
  * Fixing json to properly format checkpoint file
  * Minor typing cleanup
  * Adding records for ISL/OSL and testing this in checkpoint creation
  * Changing method name
  * Changing read/write checkpoint method names
* Turn statistics into GAP Records (#166)
  * Changing record names to match GAP and adding some missing type checking
  * Fixing other unit tests
  * Updating time to first token records
  * Updating inter token latency records
  * Updating output token throughput record
  * Adding output token throughput per request records
  * Adding output sequence length (OSL) records
  * Adding Input sequence length (ISL) records
  * Removing non-GAP records
  * Adding telemetry records
  * Fixing unit testing
  * Adding request goodput record
  * Adding method to create records from statistics
  * Added very basic unit testing
  * Remove demo file (accidental commit)
  * Fix codeql error
  * Fixing merge issue
* Fixes/Changes needed during testing Analyze subcommand (#177)
  * Fixes found during borecleaning
  * Fixing codeql issues
* Add support for the current CLI to PA Config Generator (#182)
  * Added support for the CLI to PA config generator
  * Fixing codeQL issues
  * Removing redundant extra_args check
* Add Analyze Subcommand (#186)
  * Adding CLI options for analyze along with the subcommand. Updates to underlying classes to support using the CLI.
  * Fixing codeQL issues
  * Actually raise the exception
  * Update help comment
  * Refactoring subcommands
  * Fixing codeql issues and other small changes from PR
  * Refactoring run method to be common btw profile and analyze subcommands
  * Fixing codeql issues
  * Fixing codeql
* Adds support for skipping profiling if the Result is found in the checkpoint (#191)
  * Add support for skipping profiling if the results are found in the checkpoint
  * Fix codeql issue
  * Removing mutable default in RCM
  * Changes based on PR
  * fixing codeql issue
* Capture Telemetry Records in checkpoint (#197)
  * Updated GPU Records to match what TelemetryRecords is doing. Added method to convert TelemetryDict into Records
  * Fix codeQL issues
  * Fixing unit test failures
  * Fixing file names to match tags
  * Fixing merge conflict
* Analyze: Print CSV reports (#207)
  * Initial changes for printing report in analyze
  * Refactoring
  * Fixing codeql issues
  * Removing csv/checkpoint
  * Fixing merge conflicts. Updating num_prompts -> num_dataset_entries
  * Fixing codeql and mutable default issues.
  * Fixing remaining codeql issue
  * Adding new GAP options to ignore list in PA config generator
Ports the Record and ModelConfigMeasurement classes into GAP.
The Record class stores an individual metric (like latency or throughput).
The ModelConfigMeasurement (MCM) class stores all the perf metrics for a model configuration and has methods to compare configurations, change objectives, and checkpoint.
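To make the relationship between the two classes concrete, here is a minimal sketch of how a record and a measurement built from records could fit together. It is not the code from this PR: the field names, `higher_is_better`, `set_objectives`, `is_better_than`, `to_checkpoint`, `from_checkpoint`, and the weighted-relative-gain comparison are all assumptions made for illustration.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict


@dataclass(frozen=True)
class Record:
    """One metric value (hypothetical shape, not GAP's actual Record)."""
    tag: str
    value: float
    higher_is_better: bool = True


@dataclass
class ModelConfigMeasurement:
    """All perf metrics for one model configuration (hypothetical shape)."""
    records: Dict[str, Record]
    # Objective weights: metric tag -> relative importance (assumed scheme).
    objectives: Dict[str, float] = field(default_factory=dict)

    def set_objectives(self, objectives: Dict[str, float]) -> None:
        """Change which metrics drive comparisons."""
        self.objectives = dict(objectives)

    def score_against(self, other: "ModelConfigMeasurement") -> float:
        """Weighted sum of relative gains over `other`; positive favors self."""
        total = 0.0
        for tag, weight in self.objectives.items():
            mine, theirs = self.records[tag], other.records[tag]
            gain = (mine.value - theirs.value) / theirs.value
            if not mine.higher_is_better:
                gain = -gain  # a lower latency counts as a gain
            total += weight * gain
        return total

    def is_better_than(self, other: "ModelConfigMeasurement") -> bool:
        return self.score_against(other) > 0

    def to_checkpoint(self) -> str:
        """Serialize to JSON so a later run can reload the measurement."""
        return json.dumps({
            "objectives": self.objectives,
            "records": {tag: asdict(r) for tag, r in self.records.items()},
        })

    @classmethod
    def from_checkpoint(cls, text: str) -> "ModelConfigMeasurement":
        data = json.loads(text)
        records = {tag: Record(**f) for tag, f in data["records"].items()}
        return cls(records=records, objectives=data["objectives"])


# Two configurations: `a` has higher throughput, `b` has lower latency.
a = ModelConfigMeasurement({
    "throughput": Record("throughput", 1000.0),
    "latency": Record("latency", 40.0, higher_is_better=False),
})
b = ModelConfigMeasurement({
    "throughput": Record("throughput", 900.0),
    "latency": Record("latency", 30.0, higher_is_better=False),
})

a.set_objectives({"throughput": 1.0})
throughput_verdict = a.is_better_than(b)

a.set_objectives({"latency": 1.0})
latency_verdict = a.is_better_than(b)

restored = ModelConfigMeasurement.from_checkpoint(a.to_checkpoint())
```

In this sketch the same pair of measurements ranks differently depending on the objective, which is why changing objectives sits on the measurement rather than on individual records. The checkpoint round trip restores both the records and the current objectives.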