Repo Refactor #622
Draft
lockshaw wants to merge 686 commits into master from repo-refactor
@reyna-abhyankar Does this fix #455?
@reyna-abhyankar @williamberman Approximate PR changelist added!
lockshaw
commented
May 14, 2023
lib/runtime/src/ops/dropout.cc
Outdated
void Dropout::forward_task(Task const *task,
static void forward_task(Task const *task,
See the new `task` vs `task_impl` change
lockshaw
commented
May 14, 2023
This was referenced Jun 16, 2023
This was referenced Jun 24, 2023
* fix build errors * delete PerDeviceOpState
* fix build errors * delete PerDeviceOpState * delete includes
* Graph documentation * Updated graph docs * Updates to generate_diagram.py * Minor graph documentation fix * minor fix * add hpp2plantuml * rm mispelled nix file --------- Co-authored-by: Pietro Max Marsella <[email protected]> Co-authored-by: Colin Unger <[email protected]> Co-authored-by: Qinghan Chen <[email protected]>
* Add allocators * Computation Graph and Builder * Shift ops and remove legion names * Format * Format * Fix tracked allocator * Fix comp graph * Add task spec * Minor build issues * Build op task spec * Build ops and op task spec * Simplify edge set obtain * Format * Fixes * Fix conflicts, some renaming * Fix gather kernels * Finish gather operator * Format * Fix substitutions * Fix legion dim in gather * Format string fixes * Fix include * Gather backward time * Format --------- Co-authored-by: Colin Unger <[email protected]>
… kernels (#1369) * hip -e * format fix * small fix to binary * fixed problems * fix alpha --------- Co-authored-by: Reyna Abhyankar <[email protected]>
* Add initial lib/substitution-generator and bin/substitutions-to-dot * Format * Update proj version and add .proj.toml file to repo directly * Revert changes to flake.nix * Prototype implementation of dtgen * Refactor op-attrs to use dtgen * Format * More dtgen'ing * Re-pass tests * Simplify types in substitutions, more dtgen * Add new reduction dim shape inference for conv2d * Move conv2d input parsing into public headers * Remove incorrect not_implemented * Add pcg tests * Add initial test for pcg * Partial implementation of shape inference for linear * Fix rapidcheck (#8) * enable rapidchecks for op-attrs * added rc::checks * fix merge * fixed variant toml * revert proj.toml * removed additional import merged * constraint for ff_dim * lock flake --------- Co-authored-by: Rae Wong <[email protected]> * Attempt to hide dtgen-generated files from github diff * Fix header file name for dtgen in gitattributes * Update proj and format code * Add initial shape inference for BMM * Add half of shape inference for Attention * Finish initial shape inference for Attention * Enable op-attrs and pcg tests in CI * Add parallel shape inference for add and relu * Add parallel shape inference for embedding * Add shape inference for repartition, combine, replicate, and reduction * Include tests for reduction * Address wmdi comments * Fixup linear shape inference, add tests for linear * Fix tests * Format * Fix build errors * change lcov in ci to rm dtgen coverage * Remove dtgen from coverage * Temporarily disable substitutions build in CI * Format * Fix substitution-generator build and tests * Format * fix ci coverage attempt * second attempt * small fix * small fix --------- Co-authored-by: Rae Wong <[email protected]> Co-authored-by: Rae Wong <[email protected]> Co-authored-by: Qinghan Chen <[email protected]>
* add codecov change * comment it out to see whether it is true * add pycoverage file back * disable pycoverage and verbose * try rm folder codecov yml * add weird flag and recover yml * add ignore path * final change * ignore path * delete lcov * add ignore under codecov * not ignore deps * clean comment
* refactor for softmax, split, topk, transpose * fix --------- Co-authored-by: Reyna Abhyankar <[email protected]>
* repo refactor * recovered conv_2d and replaced cudaStream_t with hipStream_t * fix dim3 * changed path * fix
…1403) * refactor for reduce, reduction, replicate, reshape and reverse * fix * fix miopenOpTensor --------- Co-authored-by: Reyna Abhyankar <[email protected]>
* update hip * small fix * fix * format fix --------- Co-authored-by: Reyna Abhyankar <[email protected]>
* Add allocators * Computation Graph and Builder * Shift ops and remove legion names * Format * Format * Fix tracked allocator * Fix comp graph * Add task spec * Minor build issues * Build op task spec * Build ops and op task spec * Simplify edge set obtain * Format * Fixes * Fix conflicts, some renaming * Fix gather kernels * Finish gather operator * Format * Fix substitutions * Add local task arg accessor * Add local training backing and task registry * Fix legion dim in gather * Format string fixes * Fix include * Gather backward time * Format * Update per-lib-check.yml * Fixes * Fixes * Fix remaining issues * Format * Remove constructor PTS * Format * Fix device states and build local execution * Change signature * Format * Fixes * Change type to label * Fix template issue, example on ElementBinary * Task Impl and Signature * Remove deadcode * Split get task ids * Build merged * Revert back element unary * Fix * Delete param_sync.h --------- Co-authored-by: Colin Unger <[email protected]>
* Add compiler to cmakelists * Local Cost Estimator * Format + temp add ff handle fn * Cost estimator fixes * Remove ref * Fix gradient tensor allocation * Remove comment * Latest fixes * Add memory cost * Fixes * Fix get piece shape * Add eq, ord to costdetails * Fix build
* pr for debugging kernel driver issues * Commit flake files * current kernel tests * softmax, flat, transpose kernel tests * clang formatting kernel tests * reverse, split, full dropout kernels * rest of kernel-tests * minor cleannup * Restore .proj.toml * Delete misadded directory * merge fix * more merge fixes * resolved merge conflicts with repo-refactor * code review changes * allocator updates * allocation util updates * test clean up and review fixes * fixed forward backward pass consistencies, added filler tests for all tests, other review changes * unnested test subcases and more review changes * added managed_stream and handle classes, other minor clean up * fix accessor and corresponding shape clarity, other clean up * merge error fixes * managed handle and stream fixes, removed datatype dispatch from cuda_helper, other clean up * managed handle and stream updates * fixed deallocator --------- Co-authored-by: Reyna Abhyankar <[email protected]>
…1429) * initial commit for machine view adjacent modules * Formatting * Tests for new machine_view.cc functions * formatting * Minor Test correction * formatting * PR fixes * PR Fixes --------- Co-authored-by: Pietro Max Marsella <[email protected]>
* Start on pcg builder * Add tests and some implementation for pcg builder * Add pcg tests, make dtgen constructors explicit to fix bug * Add remainder of PCG tests * Fix build issues in local-execution * Format * Address Reyna comments, add topological_order function for PCG * Pre multidigraph refactor * Removing visitable from sp code * Add open dataflow graph, start to replace pcg dataflow graph * Start refactoring substitutions * Add utility functions to support pattern matching * Pre-refactor inputs * Fix proj url * Get back to substitutions, now with unordered graph inputs * Get substitutions building * substitutions-tests now builds * Fix bug in filter, pass some initial substitution tests * Add tests for fmt::to_string, fix some substitutions bugs * Pass initial unit tests for find_pattern_matches * Start on unit tests for pcg pattern * Pass initial test for find_pattern_matches * Fix small build issue in tests * Format * Sync tests in CI with tests in proj * Fix minor build errors in kernels and local-execution * Format * Remove outdated code * More outdated code removal * More cleanup, add test for sp decomposition * Pull apart containers.h * More sp testing and fixes * Break up graph algorithms.h * Pre- full SP algo commit * Add initial implementation and tests for cbc decomposition and inverse line graph * Pass test for get_inverse_line_graph * Add new multidigraph * Fix get_inverse_line_graph to return a MultiDiGraph instead of a DiGraph * Add tests for parallel and series reduction finding * Add really rough implementation of valdez sp decomposition * Fix local-execution build * Add implementations and tests for applying series/parallel reductions * Format * Clean up sp decomposition interface and tests * Format * Add comments for top-level substitutions functions, add proj doxygen support * Fix proj invocation in CI
* Start on pcg builder * Add tests and some implementation for pcg builder * Add pcg tests, make dtgen constructors explicit to fix bug * Add remainder of PCG tests * Fix build issues in local-execution * Format * Address Reyna comments, add topological_order function for PCG * Pre multidigraph refactor * Removing visitable from sp code * Add open dataflow graph, start to replace pcg dataflow graph * Start refactoring substitutions * Add utility functions to support pattern matching * Pre-refactor inputs * Fix proj url * Get back to substitutions, now with unordered graph inputs * Get substitutions building * substitutions-tests now builds * Fix bug in filter, pass some initial substitution tests * Add tests for fmt::to_string, fix some substitutions bugs * Pass initial unit tests for find_pattern_matches * Start on unit tests for pcg pattern * Pass initial test for find_pattern_matches * Fix small build issue in tests * Format * Sync tests in CI with tests in proj * Fix minor build errors in kernels and local-execution * Format * Remove outdated code * More outdated code removal * More cleanup, add test for sp decomposition * Pull apart containers.h * More sp testing and fixes * Break up graph algorithms.h * Pre- full SP algo commit * Add initial implementation and tests for cbc decomposition and inverse line graph * Pass test for get_inverse_line_graph * Add new multidigraph * Fix get_inverse_line_graph to return a MultiDiGraph instead of a DiGraph * Add tests for parallel and series reduction finding * Add really rough implementation of valdez sp decomposition * Fix local-execution build * Add implementations and tests for applying series/parallel reductions * Format * Clean up sp decomposition interface and tests * Format * integrating compiler with dataflow-graph * Add comments for top-level substitutions functions, add proj doxygen support * Fix proj invocation in CI * minor changes --------- Co-authored-by: Pietro Max Marsella <[email protected]>
Co-authored-by: Mengdi Wu <[email protected]>
* pr for debugging kernel driver issues * Commit flake files * current kernel tests * softmax, flat, transpose kernel tests * clang formatting kernel tests * reverse, split, full dropout kernels * rest of kernel-tests * minor cleannup * Restore .proj.toml * Delete misadded directory * merge fix * more merge fixes * resolved merge conflicts with repo-refactor * code review changes * allocator updates * allocation util updates * test clean up and review fixes * fixed forward backward pass consistencies, added filler tests for all tests, other review changes * unnested test subcases and more review changes * Add == in OpTaskBinding * Add single operator test example * Finish multi operator test * added managed_stream and handle classes, other minor clean up * fix accessor and corresponding shape clarity, other clean up * merge error fixes * More aggressive subcasing * Remove comment * managed handle and stream fixes, removed datatype dispatch from cuda_helper, other clean up * managed handle and stream updates * Refactoring and split tests * Fix build * Fix build * Add cuda test suite * Remove mock * Pass task registry * Pass slots backing and task arg acc * Pass cost estimator test * Fix * PR fixes * Fixes * Add test to ci * Fix test libs * Fix build, add more fmt placeholders * Fixes * Fixes * Delete file * Fixes * Fixes * Fixes * Fix includes * Fix includes --------- Co-authored-by: Dylan Lim <[email protected]> Co-authored-by: Dylan Lim <[email protected]> Co-authored-by: Colin Unger <[email protected]>
* Add transformer pcg prototype * Update based on review comments * Add models makefile and test * Update * Pass test * Enhance Transformer implementation * Reflect review comments * [WIP] Save initial refactor * Implement actual encorder decoder architecture * Remove duplicated definition * Update based on review * Update argument order * Implement most of the shape inference and ComputationGraphBuilder support * Fix bug in LayerNorm shape inference tests, disable {?} doctest default * Fix transformer test * Format * Few small fixes in respose to PR comments * Add asserts for transformer layer shapes * Make config default * Small style fixes and some additional docs --------- Co-authored-by: Colin Unger <[email protected]> Co-authored-by: Colin Unger <[email protected]> Co-authored-by: Reyna Abhyankar <[email protected]>
* Start on pcg builder * Add tests and some implementation for pcg builder * Add pcg tests, make dtgen constructors explicit to fix bug * Add remainder of PCG tests * Fix build issues in local-execution * Format * Address Reyna comments, add topological_order function for PCG * Pre multidigraph refactor * Removing visitable from sp code * Add open dataflow graph, start to replace pcg dataflow graph * Start refactoring substitutions * Add utility functions to support pattern matching * Pre-refactor inputs * Fix proj url * Get back to substitutions, now with unordered graph inputs * Get substitutions building * substitutions-tests now builds * Fix bug in filter, pass some initial substitution tests * Add tests for fmt::to_string, fix some substitutions bugs * Pass initial unit tests for find_pattern_matches * Start on unit tests for pcg pattern * Pass initial test for find_pattern_matches * Fix small build issue in tests * Format * Sync tests in CI with tests in proj * Fix minor build errors in kernels and local-execution * Format * Remove outdated code * More outdated code removal * More cleanup, add test for sp decomposition * Pull apart containers.h * More sp testing and fixes * Break up graph algorithms.h * Pre- full SP algo commit * Add initial implementation and tests for cbc decomposition and inverse line graph * Pass test for get_inverse_line_graph * Add new multidigraph * Fix get_inverse_line_graph to return a MultiDiGraph instead of a DiGraph * Add tests for parallel and series reduction finding * Add really rough implementation of valdez sp decomposition * Fix local-execution build * Add implementations and tests for applying series/parallel reductions * Format * Clean up sp decomposition interface and tests * Format * Add comments for top-level substitutions functions, add proj doxygen support * Start sketching out substitutions code * Fix build errors * Add ability to permute node ids * Cleanup and start to test new substitutions code * Add test case for 
evaluate_substitution_output * Add naive isomorphism detection code * Add graph inputs to open dataflow graph isomorphism * Add input permutation to evaluate_substitution_output * Fix permute_node_ids * Add test for permute_input_ids * Migrate over to mutable implementation of apply_substitution * Add fast isomorphism checking and an initial implementation of full substitution logic * Pass initial full substitutions test * Cleanup old isomorphism checking code * Fix post-merge bugs * Fix broken pcg builder test * Format * Reorganize code and remove some outdated code pre-code-review * Format * Address review comments * Address missed comment * Remove latex dependency to avoid CI out-of-disk-space * Format * Fix build issues * Fix incorrect test case
…ompositions (#1490) * Start on pcg builder * Add tests and some implementation for pcg builder * Add pcg tests, make dtgen constructors explicit to fix bug * Add remainder of PCG tests * Fix build issues in local-execution * Format * Address Reyna comments, add topological_order function for PCG * Pre multidigraph refactor * Removing visitable from sp code * Add open dataflow graph, start to replace pcg dataflow graph * Start refactoring substitutions * Add utility functions to support pattern matching * Pre-refactor inputs * Fix proj url * Get back to substitutions, now with unordered graph inputs * Get substitutions building * substitutions-tests now builds * Fix bug in filter, pass some initial substitution tests * Add tests for fmt::to_string, fix some substitutions bugs * Pass initial unit tests for find_pattern_matches * Start on unit tests for pcg pattern * Pass initial test for find_pattern_matches * Fix small build issue in tests * Format * Sync tests in CI with tests in proj * Fix minor build errors in kernels and local-execution * Format * Remove outdated code * More outdated code removal * More cleanup, add test for sp decomposition * Pull apart containers.h * More sp testing and fixes * Break up graph algorithms.h * Pre- full SP algo commit * Add initial implementation and tests for cbc decomposition and inverse line graph * Pass test for get_inverse_line_graph * Add new multidigraph * Fix get_inverse_line_graph to return a MultiDiGraph instead of a DiGraph * Add tests for parallel and series reduction finding * Add really rough implementation of valdez sp decomposition * Fix local-execution build * Add implementations and tests for applying series/parallel reductions * Add transformer pcg prototype * Format * Clean up sp decomposition interface and tests * Format * Add comments for top-level substitutions functions, add proj doxygen support * Start sketching out substitutions code * Fix build errors * Add ability to permute node ids * Cleanup and 
start to test new substitutions code * Add test case for evaluate_substitution_output * Add naive isomorphism detection code * Add graph inputs to open dataflow graph isomorphism * Add input permutation to evaluate_substitution_output * Update based on review comments * Fix permute_node_ids * Add test for permute_input_ids * Add models makefile and test * Update * Pass test * Enhance Transformer implementation * Reflect review comments * Migrate over to mutable implementation of apply_substitution * Add fast isomorphism checking and an initial implementation of full substitution logic * Pass initial full substitutions test * Cleanup old isomorphism checking code * Fix post-merge bugs * [WIP] Save initial refactor * Fix broken pcg builder test * Format * Reorganize code and remove some outdated code pre-code-review * Format * Implement actual encorder decoder architecture * Remove duplicated definition * Update based on review * Update argument order * Address review comments * Address missed comment * Remove latex dependency to avoid CI out-of-disk-space * Format * Implement most of the shape inference and ComputationGraphBuilder support * Fix bug in LayerNorm shape inference tests, disable {?} doctest default * Fix transformer test * Format * Get initial export-model-arch binary building * Actually dump valid json from export-model-arch * Some minor polishing of export-model-arch * Add binary sp tree logic and start on sp decomposition of computation graphs * Fix sp decomposition export of transformer (a lot of cleanup now needed) * Flesh out export-model-arch CLI and features * Format * Add split_test model * Add single_operator model for testing * Add substitution-to-dot and export-model-arch build to CI * Cleanup generic_binary_sp_decomposition_tree * Format * Fix substitution-to-dot name in CI * Add testing for cli_get_help_message * Add testing for cli_parse * Add missing include in export_model_arch * Add basic test for cli_parse on raw argv * Rename 
serial-parallel -> series-parallel * Add a bunch of testing for new code * Add tests for computation graph sp decomposition * Format * Fix build error in export-model-arch --------- Co-authored-by: hsdfzhsdfz <[email protected]>
* Add interface for differentiating inputs and weights in CG/PCG * Format * Address Reyna PR comments * fix bugs from merge * Format
* inception v3 initial implementation * Add parallel shape inference for concat and pool2d * Format * Respond to PR comments * Fix model bugs * Update batch norm to match pytorch interface for inception v3 * Finishing touches for inception, re-add relu flag for batchnorm * Format * Document adaptive pool2d formula simplification --------- Co-authored-by: Pietro Max Marsella <[email protected]> Co-authored-by: Colin Unger <[email protected]>
* Add initial Candle Uno model PCG * Adapt candle uno model to existing code style * Change to use dtgen * Add shape inference and ComputationGraphBuilder support for Concat * Update based on review * Update following review * Add glorot normal initializer * Update * Update * Fix bugs created by merge, add candle_uno to export-model-arch * Format --------- Co-authored-by: Colin Unger <[email protected]>
* Add initial bert model structure * Update following review * Rename config * Add additional bert configs * Update based on reviewing * Added assert checks * Add error message for unsupported BertConfig.position_embedding_type * Format * fix typo * Add bert to export-model-arch * Format --------- Co-authored-by: Colin Unger <[email protected]>
* pass existing tests * unity algorithm builds * fmt * fix * refactor machine mapping * add unit tests * fmt * add more tests * fmt * fix * refactor get_optimal_machine_mapping a bit and improve the tests * remove debug codes * A lot of simplifying and modularizing of unity dp code * Get tests building again * Get all the new testcases working * Move over to ProblemTree/ResultTree framework for machine mapping * Settle on ProblemTree/BinaryTreePath-indexed-MachineMappingResult for machine mapping * More code cleanup and PR prep * Get tests building again * Pass some basic tests of get_optimal_machine_mapping * Migrate over to use type-erased binary tree * Move back to templated FullBinaryTree * Get all existing tests passing again * Fix tests and format * Move graph_optimize_state.cc to correct location --------- Co-authored-by: Mengdi Wu <[email protected]> Co-authored-by: Colin Unger <[email protected]> Co-authored-by: Colin Unger <[email protected]>
* containers helper functions * Additional support for unordered_multiset * format fix * Unordered Machine Mapping and adjacent changes * formatting * Minor fixes * Update to StridedRectangle interface * Minor updates * added get_allowed_machine_views * formatting * minor fix * Added StartInvariantMachineView * formatting * Containers fix * Implemented tensor to machine view injection * small refactor * formatting * Cleaning Up * Formatting fix * new machine-view interface * update to allowed machine views * PR review fixes * update to machine view and getting allowed machine view to match new interface * formatting * minor fix * PR fixes * PR fixes * machineview interface change * Minor PR fixes * .cc machine view fixes + added StartInvariantMachineView * minor PR fixes * minor fixes * Post-merge fixes * Format --------- Co-authored-by: Pietro Max Marsella <[email protected]> Co-authored-by: Colin Unger <[email protected]> Co-authored-by: Colin Unger <[email protected]>
* removed GraphInternal * minor changes * Added views.cc testing * Fixed containers testing * views.cc fix * test reorganizing * fmt * test rearranging * rearranging * graph fixes * undirected graph test fix * Updated docs * Test updates + bug fixes * minor fix * PR fixes * moved graph-testing to other PR * minor fixes * Remove unnecessary includes * Post-merge fixes * Small bugfix and remove some unnecessary includes --------- Co-authored-by: Pietro Max Marsella <[email protected]> Co-authored-by: Colin Unger <[email protected]> Co-authored-by: Colin Unger <[email protected]>
Description of changes:
- […] `compiler`, `runtime`, `op-attrs`, `kernels`, `pcg`, `utils`, and `substitutions` libraries. Only `runtime` has an explicit legion dependency
- […] (`ModelSpec`) from PCG management (`FFModel`)
- […] `Tensor`, `ParallelTensor`, `Layer`, etc. construction into factory classes (`TensorManager`, `ParallelTensorManager`, `LayerManager`)
- […] `utils`'s labelled graphs
- […] `task_spec` interface for legion tasks (see https://github.com/flexflow/FlexFlow/blob/repo-refactor/lib/runtime/src/task_spec/README.md for documentation)
- […] (`DataTypeDispatch*`)
- […] `compiler`
- […] `Tensor` and `Layer` to be immutable
- […] (`tensor_guid_t`, `LayerID`, `parallel_tensor_guid_t`, etc.)
- […] `utils`, including new open graph types (see https://github.com/flexflow/FlexFlow/blob/repo-refactor/lib/utils/include/utils/graph/README.md for documentation)
- […] (`containers.h`) in `utils`
- […] (`visit_struct`)
- […] (`stack_vector`, `stack_map`, `stack_string`)
- […] `cmake` code to be more modern and concise
- […] `doctest` and `rapidcheck`, also add dependencies on `tl::expected`, `fmt`, `spdlog`
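The changelist names handle types (`tensor_guid_t`, `LayerID`, `parallel_tensor_guid_t`) alongside making `Tensor` and `Layer` immutable and moving construction into factory classes. As a rough illustration of that general pattern only, here is a minimal sketch. Every name in it (`TensorAttrs`, `TensorHandle`, `TensorStore`) is hypothetical and not taken from the PR, and the real FlexFlow types may look quite different.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical immutable description of a tensor. Callers only ever
// receive a const reference to it, so it cannot be mutated after creation.
struct TensorAttrs {
  std::vector<std::size_t> dims;
  std::string name;
};

// Hypothetical strongly-typed handle. It wraps an index instead of being
// a typedef'ed pointer, so it is cheap to copy and cannot dangle by itself.
struct TensorHandle {
  std::size_t idx;

  friend bool operator==(TensorHandle lhs, TensorHandle rhs) {
    return lhs.idx == rhs.idx;
  }
};

// Hypothetical factory/owner: the only place tensors are constructed.
class TensorStore {
public:
  // Stores the attributes and returns a fresh handle to them.
  TensorHandle create(TensorAttrs attrs) {
    tensors_.push_back(std::move(attrs));
    return TensorHandle{tensors_.size() - 1};
  }

  // Read-only lookup; throws std::out_of_range for an unknown handle.
  TensorAttrs const &get(TensorHandle handle) const {
    return tensors_.at(handle.idx);
  }

private:
  std::vector<TensorAttrs> tensors_;
};
```

In this sketch, code that used to hold a raw pointer holds a `TensorHandle` and asks the store for a read-only view whenever it needs the attributes, which keeps ownership in one place.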
Remaining Todos:
- […] `op-attrs` #819
Related Issues:
Linked Issues:
Issues closed by this PR:
- […] `ElementUnary` #440
- […] `MultiHeadAttention` #446
- […] `LayerNorm` #445
- […] `ParallelTensor` to not be a typedef'ed pointer #302
- […] `fused.cpp` and `fused.cu` #569
- […] `layer_guid` with separate `get_tensor` handling #304
Before merging: