-
Notifications
You must be signed in to change notification settings - Fork 229
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Memory optimization algorithm #1523
base: repo-refactor
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 11 of 17 files at r1, all commit messages.
Reviewable status: 11 of 17 files reviewed, 10 unresolved discussions (waiting on @wmdi)
lib/compiler/include/compiler/cost_estimator/cost_estimator.h
line 16 at r1 (raw file):
struct ICostEstimator { virtual CostMetric estimate_cost(OpCostEstimateKey const &) const = 0; virtual CostMetric estimate_cost(TensorSetMovement const &) const = 0;
Why does communication cost return a memory usage?
lib/compiler/include/compiler/machine_mapping/feasible_machine_mapping_result.struct.toml
line 16 at r1 (raw file):
[[fields]] name = "cost" type = "::FlexFlow::CostMetric"
Separate out into different result type
lib/compiler/include/compiler/machine_mapping/get_optimal_machine_mapping.h
line 23 at r1 (raw file):
MachineSpecification const &resources, MachineMappingConstraints const &constraints, MachineMemoryConstraints const &memory_constraints,
I think for now let's separate the memory optimization algorithm from the existing algorithm. There may be a bit of code duplication, but I don't think it should be too much as they can both use a lot of existing infrastructure, and it is likely to make things a lot more maintainable in the long run.
lib/compiler/include/compiler/machine_mapping/machine_mapping_config.struct.toml
line 2 at r1 (raw file):
namespace = "FlexFlow" name = "MachineMappingConfig"
Remove in favor of separate algorithm implementation
lib/compiler/include/compiler/machine_mapping/machine_mapping_result.h
line 27 at r1 (raw file):
¶llel_split_transformation); [[nodiscard]] MachineMappingResult parallel_combine(MachineMappingConfig const &config,
Add operations for a new result type rather than overriding these
lib/compiler/include/compiler/machine_mapping/machine_memory_constraints/machine_memory_constraints.struct.toml
line 2 at r1 (raw file):
namespace = "FlexFlow" name = "MachineMemoryConstraints"
Probably better to just make this a field on MachineSpecification
lib/compiler/src/compiler/cost_estimator/cost_metric.cc
line 10 at r1 (raw file):
/*memory=*/0, }; }
I'm not really convinced this is all that useful of a function?
lib/compiler/src/compiler/cost_estimator/cost_metric.cc
line 36 at r1 (raw file):
for (CostMetric const &cost : costs) { result = combine_cost_metrics_intra_device_sequential(result, cost); }
foldl1
from utils/containers
?
Code quote:
CostMetric result = zero_cost_metric();
for (CostMetric const &cost : costs) {
result = combine_cost_metrics_intra_device_sequential(result, cost);
}
lib/compiler/src/compiler/cost_estimator/cost_metric.cc
line 40 at r1 (raw file):
} CostMetric combine_cost_metrics_intra_device_parallel(CostMetric const &c1,
What does "intra_device" mean here?
Code quote:
intra_device
lib/compiler/src/compiler/cost_estimator/cost_metric.cc
line 51 at r1 (raw file):
for (CostMetric const &cost : costs) { result = combine_cost_metrics_intra_device_parallel(result, cost); }
foldl1
from utils/containers
?
Code quote:
CostMetric result = zero_cost_metric();
for (CostMetric const &cost : costs) {
result = combine_cost_metrics_intra_device_parallel(result, cost);
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 2 of 26 files reviewed, 10 unresolved discussions (waiting on @lockshaw)
lib/compiler/include/compiler/cost_estimator/cost_estimator.h
line 16 at r1 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Why does communication cost return a memory usage?
Done.
lib/compiler/include/compiler/machine_mapping/get_optimal_machine_mapping.h
line 23 at r1 (raw file):
Previously, lockshaw (Colin Unger) wrote…
I think for now let's separate the memory optimization algorithm from the existing algorithm. There may be a bit of code duplication, but I don't think it should be too much as they can both use a lot of existing infrastructure, and it is likely to make things a lot more maintainable in the long run.
Done.
lib/compiler/include/compiler/machine_mapping/machine_mapping_config.struct.toml
line 2 at r1 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Remove in favor of separate algorithm implementation
Done.
lib/compiler/include/compiler/machine_mapping/machine_memory_constraints/machine_memory_constraints.struct.toml
line 2 at r1 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Probably better to just make this a field on
MachineSpecification
Done.
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## repo-refactor #1523 +/- ##
=================================================
+ Coverage 78.16% 78.41% +0.24%
=================================================
Files 860 866 +6
Lines 27994 28917 +923
Branches 770 784 +14
=================================================
+ Hits 21881 22674 +793
- Misses 6113 6243 +130
Flags with carried forward coverage won't be shown. Click here to find out more.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main thing is some unit testing should be added--currently the test coverage for this PR is 3.5%.
Reviewed 2 of 17 files at r1, 22 of 22 files at r2, all commit messages.
Reviewable status: all files reviewed, 16 unresolved discussions (waiting on @wmdi)
lib/compiler/include/compiler/cost_estimator/cost_estimator.h
line 18 at r2 (raw file):
virtual float estimate_cost(TensorSetMovement const &) const = 0; virtual CostMetric estimate_cost_with_memory(OpCostEstimateKey const &) const = 0;
I don't think we need another function here--communication shouldn't return a memory usage, but returning memory usage for an operator is fine.
it would be nice to get a more descriptive name than CostMetric
, maybe something like OperatorCostMetrics
to specify it's just for operators and it contains more than one "metric"?
Suggestion:
virtual CostMetric estimate_cost(OpCostEstimateKey const &) const = 0;
virtual float estimate_cost(TensorSetMovement const &) const = 0;
lib/compiler/include/compiler/cost_estimator/cost_estimator.h
line 31 at r2 (raw file):
float estimate_cost(OpCostEstimateKey const &k) const; float estimate_cost(TensorSetMovement const &m) const; CostMetric estimate_cost_with_memory(OpCostEstimateKey const &k) const;
Suggestion:
CostMetric estimate_cost(OpCostEstimateKey const &k) const;
float estimate_cost(TensorSetMovement const &m) const;
lib/compiler/include/compiler/cost_estimator/cost_metric.h
line 7 at r2 (raw file):
#include <vector> namespace FlexFlow {
These operations feel like something algorithm-specific rather than relevant to cost_estimator--I have no problem with them existing, but maybe the file should be somewhere else? Would the new machine_mapping_with_memory
directory be appropriate?
lib/compiler/include/compiler/machine_mapping/machine_mapping_config.struct.toml
line 2 at r1 (raw file):
Previously, wmdi (Mengdi Wu) wrote…
Done.
This file still seems to be here? Is it still necessary?
lib/compiler/include/compiler/machine_mapping/memory_optimization/machine_mapping_cache_with_memory.struct.toml
line 0 at r2 (raw file):
Rename to machine_mapping_with_memory_cache.struct.toml
lib/compiler/include/compiler/machine_mapping/memory_optimization/machine_mapping_cache_with_memory.struct.toml
line 2 at r2 (raw file):
namespace = "FlexFlow" name = "MachineMappingCacheWithMemory"
Suggestion:
name = "MachineMappingWithMemoryCache"
lib/compiler/include/compiler/machine_mapping/memory_optimization/machine_mapping_result_with_memory.struct.toml
line 2 at r2 (raw file):
namespace = "FlexFlow" name = "MachineMappingResultWithMemory"
Suggestion:
ame = "MachineMappingWithMemoryResult"
lib/compiler/include/compiler/machine_mapping/memory_optimization/machine_mapping_result_with_memory.struct.toml
line 20 at r2 (raw file):
[[fields]] name = "machine_mappings" type = "std::unordered_set<::FlexFlow::SingleMachineMapping>"
What does this set represent? What is the meaning behind a SingleMachineMapping
--is that a machine mapping for a single layer, or a single pareto-optimal solution, or something else entirely? SingleMachineMapping
likely needs a clearer name.
Code quote:
type = "std::unordered_set<::FlexFlow::SingleMachineMapping>"
lib/compiler/src/compiler/machine_mapping/memory_optimization/machine_mapping_cache_with_memory.cc
line 12 at r2 (raw file):
std::optional<MachineMappingResultWithMemory> machine_mapping_cache_with_memory_load(
For all the functions here
Suggestion:
machine_mapping_with_memory_cache_load(
lib/compiler/src/compiler/machine_mapping/memory_optimization/machine_mapping_result_with_memory.cc
line 26 at r2 (raw file):
} MachineMappingResultWithMemory remove_non_dominating_machine_mapping_result(
Suggestion:
remove_non_pareto_optimal_machine_mapping_results
lib/compiler/test/src/compiler/machine_mapping/cost_estimator_for_test.h
line 32 at r2 (raw file):
CostMetric estimate_cost_with_memory(OpCostEstimateKey const &) const override; };
Suggestion:
struct TestCostEstimator : public ICostEstimator {
std::function<CostMetric(OpCostEstimateKey const &)> get_operator_cost;
std::function<float(TensorSetMovement const &)> get_communication_cost;
TestCostEstimator() = delete;
TestCostEstimator(decltype(get_operator_cost) const &get_operator_cost,
decltype(get_communication_cost)
const &get_communication_cost,
decltype(get_operator_cost_with_memory)
const &get_operator_cost_with_memory);
CostMetric estimate_cost(OpCostEstimateKey const &) const override;
float estimate_cost(TensorSetMovement const &) const override;
};
lib/compiler/test/src/compiler/machine_mapping/cost_estimator_for_test.h
line 54 at r2 (raw file):
std::unordered_map<TensorSetMovement, float> const &comm_cost_map, std::unordered_map<OpCostEstimateKey, CostMetric> const &op_cost_with_memory_map);
Suggestion:
CostEstimator make_fake_cost_estimator(
std::function<CostMetric(OpCostEstimateKey const &)> const &get_operator_cost,
std::function<float(TensorSetMovement const &)> const
&get_communication_cost);
CostEstimator make_fake_cost_estimator(
std::unordered_map<OpCostEstimateKey, CostMetric> const &op_cost_map,
std::unordered_map<TensorSetMovement, float> const &comm_cost_map);
Description of changes:
Related Issues:
Linked Issues:
Issues closed by this PR:
This change is