PR #18838: [NVIDIA GPU] Support multi-operand collective-permute #19424
+653
−161
PR #18838: [NVIDIA GPU] Support multi-operand collective-permute
Imported from GitHub PR #18838
For collective-permutes with small message sizes, it is beneficial to combine them into a single collective because
Before collective-permutes can be combined, multi-operand collective-permute (a.k.a. the combined collective-permute) must be supported. This PR extends the existing collective-permute (CP) interface with an overload, so that a single CP can take multiple operands.
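To illustrate what a multi-operand CP means semantically, here is a minimal host-side sketch in plain Python. It is not the XLA interface added by this PR: the function name `simulate_multi_operand_cp` and the dict-of-lists data layout are hypothetical and chosen only for illustration. It assumes the usual collective-permute semantics, where all operands travel along the same source-target pairs and a device that is not a target receives zeros.

```python
from typing import Dict, List, Sequence, Tuple


def simulate_multi_operand_cp(
    operands_per_device: Dict[int, List[List[int]]],
    source_target_pairs: Sequence[Tuple[int, int]],
) -> Dict[int, List[List[int]]]:
    """Simulate a collective-permute that carries several operands at once.

    operands_per_device maps a device id to its list of operand buffers
    (hypothetical layout, for illustration only). Every operand is moved
    along the same (source, target) pairs.
    """
    sources = [s for s, _ in source_target_pairs]
    targets = [t for _, t in source_target_pairs]
    # No two pairs may share a source or share a target.
    if len(set(sources)) != len(sources):
        raise ValueError("each device may be a source at most once")
    if len(set(targets)) != len(targets):
        raise ValueError("each device may be a target at most once")

    # Devices that are not a target of any pair end up with zeros.
    result = {
        dev: [[0] * len(buf) for buf in bufs]
        for dev, bufs in operands_per_device.items()
    }
    # One logical permute moves all operands of the source to the target.
    for src, dst in source_target_pairs:
        result[dst] = [list(buf) for buf in operands_per_device[src]]
    return result


# Two devices, each holding two operands of different sizes.
devices = {0: [[1, 2], [10]], 1: [[3, 4], [20]]}
out = simulate_multi_operand_cp(devices, [(0, 1)])
print(out)
```

With a single-operand CP, the two buffers above would need two separate collectives; the combined form moves both in one operation.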
Copybara import of the project:
--
5e10aba by Terry Sun:
support multi-operand cp
--
170fead by Terry Sun:
minor refactoring
--
0d85070 by Terry Sun:
update python interface
--
9812a10 by Terry Sun:
polish python interface
--
3a1552c by Terry Sun:
formatting
--
d3657f8 by Terry Sun:
formatting
--
c9202fa by Terry Sun:
fix minor issues
Merging this change closes #18838
FUTURE_COPYBARA_INTEGRATE_REVIEW=#18838 from terryysun:terryysun/grouped_cp c9202fa