Inject desired pattern for handling Transpose for fp8 gemm rewrite #17440

Closed
wants to merge 9 commits into from

wenscarl
Contributor

Related to #17276 and #16975.
This PR updates GemmRewriter to handle the transpose of operands with non-descending layouts directly, so the layout_normalization pass no longer has to correct this error-prone pattern after the rewrite. The desired transformation is injected into GemmRewriter itself. It transforms the following error-prone pattern, in which a transpose is applied to an operand with a non-descending layout:

a = f8e4m3fn[x,y]{0,1} xxx
transpose.0 = f8e4m3fn[y,x]{0,1} transpose(a), dimensions=(1,0)
custom-call(a,...)

to

a = f8e4m3fn[x,y]{0,1} xxx
bt = f8e4m3fn[y,x]{1,0} bitcast(a)
transpose.1 = f8e4m3fn[x,y]{1,0} transpose(bt), dimensions=(1,0)
bt.1= f8e4m3fn[y,x]{0,1} bitcast(transpose.1)
custom-call(bt.1,...)
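
To illustrate why the two bitcasts in the rewritten pattern are pure reinterpretations of the same buffer, here is a minimal NumPy sketch. It is not XLA code. It assumes a 2D operand and models the XLA layout `{0,1}` as column-major storage and `{1,0}` as row-major storage; the helper names `physical_buffer` and `from_physical` are hypothetical.

```python
import numpy as np


def _order(minor_to_major):
    # {0,1}: dimension 0 varies fastest (column-major, NumPy order "F").
    # {1,0}: dimension 1 varies fastest (row-major, NumPy order "C").
    return "F" if tuple(minor_to_major) == (0, 1) else "C"


def physical_buffer(logical, minor_to_major):
    """Linear buffer that a 2D logical array occupies under the given layout."""
    return logical.flatten(order=_order(minor_to_major))


def from_physical(buf, shape, minor_to_major):
    """Reinterpret a linear buffer as a 2D logical array (models a bitcast)."""
    return buf.reshape(shape, order=_order(minor_to_major))


x, y = 2, 3
a = np.arange(x * y).reshape(x, y)       # a: [x,y], stored with layout {0,1}
buf_a = physical_buffer(a, (0, 1))

bt = from_physical(buf_a, (y, x), (1, 0))        # bt: [y,x]{1,0}, same bytes as a
transpose_1 = bt.T                               # transpose.1: [x,y]{1,0}
buf_t = physical_buffer(transpose_1, (1, 0))     # the only step that moves data
bt_1 = from_physical(buf_t, (y, x), (0, 1))      # bt.1: [y,x]{0,1}, same bytes as transpose.1

# bt.1 is logically the transpose of a, and its buffer matches what a direct
# [y,x]{0,1} transpose of a would produce.
direct = physical_buffer(a.T, (0, 1))
```

In this model only `transpose.1` moves data, and it operates on a descending `{1,0}` layout; `bt` and `bt.1` share their buffers with `a` and `transpose.1` respectively.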

@elfiegg
Contributor

elfiegg commented Sep 24, 2024

Also cc @akuegel

Member

@akuegel akuegel left a comment

Can you please also add a test?

@akuegel
Member

akuegel commented Sep 30, 2024

I don't know how to mark a conversation as unresolved again, so for visibility I am also copying my answer to a resolved conversation here:

I am not sure whether I understand your argument correctly, but it seems to me that you assume RowMajor means the XLA layout of the operand is the default layout. Why can we assume that? And even if it turns out that we don't need bitcast-transpose-bitcast because the operand already has the default layout, AlgebraicSimplifier would simplify the bitcasts away. So my suggestion is to always generate the bitcast-transpose-bitcast sequence and let AlgebraicSimplifier clean it up later. Alternatively, don't distinguish between whether the call was for the RowMajor -> ColumnMajor swap or for the ColumnMajor -> RowMajor swap, and instead check whether the operand has the default layout (and in the case of the default layout, use the if (!col_maj) code path). That would be cleaner: we would not need the col_maj parameter, and we would be checking exactly the precondition that determines whether we need bitcast-transpose-bitcast or whether a single transpose is enough.
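
The alternative suggested above could be sketched roughly as follows. This is an illustrative Python model for the 2D case only, not the GemmRewriter implementation; `is_default_layout` and `plan_transpose` are hypothetical names, and layouts are written as minor-to-major tuples.

```python
def is_default_layout(minor_to_major):
    """True if the layout is descending, e.g. (1, 0) for a rank-2 shape."""
    rank = len(minor_to_major)
    return tuple(minor_to_major) == tuple(range(rank - 1, -1, -1))


def plan_transpose(shape, minor_to_major):
    """Return the (op, result_shape, result_layout) steps for a 2D transpose.

    The decision depends only on the operand layout, so no col_maj-style
    flag is needed (assumption: rank-2 operand with layout (1, 0) or (0, 1)).
    """
    x, y = shape
    if is_default_layout(minor_to_major):
        # Default layout: a single transpose is enough.
        return [("transpose", (y, x), (1, 0))]
    # Non-descending layout: wrap the transpose in two bitcasts so that the
    # transpose itself only sees a descending layout.
    return [
        ("bitcast", (y, x), (1, 0)),
        ("transpose", (x, y), (1, 0)),
        ("bitcast", (y, x), (0, 1)),
    ]


# Operand a = f8e4m3fn[x,y]{0,1} from the PR description, with x=2, y=3.
steps = plan_transpose((2, 3), (0, 1))
```

For the `{0,1}` operand, `steps` reproduces the bitcast, transpose, bitcast sequence from the PR description; for a `{1,0}` operand the plan collapses to one transpose.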

copybara-service bot pushed a commit that referenced this pull request Oct 1, 2024
… rewrite

Imported from GitHub PR #17440

Related to #17276 and #16975.
This PR updates the GemmRewriter to handle the transpose of non-descending layouts directly, eliminating the need for the layout_normalization pass to correct this error-prone pattern post-rewrite. The desired transformation is now injected into GemmRewriter, ensuring the problematic layout is handled internally. This PR transforms the following error-prone pattern, where the transpose of a non-descending layout is the issue:
```
a = f8e4m3fn[x,y]{0,1} xxx
transpose.0 = f8e4m3fn[y,x]{0,1} transpose(a), dimensions=(1,0)
custom-call(a,...)
```
to
```
a = f8e4m3fn[x,y]{0,1} xxx
bt = f8e4m3fn[y,x]{1,0} bitcast(a)
transpose.1 = f8e4m3fn[x,y]{1,0} transpose(bt), dimensions=(1,0)
bt.1= f8e4m3fn[y,x]{0,1} bitcast(transpose.1)
custom-call(bt.1,...)
```
Copybara import of the project:

--
237c032 by shuw <[email protected]>:

Improve TransposeMatrix

--
508cd69 by Shu Wang <[email protected]>:

Fix bug of permutation.
--
c55e8a9 by shuw <[email protected]>:

clang format

--
ad0a4ba by Shu Wang <[email protected]>:

Add unittest.
--
1d45b4d by Shu Wang <[email protected]>:

Remove uncessary space.
--
7837845 by Shu Wang <[email protected]>:

Update unittest.

--
b479c21 by shuw <[email protected]>:

Improve TransposeMatrix

Merging this change closes #17440

FUTURE_COPYBARA_INTEGRATE_REVIEW=#17440 from wenscarl:fp8_regulate_transpose b479c21
PiperOrigin-RevId: 680886834
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request Oct 1, 2024
Member

@akuegel akuegel left a comment

Internal testing revealed some issues. Please see my comments.

copybara-service bot pushed a commit that referenced this pull request Oct 1, 2024
FUTURE_COPYBARA_INTEGRATE_REVIEW=#17440 from wenscarl:fp8_regulate_transpose 824ac54
PiperOrigin-RevId: 680886834
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request Oct 1, 2024
FUTURE_COPYBARA_INTEGRATE_REVIEW=openxla/xla#17440 from wenscarl:fp8_regulate_transpose 824ac5425f1529326086c86f1cc7f31eee1fee9b
PiperOrigin-RevId: 680886834
copybara-service bot pushed a commit that referenced this pull request Oct 2, 2024
--
b633184 by Shu Wang <[email protected]>:

Update unittest shape and BUILD file.

Merging this change closes #17440

FUTURE_COPYBARA_INTEGRATE_REVIEW=#17440 from wenscarl:fp8_regulate_transpose b633184
PiperOrigin-RevId: 680886834
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request Oct 2, 2024
--
b63318487153a8668b9f95574b054b0129194c0c by Shu Wang <[email protected]>:

Update unittest shape and BUILD file.

Merging this change closes #17440

FUTURE_COPYBARA_INTEGRATE_REVIEW=openxla/xla#17440 from wenscarl:fp8_regulate_transpose b63318487153a8668b9f95574b054b0129194c0c
PiperOrigin-RevId: 680886834
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request Oct 2, 2024
… rewrite

Imported from GitHub PR openxla/xla#17440

Related to openxla/xla#17276 and openxla/xla#16975.
This PR updates the GemmRewriter to handle the transpose of non-descending layouts directly, eliminating the need for the layout_normalization pass to correct this error-prone pattern post-rewrite. The desired transformation is now injected into GemmRewriter, ensuring the problematic layout is handled internally. This PR transforms the following error-prone pattern, where the transpose of a non-descending layout is the issue:
```
a = f8e4m3fn[x,y]{0,1} xxx
transpose.0 = f8e4m3fn[y,x]{0,1} transpose(a), dimensions=(1,0)
custom-call(a,...)
```
to
```
a = f8e4m3fn[x,y]{0,1} xxx
bt = f8e4m3fn[y,x]{1,0} bitcast(a)
transpose.1 = f8e4m3fn[x,y]{1,0} transpose(bt), dimensions=(1,0)
bt.1 = f8e4m3fn[y,x]{0,1} bitcast(transpose.1)
custom-call(bt.1,...)
```
Copybara import of the project:

--
237c03240da3dce736d92c8273dc1f9d3be53af5 by shuw <[email protected]>:

Improve TransposeMatrix

--
508cd6928bbc20c1d87818eed4ee6190c6c9f691 by Shu Wang <[email protected]>:

Fix bug of permutation.
--
c55e8a9f64c8dac69907ccebce3b8109ddeb2c48 by shuw <[email protected]>:

clang format

--
ad0a4ba8054092dd79608865a823c1d432f81b21 by Shu Wang <[email protected]>:

Add unittest.
--
1d45b4d64347c64a9483fd26caf7d8598818b855 by Shu Wang <[email protected]>:

Remove unnecessary space.
--
78378455e70e439e71da078c3099732a14292d7d by Shu Wang <[email protected]>:

Update unittest.

--
b479c2177672a0010ffba1630efdaec5ca4cee26 by shuw <[email protected]>:

Improve TransposeMatrix

--
b63318487153a8668b9f95574b054b0129194c0c by Shu Wang <[email protected]>:

Update unittest shape and BUILD file.

Merging this change closes #17440

FUTURE_COPYBARA_INTEGRATE_REVIEW=openxla/xla#17440 from wenscarl:fp8_regulate_transpose b63318487153a8668b9f95574b054b0129194c0c
PiperOrigin-RevId: 680886834
Shape normalized_input_shape =
    ShapeUtil::MakeShapeWithDescendingLayoutAndSamePhysicalLayout(
        input_shape);
auto a0 = MakeBitcastHlo(instr, normalized_input_shape);
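
For readers unfamiliar with the helper used in the snippet above, the idea of "descending layout, same physical layout" can be sketched as follows. This is an illustrative approximation only, not XLA's implementation: `SimpleShape` and `DescendingLayoutSamePhysical` are hypothetical names, and tiling, dynamic dimensions and element types are ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified shape: logical dims plus a minor-to-major layout.
struct SimpleShape {
  std::vector<std::int64_t> dims;
  std::vector<std::int64_t> minor_to_major;
};

// Reorder the dims so that they are listed from most-major to most-minor,
// then attach the descending layout {rank-1, ..., 1, 0}. The bytes in memory
// are interpreted in exactly the same order as before.
SimpleShape DescendingLayoutSamePhysical(const SimpleShape& shape) {
  const std::size_t rank = shape.dims.size();
  SimpleShape out;
  out.dims.resize(rank);
  out.minor_to_major.resize(rank);
  for (std::size_t k = 0; k < rank; ++k) {
    // minor_to_major[rank - 1 - k] is the k-th most-major original dimension.
    out.dims[k] = shape.dims[shape.minor_to_major[rank - 1 - k]];
    out.minor_to_major[k] = static_cast<std::int64_t>(rank - 1 - k);
  }
  return out;
}
```

For the case in this PR, `[x,y]{0,1}` becomes `[y,x]{1,0}`, which is the shape of the first bitcast in the rewritten pattern.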
Contributor

Would it be conceptually simpler to insert a copy before the transpose to change the layout of the input? (This assumes that the copy -> transpose sequence is optimized by another pass which I haven't verified.)

Also, can we pick a more descriptive variable name here?

Member

This pass runs after layout normalization which turns copies into bitcast + transpose, so it should not produce any Copy ops that change the layout (otherwise we would have to run layout normalization again).
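
A small sketch of why a layout-changing copy and a bitcast followed by a transpose are interchangeable in the rank-2 case. This only illustrates the equivalence on flat buffers; it is not how the layout normalization pass is implemented, and `LayoutChangingCopy` and `BitcastThenTranspose` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Layout-changing copy: [x,y]{0,1} (column-major) -> [x,y]{1,0} (row-major).
// Logical values are preserved: out(i, j) = a(i, j).
std::vector<int> LayoutChangingCopy(const std::vector<int>& a, std::size_t x,
                                    std::size_t y) {
  std::vector<int> out(a.size());
  for (std::size_t i = 0; i < x; ++i)
    for (std::size_t j = 0; j < y; ++j)
      out[i * y + j] = a[j * x + i];  // a(i, j) sits at offset j * x + i
  return out;
}

// Same result without a layout-changing op: view the buffer as [y,x]{1,0}
// (a bitcast, no data movement), then transpose with both operand and result
// in the descending layout.
std::vector<int> BitcastThenTranspose(const std::vector<int>& a, std::size_t x,
                                      std::size_t y) {
  const std::vector<int>& bt = a;  // bitcast: bt(p, q) sits at offset p * x + q
  std::vector<int> out(a.size());
  for (std::size_t i = 0; i < x; ++i)
    for (std::size_t j = 0; j < y; ++j)
      out[i * y + j] = bt[j * x + i];  // out(i, j) = bt(j, i)
  return out;
}
```

The two loops end up identical, which is the point: at the buffer level the copy already is a transpose of the bitcast view, so expressing it that way keeps every op on descending layouts.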

@copybara-service copybara-service bot closed this in 7d4740a Oct 2, 2024
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request Oct 2, 2024
… rewrite

Imported from GitHub PR openxla/xla#17440

Related to openxla/xla#17276 and openxla/xla#16975.
This PR updates the GemmRewriter to handle the transpose of non-descending layouts directly, eliminating the need for the layout_normalization pass to correct this error-prone pattern post-rewrite. The desired transformation is now injected into GemmRewriter, ensuring the problematic layout is handled internally. This PR transforms the following error-prone pattern, where the transpose of a non-descending layout is the issue:
```
a = f8e4m3fn[x,y]{0,1} xxx
transpose.0 = f8e4m3fn[y,x]{0,1} transpose(a), dimensions=(1,0)
custom-call(a,...)
```
to
```
a = f8e4m3fn[x,y]{0,1} xxx
bt = f8e4m3fn[y,x]{1,0} bitcast(a)
transpose.1 = f8e4m3fn[x,y]{1,0} transpose(bt), dimensions=(1,0)
bt.1 = f8e4m3fn[y,x]{0,1} bitcast(transpose.1)
custom-call(bt.1,...)
```
Copybara import of the project:

--
237c03240da3dce736d92c8273dc1f9d3be53af5 by shuw <[email protected]>:

Improve TransposeMatrix

--
508cd6928bbc20c1d87818eed4ee6190c6c9f691 by Shu Wang <[email protected]>:

Fix bug of permutation.
--
c55e8a9f64c8dac69907ccebce3b8109ddeb2c48 by shuw <[email protected]>:

clang format

--
ad0a4ba8054092dd79608865a823c1d432f81b21 by Shu Wang <[email protected]>:

Add unittest.
--
1d45b4d64347c64a9483fd26caf7d8598818b855 by Shu Wang <[email protected]>:

Remove unnecessary space.
--
78378455e70e439e71da078c3099732a14292d7d by Shu Wang <[email protected]>:

Update unittest.

--
b479c2177672a0010ffba1630efdaec5ca4cee26 by shuw <[email protected]>:

Improve TransposeMatrix

--
b63318487153a8668b9f95574b054b0129194c0c by Shu Wang <[email protected]>:

Update unittest shape and BUILD file.

Merging this change closes #17440

PiperOrigin-RevId: 681551009