[BACKEND] Support convert_layout with num_ctas > 1 Using Linear Layout #4782
Conversation
Pinging @lezcano because he expressed interest in reviewing LL-related PRs.
Looks good, thanks!
@@ -926,6 +937,7 @@ std::optional<LinearLayout> chooseStMatrixLayoutNoLeadingOffset(
StringAttr kWarp = S("warp");
StringAttr kCol = S("dim1");
StringAttr kRow = S("dim0");
StringAttr kBlock = S("block");
bikeshedding: I hope the name is explicit enough. The term "block" is badly overloaded, unfortunately. We could name it CTA, but that's very NVIDIA-specific.
Sure, I tried changing it to CTA but found it a bit annoying, since it touches a lot of places. Let me sort it out in the future.
…ayout (triton-lang#4782)

Particularly, this PR implements layout conversion when a CGA contains more than one CTA. In such cases, a Triton tensor is split into multiple blocks, with each block being handled by a CTA.

```
block0 | block1
----------------
block2 | block3
```

If data transfer is required from block0 to block3, this PR cannot handle it, and we use `isCrossCTAConversion` to check this condition.