Fix cute gemm dispatch-5 when C != D #1618

Open
wants to merge 1 commit into
base: main

sjfeng1999
Contributor

No description provided.

@thakkarV
Collaborator

thakkarV commented Jul 9, 2024

I don't think this is a bug. cute::gemm() has no way of telling which MMA is the first in your sequence. Whatever collective invokes this gemm is responsible for knowing which MMA is first in the sequence of all tiled MMAs, and for zeroing out the D registers before that first MMA. At the scope of a single gemm() dispatch, we cannot know whether k==0 is the first MMA of the sequence or not.
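To illustrate the division of responsibility described here, the following is a minimal scalar sketch in plain C++, not CuTe code. The names `tile_gemm_accumulate` and `mainloop` are hypothetical stand-ins for a single gemm() dispatch with C == D and for the calling collective; the assumption is that the caller owns the accumulator and clears it exactly once.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one gemm() dispatch with C == D:
// it accumulates into acc and never clears it, because it cannot
// know whether this is the first dispatch in the sequence.
inline void tile_gemm_accumulate(float& acc,
                                 const std::vector<float>& a,
                                 const std::vector<float>& b) {
  for (std::size_t k = 0; k < a.size(); ++k) {
    acc += a[k] * b[k];
  }
}

// Hypothetical caller (the "collective"): it knows where the sequence
// starts, so it zeroes the accumulator before the first dispatch.
inline float mainloop(const std::vector<std::vector<float>>& a_tiles,
                      const std::vector<std::vector<float>>& b_tiles) {
  float acc = 0.0f;  // zeroed once, by the caller
  for (std::size_t t = 0; t < a_tiles.size(); ++t) {
    tile_gemm_accumulate(acc, a_tiles[t], b_tiles[t]);
  }
  return acc;
}
```

In this sketch the per-tile function is correct only because the caller initialized `acc`; the same function called on an uninitialized accumulator would return garbage.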

@thakkarV thakkarV closed this Jul 9, 2024
@sjfeng1999
Contributor Author

As I understand it, cute::gemm() computes a GEMM with a static problem shape (V, M, N, K):
$$D_{mn} = \sum_k A_{mk} B_{nk} + C_{mn}$$
This should hold regardless of whether the call is part of an outer GEMM loop and whether C and D are zero-initialized.
For example, with K = 2, the original code

CUTE_UNROLL
for (int k = 0; k < K; ++k) {
  gemm(mma, D, A(_,_,k), B(_,_,k), C);
}

will be unrolled to

gemm(mma, D, A(_,_,0), B(_,_,0), C); // D_mn = A_m0 * B_n0 + C_mn
gemm(mma, D, A(_,_,1), B(_,_,1), C); // D_mn = A_m1 * B_n1 + C_mn

The result of k=1 overwrites the result of k=0, so the k=0 contribution is lost. The loop is only correct when C and D are the same tensor, which is what we do in the CollectiveMainloop (Accum = Amk x Bnk + Accum). In the general case, where C != D, it is incorrect.
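The overwrite can be reproduced with a scalar model in plain C++, with no CuTe dependency. This is only a sketch: `fma_step`, `k_loop_reads_c`, and `k_loop_chained` are hypothetical names, each modelling one (m, n) element, and the chained variant is one possible way to get the summed result, not necessarily the change made in this PR.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for a single MMA on one (m, n) element: a * b + c.
inline float fma_step(float a, float b, float c) { return a * b + c; }

// Mirrors the loop quoted above with C != D: every k-step reads the
// original C, so each step overwrites D and earlier steps are lost.
template <std::size_t K>
float k_loop_reads_c(const std::array<float, K>& a,
                     const std::array<float, K>& b, float c) {
  float d = 0.0f;
  for (std::size_t k = 0; k < K; ++k) {
    d = fma_step(a[k], b[k], c);
  }
  return d;
}

// One possible alternative (an assumption, not the PR's actual diff):
// seed D with C once, then chain every k-step through D.
template <std::size_t K>
float k_loop_chained(const std::array<float, K>& a,
                     const std::array<float, K>& b, float c) {
  float d = c;
  for (std::size_t k = 0; k < K; ++k) {
    d = fma_step(a[k], b[k], d);
  }
  return d;
}
```

With a = {1, 2}, b = {3, 4}, and c = 10, the first variant returns 2 * 4 + 10 = 18, while the chained variant returns 1 * 3 + 2 * 4 + 10 = 21, which matches the formula for D_mn.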

Looking forward to your reply, thanks.

@sjfeng1999
Contributor Author

@thakkarV


This PR has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this PR if it is no longer required. Otherwise, please respond with a comment indicating any updates. This PR will be labeled inactive-90d if there is no activity in the next 60 days.
