On the CUDA matmul optimization in hw3: are grid and block swapped? #4

Open
wplf opened this issue May 8, 2024 · 0 comments
wplf commented May 8, 2024

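When launching the matmul kernel, your code uses grid = (256, 256, 1) and block = (ceil(M/256), ceil(P/256), 1).
Are these two variables swapped? Normally block specifies the number of threads per block and grid specifies the number of blocks, but your program does exactly the opposite.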

  /// BEGIN YOUR SOLUTION
  dim3 grid(BASE_THREAD_NUM, BASE_THREAD_NUM, 1);
  dim3 block((M + BASE_THREAD_NUM - 1) / BASE_THREAD_NUM, (P + BASE_THREAD_NUM - 1) / BASE_THREAD_NUM, 1);
  MatmulKernel<<<grid, block>>>(a.ptr, b.ptr, out->ptr, M, N, P);
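
For comparison, here is a rough sketch of what I would have expected the conventional configuration to look like. With grid = (256, 256, 1) the launch always creates 65,536 blocks regardless of the matrix size, and block = (ceil(M/256), ceil(P/256), 1) stays within the 1024 threads-per-block limit only while the product of its two dimensions is at most 1024. The names `TILE`, `ceil_div`, `NaiveMatmulKernel`, and `LaunchNaiveMatmul` are hypothetical and not from the repository. The kernel assumes row-major float arrays and one thread per output element, which may differ from how the actual `MatmulKernel` indexes its data.

```cuda
#include <cuda_runtime.h>
#include <cstdint>

// Hypothetical tile width: 16 x 16 = 256 threads per block (limit is 1024).
constexpr uint32_t TILE = 16;

// Integer ceiling division, used to round the number of blocks up.
inline uint32_t ceil_div(uint32_t n, uint32_t d) { return (n + d - 1) / d; }

// Naive matmul sketch: out (M x P) = a (M x N) * b (N x P), all row-major.
// Each thread computes one output element.
__global__ void NaiveMatmulKernel(const float* a, const float* b, float* out,
                                  uint32_t M, uint32_t N, uint32_t P) {
  uint32_t col = blockIdx.x * blockDim.x + threadIdx.x;
  uint32_t row = blockIdx.y * blockDim.y + threadIdx.y;
  if (row >= M || col >= P) return;  // threads past the matrix edge do nothing

  float acc = 0.0f;
  for (uint32_t k = 0; k < N; ++k) {
    acc += a[row * N + k] * b[k * P + col];
  }
  out[row * P + col] = acc;
}

// block = threads per block (fixed); grid = number of blocks (scales with M, P).
void LaunchNaiveMatmul(const float* a, const float* b, float* out,
                       uint32_t M, uint32_t N, uint32_t P) {
  dim3 block(TILE, TILE, 1);
  dim3 grid(ceil_div(P, TILE), ceil_div(M, TILE), 1);
  NaiveMatmulKernel<<<grid, block>>>(a, b, out, M, N, P);
}
```

In this arrangement the block shape is a fixed multiple of the warp size and only the number of blocks grows with the matrix, so the launch does not hit the per-block thread limit for large M or P.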