-
Notifications
You must be signed in to change notification settings - Fork 904
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Some additional kernel thread index refactoring. #14107
Some additional kernel thread index refactoring. #14107
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the fixups!
assert(start_idx < num_states); | ||
|
||
curandState localState = state[start_idx]; | ||
|
||
for (size_type idx = start_idx; idx < build_tbl_size; idx += stride) { | ||
for (thread_index_type tidx = start_idx; tidx < build_tbl_size; tidx += stride) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we add a range-based for loop to replace this pattern?
for(auto const tidx: grid_1d::thread_indices()) {...}
So we won't need to have stride
like this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For now I'm going to leave that out of scope for this PR. I want to tighten the scope here and merge this since it's fairly old. Please feel free to comment on #10368 -- the topic of a range-based loop has already come up there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good!
…ad-index-refactors
I pushed a couple small fixes for benchmark builds (I hadn't run that locally yet). This should be ready to merge after CI passes. |
/merge |
Description
This PR refactors a few kernels to use
thread_index_type
and associated utilities. I started this before realizing how much scope was still left in issue #10368 ("Part 2 - Take another pass over more challenging kernels"), and then I stopped working on this due to time constraints. For the moment, I hope this PR makes a small dent in the number of remaining kernels to convert to usingthread_index_type
.Checklist