This would make code a lot easier to write, because when I have code like this:
    # data_pointer is my data, with size count
    # BLOCK_SIZE is some power-of-two working set size
    base = BLOCK_SIZE * triton.language.program_id(axis=0)
    mask = base + triton.language.arange(0, BLOCK_SIZE) < count
    data = triton.language.load(data_pointer, mask=mask)
    counts = triton.language.histogram(data, DATA_MAX_VALUE)  # oops
The issue here is that histogram looks at the entire `data` tensor, even though only the elements that `mask` marks as in bounds were loaded. It reads all `BLOCK_SIZE` elements, even if some of them are garbage.

This is actually quite inconvenient to fix. If I can fit a sentinel value into my data, I can fill the masked-out region with it, and histogram will then put all of those extra elements into that bucket. But finding such a value is difficult and often requires extra effort regardless, because I have to make my binned tensor larger than it needs to be to accommodate this "junk" bin and then figure out a way to get rid of it.

It would be much nicer if histogram took a mask directly, like load and store do, so that it only read the data I care about.
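To make the difference concrete, here is a rough NumPy emulation of the two approaches for a single block. It is only a sketch of the semantics, not Triton code: `load_block`, `histogram_with_sentinel`, and `histogram_with_mask` are hypothetical helper names, `np.bincount` stands in for the histogram, and the small `BLOCK_SIZE` and `DATA_MAX_VALUE` values are assumptions chosen for illustration.

```python
import numpy as np

BLOCK_SIZE = 8      # assumed power-of-two block size, for illustration only
DATA_MAX_VALUE = 4  # assumed number of real bins; valid values are 0..3


def load_block(buffer, base, count, other):
    """Emulate a masked block load: out-of-bounds lanes receive `other`."""
    offsets = base + np.arange(BLOCK_SIZE)
    mask = offsets < count
    safe_offsets = np.where(mask, offsets, 0)  # never index past the buffer
    return np.where(mask, buffer[safe_offsets], other), mask


def histogram_with_sentinel(buffer, base, count):
    """Workaround: send out-of-bounds lanes to an extra junk bin, then drop it."""
    sentinel = DATA_MAX_VALUE  # one past the largest valid value
    data, _ = load_block(buffer, base, count, other=sentinel)
    counts = np.bincount(data, minlength=DATA_MAX_VALUE + 1)  # one bin too many
    return counts[:DATA_MAX_VALUE]  # discard the junk bin


def histogram_with_mask(buffer, base, count):
    """Requested behaviour: only lanes where the mask is true are counted."""
    data, mask = load_block(buffer, base, count, other=0)
    return np.bincount(data[mask], minlength=DATA_MAX_VALUE)
```

Both functions return the same counts for the final, partially filled block, but the sentinel version has to allocate `DATA_MAX_VALUE + 1` bins and slice the extra one off afterwards, which is exactly the bookkeeping a mask argument would avoid.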