-
Notifications
You must be signed in to change notification settings - Fork 35
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Consistent OOB behaviour for wgpu #296
Conversation
// both wgpu and Dawn handle this by either returning dummy data or clamping the index | ||
// to valid bounds. This means it's harmless to use in a select. | ||
let out_item = out.item(); | ||
value = format!("select({out_item}(0), {value}, {ind} < {len})"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure this is going to work with vectorization.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just with the syntax of vec4(0)? Or more that indexing needs to take vectorization into account? I'm really not sure how vecorization works, if you could add a test for it that'd be amazing!
927c3fe
to
1e29f78
Compare
|
||
let length = match lhs.has_buffer_length() { | ||
true => cube::Metadata::BufferLength { var: lhs }, | ||
false => cube::Metadata::Length { var: lhs }, | ||
}; | ||
|
||
instructions.push(self.compile_metadata(length, Some(array_len))); | ||
instructions.push(wgsl::Instruction::CheckedIndex { | ||
len: self.compile_variable(array_len), | ||
lhs: self.compile_variable(lhs), | ||
rhs: self.compile_variable(rhs), | ||
out: self.compile_variable(out), | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it's necessary right now, but I think having this implemented with a cpa
kernel like CheckedIndexAssign
would be easier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm also not sure when we don't have BufferLength
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to cpa someday.
And yeah this behaviour is copied from the CPP version. Personally I feel like that kind of thing should fail to compile, rather than creating a language with strange edge cases of UB. I'm not sure what kind of situation means you don't have buffer length
Currently, CubeCL relies on wgpu to have a consistent behaviour for OOB indexing (read-0, write-discard). While this so far has been true, it's not true in Dawn on WebGPU. The WebGPU specification defines an OOB as a "dynamic error" that might result in anything including program termination. In practice it's not that severe - but some backends do have differing behaviour like clamping the index instead of discarding a write.
This PR changes Cube to use an indexing mode much like cubecl-cpp, and insert checks manually. For performance, I still do a read-oob-0 by using a
select()
which hopefully compiles to a conditional move. This isn't quite correct as the WebGPU spec allows any kind of behaviour when indexing OOB, but in practice it can't do much besides picking a random in bound index, so this is practically safe.On Vulkan + Spir-V we're relying on Vulkan robustness, this doens't change anything there.
In the future we could also disable these checks on Vulkan with robustness when using WGSL (or any other platforms where this behavious is guaranteed).
This might unblock #211, as mentioned here: tracel-ai/burn#2435 metal had new issues as the OOB behaviour in WGPU changes.
Very much not my area so please have Genna sign this off!