Consistent OOB behaviour for wgpu #296

ArthurBrussee · 2024-11-22T18:10:43Z

Currently, CubeCL relies on wgpu having consistent behaviour for OOB indexing (reads return 0, writes are discarded). While this has held so far, it does not hold in Dawn on WebGPU. The WebGPU specification defines an OOB access as a "dynamic error" that might result in anything, including program termination. In practice it's not that severe, but some backends do behave differently, for example by clamping the index instead of discarding a write.

This PR changes Cube to use an indexing mode much like cubecl-cpp and inserts the checks manually. For performance, reads still get read-OOB-0 semantics via a select(), which hopefully compiles to a conditional move. This isn't strictly correct, as the WebGPU spec allows any kind of behaviour when indexing OOB, but in practice an implementation can't do much besides picking some arbitrary in-bounds index, so this is practically safe.
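To make the intended semantics concrete, here is a minimal CPU-side sketch in plain Rust. It is not code from this PR: the function names are hypothetical, and clamping is only one of the behaviours a backend might choose for the underlying load. It models the guarded read (the load is still issued, but its result is replaced by zero when the index is out of bounds), the write-discard rule, and the clamping behaviour that some backends show instead.

```rust
/// Read-0 semantics, modelled after `select(0, buf[i], i < len)`.
/// Hypothetical helper, for illustration only.
fn checked_read(buf: &[f32], index: usize) -> f32 {
    let in_bounds = index < buf.len();
    // Model "the load is issued anyway at some in-bounds location" by clamping.
    let clamped = index.min(buf.len().saturating_sub(1));
    let loaded = buf.get(clamped).copied().unwrap_or(0.0);
    // The select: keep the loaded value only when the index was valid.
    if in_bounds { loaded } else { 0.0 }
}

/// Write-discard semantics: an out-of-bounds write is silently dropped.
fn checked_write(buf: &mut [f32], index: usize, value: f32) {
    if let Some(slot) = buf.get_mut(index) {
        *slot = value;
    }
}

/// The differing behaviour: the index is clamped, so an out-of-bounds
/// write lands on the last element instead of being discarded.
fn clamped_write(buf: &mut [f32], index: usize, value: f32) {
    if let Some(last) = buf.len().checked_sub(1) {
        buf[index.min(last)] = value;
    }
}
```

With a three-element buffer, `checked_write(&mut a, 7, 9.0)` leaves the buffer untouched, while `clamped_write(&mut a, 7, 9.0)` overwrites the last element. That divergence is what the manual checks are meant to remove.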

On Vulkan + SPIR-V we rely on Vulkan robustness, so this doesn't change anything there.

In the future we could also disable these checks on Vulkan with robustness when using WGSL (or on any other platform where this behaviour is guaranteed).

This might unblock #211. As mentioned in tracel-ai/burn#2435, Metal had new issues as the OOB behaviour in wgpu changed.

Very much not my area so please have Genna sign this off!

nathanielsimard · 2024-11-22T19:45:43Z

crates/cubecl-wgpu/src/compiler/wgsl/instructions.rs

+            // both wgpu and Dawn handle this by either returning dummy data or clamping the index
+            // to valid bounds. This means it's harmless to use in a select.
+            let out_item = out.item();
+            value = format!("select({out_item}(0), {value}, {ind} < {len})");


Not sure this is going to work with vectorization.

Just with the syntax of vec4(0)? Or is it more that indexing needs to take vectorization into account? I'm really not sure how vectorization works; if you could add a test for it, that'd be amazing!
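For reference, the syntax side of the vectorization question can be sketched as follows. This is a standalone illustration, not the CubeCL compiler: `Item`, `wgsl_type`, and `checked_read_expr` are hypothetical stand-ins for whatever `out.item()` formats to in the diff above. It assumes, as far as I understand WGSL, that `vecN<T>(0)` is a valid splat constructor and that `select` accepts vector operands with a scalar `bool` condition. Whether the index and length also need scaling by the vectorization factor is a separate question that this sketch does not answer.

```rust
/// Hypothetical item description: element type name plus vectorization factor.
struct Item {
    elem: &'static str, // e.g. "f32", "u32"
    lanes: u8,          // 1 for scalars, 2/3/4 for vectors
}

impl Item {
    /// WGSL type name, e.g. `f32` or `vec4<f32>`.
    fn wgsl_type(&self) -> String {
        match self.lanes {
            1 => self.elem.to_string(),
            n => format!("vec{n}<{}>", self.elem),
        }
    }
}

/// Builds the guarded read expression: zero when `ind >= len`, `value` otherwise.
fn checked_read_expr(item: &Item, value: &str, ind: &str, len: &str) -> String {
    let ty = item.wgsl_type();
    format!("select({ty}(0), {value}, {ind} < {len})")
}
```

For a scalar `f32` item this produces `select(f32(0), buf[i], i < n)`, and for a four-lane item `select(vec4<f32>(0), buf[i], i < n)`.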

nathanielsimard · 2024-11-23T16:52:40Z

crates/cubecl-wgpu/src/compiler/wgsl/compiler.rs

+
+                    let length = match lhs.has_buffer_length() {
+                        true => cube::Metadata::BufferLength { var: lhs },
+                        false => cube::Metadata::Length { var: lhs },
+                    };
+
+                    instructions.push(self.compile_metadata(length, Some(array_len)));
+                    instructions.push(wgsl::Instruction::CheckedIndex {
+                        len: self.compile_variable(array_len),
+                        lhs: self.compile_variable(lhs),
+                        rhs: self.compile_variable(rhs),
+                        out: self.compile_variable(out),
+                    });


I don't think it's necessary right now, but I think having this implemented with a cpa kernel like CheckedIndexAssign would be easier.

I'm also not sure when we don't have BufferLength

+1 to cpa someday.

And yeah, this behaviour is copied from the CPP version. Personally I feel that kind of thing should fail to compile rather than create a language with strange edge cases of UB. I'm not sure in what kind of situation you wouldn't have a buffer length.
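The lowering discussed in this thread (emit a length lookup first, then a checked index that consumes it) can be sketched in isolation. The types below are simplified, hypothetical stand-ins rather than the real `cube::Metadata` and `wgsl::Instruction` definitions, and the `{lhs}_len` temporary name is invented for the example.

```rust
/// Hypothetical, simplified IR; not the real CubeCL types.
#[derive(Debug, PartialEq)]
enum Instruction {
    /// Store the length of `var` into `out`.
    Length { var: String, out: String },
    /// Same, but using the length of the underlying buffer.
    BufferLength { var: String, out: String },
    /// `out = lhs[rhs]`, guarded by `rhs < len`.
    CheckedIndex { len: String, lhs: String, rhs: String, out: String },
}

/// Lowers one index operation into a length lookup followed by a checked index.
fn lower_checked_index(
    lhs: &str,
    rhs: &str,
    out: &str,
    has_buffer_length: bool,
) -> Vec<Instruction> {
    // Invented temporary that holds the length for the bounds check.
    let len = format!("{lhs}_len");
    let length = if has_buffer_length {
        Instruction::BufferLength { var: lhs.into(), out: len.clone() }
    } else {
        Instruction::Length { var: lhs.into(), out: len.clone() }
    };
    let index = Instruction::CheckedIndex {
        len,
        lhs: lhs.into(),
        rhs: rhs.into(),
        out: out.into(),
    };
    vec![length, index]
}
```

The ordering matters: the length instruction has to come before the checked index because the latter reads the temporary it defines.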

crates/cubecl-wgpu/tests/unary_bench.wgsl

nathanielsimard reviewed Nov 22, 2024

ArthurBrussee added 4 commits November 22, 2024 23:00

Safe indexing in wgsl

742430f

Fix indexing op

072474b

Spirv shader, update comment

b26b183

Merge fix

1e29f78

ArthurBrussee force-pushed the safe-ind branch from 927c3fe to 1e29f78 on November 22, 2024 23:00

ArthurBrussee added 3 commits November 22, 2024 23:06

Fix atomics, fix spirv

eb327a7

Update references

7b80c7a

Add tests

b022db7

nathanielsimard reviewed Nov 23, 2024

nathanielsimard approved these changes Nov 25, 2024

nathanielsimard merged commit f0154bf into tracel-ai:main Nov 25, 2024
5 checks passed

ArthurBrussee deleted the safe-ind branch November 25, 2024 19:42

ArthurBrussee commented Nov 22, 2024

nathanielsimard Nov 22, 2024

ArthurBrussee Nov 24, 2024

nathanielsimard Nov 23, 2024

nathanielsimard Nov 23, 2024

ArthurBrussee Nov 24, 2024
