Describe the bug
Trying to compute the adjoint (A') of a Diagonal backed by a CuArray reverts to scalar indexing.
To reproduce
The Minimal Working Example (MWE) for this bug:
using CUDA, LinearAlgebra
x = CuArray(rand(ComplexF32, 5))
A = Diagonal(x)
A'  # Errors
This will work in a REPL, but not when running in VSCode (for some reason). The problem becomes more obvious when you try to use the result in a computation, such as a matmul with a dense array (this might also mean that diagonal multiplication has the issue, but I'm having difficulty determining that).
y = CuArray(rand(ComplexF32, 5, 5))
y * A'  # Errors
The adjoint in other contexts seems to work fine:
x * x'  # Works fine
y * y'  # Works fine
And multiplication with non-adjointed diagonals works fine as well.
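For instance, something along these lines (reusing y and A from above):
y * A  # Works fine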
Expected behavior
For Adjoints of diagonal arrays to match CPU behavior.

This is the well-known issue of multiple array wrappers 'breaking' method dispatch, so I think this can be closed in favor of JuliaGPU/Adapt.jl#21.
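To make that concrete, here is a rough sketch of the wrapper nesting involved (the exact type parameters depend on the CUDA.jl version):

using CUDA, LinearAlgebra
x = CuArray(rand(ComplexF32, 5))
A = Diagonal(x)
# A' is neither a CuArray nor a singly-wrapped CuArray: it is an Adjoint
# wrapping a Diagonal wrapping a CuArray, roughly
#   Adjoint{ComplexF32, Diagonal{ComplexF32, <:CuArray}}
# GPU-specific methods that only look one wrapper deep no longer match,
# so the generic AbstractArray fallback (element-by-element indexing) kicks in.
typeof(A')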
And FWIW:
This will work in a REPL, but not when running in VSCode (for some reason).
It doesn't 'work' in the REPL; it's just that we allow scalar iteration (with a warning) in an interactive session. It looks like VSCode isn't recognized as such, so scalar iteration generates a hard error there.
Since it has proven tricky to solve this from the Julia side (see the linked issue, which points to a couple of Base PRs that have stalled), I'm considering switching to unified memory by default so that scalar fallbacks at least perform somewhat decently, but they still wouldn't be executing on the GPU.
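In the meantime, if the hard error outside the REPL is the immediate blocker, scalar iteration can be opted into explicitly for the offending expression; a minimal sketch, reusing the arrays from the MWE above:

using CUDA, LinearAlgebra
x = CuArray(rand(ComplexF32, 5))
A = Diagonal(x)
y = CuArray(rand(ComplexF32, 5, 5))
# Explicitly allow (slow, CPU-side) scalar iteration for this expression only;
# this trades the hard error for the same element-by-element fallback you
# would otherwise get in an interactive session.
z = CUDA.@allowscalar y * A'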