Add GPU support to MRIReco.jl #182
Conversation
And here is the first GPU reconstruction done with the changes in commit 8ecdc66. The left column shows the phantom, the middle column the CPU result, and the right column the GPU result.
Do you know a way to test the GPU reconstruction during CI? I think @cncastillo is also trying to test KomaMRI with a different CI setup for the GPU part.
There are JLArrays, which are a reference implementation of GPU arrays that runs on the CPU. If the code remains GPU-agnostic, we can have one test suite running on "normal" arrays and then again on JLArrays. This is the approach we are currently aiming for in RegularizedLeastSquares. One issue with that approach is that we need to make all the operators work with JLArrays. In particular, we just call plan_fft and plan_nfft with a given array type, and I don't think either works for JLArrays. But I can implement a workaround for that in LinearOperatorCollection. I had to do a similar workaround for the WaveletOp, since Wavelets don't work on the GPU.

I also saw this in the Julia #gpu Slack: https://github.com/JuliaGPU/buildkite. As far as I understood it, registered Julia packages that belong to an organization can apply to run on their CI.
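For illustration, a minimal sketch of the dual test-suite idea; `test_reconstruction` is a hypothetical helper standing in for the actual GPU-agnostic tests:

```julia
using Test
using JLArrays  # CPU-backed reference implementation of the GPUArrays interface

# Run the same GPU-agnostic tests once with plain Arrays and once with JLArrays.
for arrayType in (Array, JLArray)
    @testset "Reconstruction with $arrayType" begin
        test_reconstruction(arrayType)  # hypothetical helper taking the array type
    end
end
```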
Hi! Indeed, to use JuliaGPU BuildKite CI, your package should be a part of a Julia organization. We recently moved KomaMRI to JuliaHealth for this, and besides some broken links and a few minor issues (JuliaHealth/KomaMRI.jl#335), it was straightforward.
We are planning on testing Koma on multiple GPUs anyway, so we will do some tests like the ones you require (here: https://github.com/JuliaHealth/KomaMRI.jl/blob/master/test/runtests.jl). As we use MRIReco, we are happy to do the tests for you, but we haven't set up the Buildkite CI config yet 😄. In the meantime, you can create a PR in Koma pointing to the correct version of MRIReco (this PR).

Having said that, Koma's current integration with MRIReco uses RawAcquisitionData as an intermediate. The MRD format inspired this custom type, so the data is forced to a specific type (which makes sense for writing to MRD). It would be nice if the GPUArrays could flow freely from the simulations to the recon; I think the main culprits are:

```julia
mutable struct Profile
    head::AcquisitionHeader
    traj::Array{Float32,2}          # <- allow AbstractArray?, but then force to F32 when writing
    data::Array{Complex{Float32},2} # <- allow AbstractArray?, but then force to CF32 when writing
end
```

Something similar for … Cheers!
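A rough sketch of what the suggested relaxation could look like, assuming the fields are parametrized on AbstractArray and only converted to concrete Float32 types when writing to MRD (type parameters and bounds here are assumptions, not the actual MRIBase code):

```julia
# Hypothetical parametrized variant of Profile: any AbstractArray (e.g. a CuArray)
# can flow through, and conversion to Array{Float32,2} / Array{ComplexF32,2}
# would only happen when writing to an MRD file.
mutable struct Profile{T<:AbstractMatrix{Float32}, D<:AbstractMatrix{ComplexF32}}
    head::AcquisitionHeader
    traj::T
    data::D
end
```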
Getting access to BuildKite via KomaMRI.jl sounds like a great idea! I'll look into that once everything is set up here. I'll still probably set up the JLArray test case for the local tests. I saw that LinearOperators.jl had a CI configuration where a maintainer could trigger breakage tests with upstream packages. Something like that would be a good fit for your idea @cncastillo. Then we can test it for the commits where it really matters.

I've also thought about extending my changes to the base, data and file handling packages, but for this PR I'll stick close to a refactoring of the Reco and operator packages, with small changes in the other ones if necessary. For now I just want to get all the MRIReco tests and the MRIReco benchmark running on a GPU. Once we have that running, we have a good idea of what structures and information we need to run things on the GPU, and then we can look into more PRs that potentially do GPU work or just add more generic array compatibility in the simulation and "base" packages.
Closes #168
This PR is now ready to be merged. It adds GPU support to MRIReco.jl and MRIOperators.jl. It also required smaller fixes/dependency updates to MRIBase.jl and MRISimulations.jl.

To run a reconstruction on the GPU, a user needs to supply the arrayType parameter, which needs to fulfil the following interface:

```julia
arr = zeros(ComplexF32, 32, 32)
# 1. Convert an existing array
arr_gpu = foo(arr)
# 2. Create a vector with a given eltype and undef initialization
v = foo{ComplexF32}(undef, 0)
```

"Normal" arrays also fulfil this interface. We need the first function to move things to the GPU, and the second one to make sure that our operators have the correct storage type.

When we execute a reconstruction on the GPU, we need to slightly change the reconstruction and its parameters. While at least for CUDA it seems like each task gets its own CUDA stream/context, we generally want to disable the multi-threaded reco loops. This is possible via `executor(foo)`. For AbstractGPUArrays, this function returns a sequential execution. For "custom" callable objects this would need to be implemented. Since we execute the loop sequentially, we don't need to …

One last change based on GPU arrays concerns the parameters for the FFT. Generally, GPU plan_fft calls don't accept the same parameters (or any) as the CPU version. This is again handled by a function …

I've also added GPU support to NFFT, which is implemented as a weak package extension. That should mean that AMD cards can also use the NFFT operator. I currently don't have access to one to test this, and I can't use Buildkite to fully test this until this PR is merged. Once this is merged, we can comment out the Buildkite steps I prepared.

As @cncastillo mentioned, the next steps could be to include GPU support into the acquisition data and generally the other packages. One could then derive the arrayType from the data itself.
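For example, a hedged sketch of what the user-facing call could look like with the arrayType parameter described above (parameter name taken from the PR description; the rest of the reconstruction setup is illustrative):

```julia
using MRIReco, CUDA

# acqData is assumed to be an AcquisitionData object loaded elsewhere.
params = Dict{Symbol, Any}()
params[:reco]      = "direct"
params[:reconSize] = (32, 32)
params[:arrayType] = CuArray   # constructor fulfilling the interface above; use Array on the CPU

img = reconstruction(acqData, params)
```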
This is great news :). The current way of choosing the backend is a little weird, is this a limitation of LinearOperators.jl?

In case it is useful, @rkierulf and I worked on some functions:

```julia
using KomaMRICore
using CUDA                   # Loads KomaMRICoreCUDAExt and sets the default backend to CUDA
KomaMRICore.set_device!(1)   # Not needed if you want to run it on the first (maybe only) CUDA device
x = zeros(10) |> f32 |> gpu  # CuArray{Float32}; `f64` is also available, but not all backends support it
y = myfun(x) |> cpu          # Array{Float32}
```

All the logic for these is very lightweight, and they do not have anything necessarily "Koma"-related, so they could be moved to MRIBase or a new package. The only thing required to convert a new custom type would be to tag it with …

Also, instead of using …
Hmm, choosing the device and backend like that would be cool. And when we do that for the acq. data structs, we could properly use Adapt. On the reco level we kind of only have the dictionary. Internally, we would still need to derive this callable function/arrayType in some fashion for the operators to work. And we need the first variant to move our arrays to the GPU. However, I just remembered I haven't done a pass over my code with Adapt.jl, so I'll do that before it is merge-ready. We can't use a concrete datatype with adapt, the function wants …
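As a rough sketch of the Adapt.jl pattern being discussed (the struct and field names are made up here, not the PR's actual operators):

```julia
using Adapt

# A toy operator whose storage is parametrized, so adapt can swap the backend.
struct DiagonalOp{V<:AbstractVector}
    weights::V
end
Adapt.@adapt_structure DiagonalOp   # defines adapt for DiagonalOp field by field

op = DiagonalOp(rand(Float32, 16))
# With CUDA loaded, this would move the weights to the GPU:
# op_gpu = adapt(CuArray, op)
```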
Ah, we can't use a view, because for CPU operations we actually want to get copies of the operators, since they are stateful and not thread-safe.
At least when using KernelAbstractions.jl, there are some functions like …
We can derive that from a dense GPU array of the acquisition data using Base, so that's covered once we have that. MRIReco can also add KernelAbstractions as a dependency, or your backend-selection solution. In MPIReco.jl I moved to KernelAbstractions.jl because I needed to use shared memory, which I couldn't get to work with GPUArrays.jl. We could also port the kernels here, I just didn't have the time / it wasn't necessary so far.
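For illustration, a small sketch of deriving the compute backend from a dense array with KernelAbstractions.jl (a plain Array stands in for the acquisition data here):

```julia
using KernelAbstractions

# With CUDA loaded, a CuArray would return CUDABackend() instead of CPU().
kdata = rand(ComplexF32, 64, 64)
backend = KernelAbstractions.get_backend(kdata)
@show backend   # CPU()
```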
This PR adds GPU support to MRIReco.jl and the relevant subpackages. It is based on the following PRs: LinearOperators.jl#321, RegularizedLeastSquares.jl#81 and LinearOperatorCollection.jl#9.
As with the other PRs, I will try to keep these changes GPU-agnostic and document my high-level idea and relevant changes in the PR description.
To work on the GPU, we need all LinearOperators to have a GPU storage type. I've started adding the relevant `S` kwarg to the operators in the collection package and in MRIOperators.jl. Additionally, we need our solver input and dense arrays to also live on the GPU if possible. To allow for this, I have added a new :arrayType (name might change) parameter to the parameter Dict, from which one can then construct correct arrays and vectors.

I'll first try to keep all changes close to the Reco and operator packages. That in turn means that we need to convert the acquisition data every time during reconstruction. An alternative could be to specify the backing array type during data loading and derive it from there.
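As a hedged sketch of the second idea (not the PR's actual code), an :arrayType entry in the parameter Dict would let the reconstruction construct storage matching the requested backend:

```julia
# Illustrative only: how an :arrayType entry could be used inside the reco code.
params = Dict{Symbol, Any}(:arrayType => Array)   # or CuArray once CUDA is loaded

arrayType = get(params, :arrayType, Array)
kdata = arrayType(zeros(ComplexF32, 128))         # convert/move existing data
tmp   = arrayType{ComplexF32}(undef, 128)         # allocate with the requested storage type
```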