Support for gpu_spreadinterponly=1 #564
Conversation
Dear Chaithya,
Thanks for the tests! This PR all seems very reasonable, and is a very light change to the API. I'm so glad that you are able to use our option to switch off FFTs. I see you also skipped the kernel fseries calc, a good idea. It's great you added the error code to the docs, but my main request is to add something about upsampfac=1.0 in the GPU docs, which is a new use case to us! :)
What I still don't understand is how you use our spreadinterp, specifically, what if we were to change our kernel coefficients (very possible, since we already rescaled them in going from v2.2 to v2.3) - how will the density compensation user know? Won't it break your math? Do you access our kernel eval yourself (by bypassing our API) ?
By the way, I really like your new docs at MRI-nufft, with the formulae, NUFFT usage, etc. Very useful! (I have only skimmed so far).
Assuming others are happy, we can bring this in in a matter of days. I'd be willing to consider a CPU version of the above too, if it helps you and MRI people.
Best, Alex
I should add two main questions on the topic of your tests:
Thanks, Alex
I'll detail the perftest based on old and new cufinufft soon; I need to dig up some results. gpuNUFFT was faster than cufinufft in the past for 3D multi-coil, from what I remember. The tests were on an A100 GPU, I think; I need to check the exact details and update here.
Thank you for your interest. While it is a collaborative effort, I must say the majority of this work, formulation, and documentation was by @paquiteau (also the original setup for the benchmark).
Another note to clean up the PR: currently opts.gpu_spreadinterponly is listed in the docs as "diagnostic", only to be used as such, and said to produce "garbage" (!). This PR should move it to be a "Data handling option" and mention the MRI use case with upsampfac=1.0. PS: Thanks @paquiteau for the tests and great new docs here:
I will review next week when I have time. I read the conversation; I think having this option in finufft is a good idea. In that case, we should spend some time thinking about how to reduce code duplication, in a similar fashion as cufinufft.
Hello there! Thanks @ahbarnett for the positive feedback on our docs; it really is a team effort. Feel free to open any issue on our side if something is unclear or missing (for instance, the explanation of the how and why of density compensation is still missing).
That's the best part: it won't break anything; it fixes everything :) Density compensation reweights the non-uniform points with respect to how they are processed together in the NUFFT. Regarding the benchmark: hopefully, you can run it on your hardware as well: https://github.com/mind-inria/mri-nufft-benchmark/ It is designed to stress-test the NUFFT in challenging MRI applications (notably high spatial resolution in 3D). We did not configure many of the parameters either (e.g., …).
cufinufft has always been faster in 2D; for 3D, it was more subtle: the multi-coil computations of gpuNUFFT (which uses async copies and computes) were putting it ahead, IIRC. Nevertheless, gpuNUFFT remains more memory efficient, as we don't have to create two separate plans for types 1 and 2 like in cufinufft.
I didn't have to add this change in this PR, as it seemed like someone had done it already: finufft/src/cuda/memtransfer_wrapper.cu, lines 347 to 364 in b98fd1d.
It seems like we don't allocate fw, fkernel, etc. if we have gpu_spreadinterponly set.
This PR breaks type 3; before I review, could you fix it? Does this behaviour conflict with your use case?
Have you installed finufft from pip? Did you try building from source? -march=native and GPU native optimizations might change the performance dramatically.
@DiamonDinoia I think I am good with this PR; I'll let you review it now.
Looks good. Some minor changes and cleanup.
Is there something left on my side on this PR? Just want to be sure that you are not waiting on me for anything.
Great. Very clean in the end. Thx for adding the docs (I may tweak them). Best, Alex
This is still a mystery to me: precisely, how does your MRI application know what kernel we use for spreadinterp, without accessing our spreading function another way (I could guess it does a single-NU-point call to extract the kernel first?). I'm about to bring in the PR...
Since the logic is tricky here (XNOR; t3, USF=1, etc.) and t3 broke and was fixed, I want to leave the individual commits.
Thank you for the merge.
Well, the density is estimated by a spread (S) followed by an interp (I) call on all ones: I(S(1)). This way, we get an approximate estimate of how the spreading/interpolation kernels scale/weight each frequency element for a given sampling pattern. This can later be used iteratively to calculate how to "compensate" for it. Here is the paper describing it in detail: https://pubmed.ncbi.nlm.nih.gov/21688320/ (Eq. 2). PS: Do you have an approximate idea of when we can get this PR into a release? If you anticipate it taking longer, would a minor release be possible so that we can update…
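The I(S(1)) estimate and the subsequent fixed-point iteration can be sketched in pure NumPy with a toy 1D Gaussian spread/interp pair. All function names and the kernel below are illustrative stand-ins, not cufinufft's actual spreader or API; this is only meant to show the shape of the computation:

```python
# Toy sketch of Pipe--Menon style density compensation: iterate
# w <- w / I(S(w)) until the spread-then-interp of the weights is ~1.
# The Gaussian "kernel" here is a hypothetical stand-in for the
# library's spreading kernel.
import numpy as np

def spread(weights, x, n_grid, sigma=0.75):
    """Toy spread (S): accumulate point weights onto a uniform grid."""
    grid = np.zeros(n_grid)
    centers = np.arange(n_grid)
    for xi, wi in zip(x, weights):
        grid += wi * np.exp(-0.5 * ((centers - xi * n_grid) / sigma) ** 2)
    return grid

def interp(grid, x, sigma=0.75):
    """Toy interp (I): sample the grid back at the non-uniform points."""
    centers = np.arange(len(grid))
    return np.array([
        np.sum(grid * np.exp(-0.5 * ((centers - xi * len(grid)) / sigma) ** 2))
        for xi in x
    ])

def pipe_menon_weights(x, n_grid, n_iter=20):
    """Fixed-point iteration w <- w / I(S(w)), started from all ones."""
    w = np.ones(len(x))
    for _ in range(n_iter):
        w /= interp(spread(w, x, n_grid), x)
    return w

# Non-uniform samples clustered near the origin (variable density).
x = np.sort(np.random.default_rng(0).random(64) ** 2)
w = pipe_menon_weights(x, n_grid=128)
# At convergence I(S(w)) ~ 1, so densely sampled regions get small weights.
res = np.abs(interp(spread(w, x, 128), x) - 1.0).max()
print(f"max |I(S(w)) - 1| after 20 iterations: {res:.3f}")
```

In a real pipeline, the spread/interp pair would be the library's own (which is why the compensation automatically tracks whatever kernel the library uses), and the weights would then multiply the data before the adjoint NUFFT.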
@ahbarnett do you plan a short release with this feature anytime soon? It could really help us to have, say, v2.3.1 with this feature, as it would streamline mri-nufft's use of it.
@chaithyagr Sorry for the delay on this, but I've cherry-picked your changes onto our v2.3.X branch (see here) in preparation for our 2.3.1 release. (The reason is that the main branch currently contains a lot of other changes, such as type 3 and a major reorganization of the code, that we're not quite ready to release yet). Since there are no unit tests for this new feature, would you mind running your tests on this branch to make sure I've transferred all your changes properly? Even better, if you could supply a unit test of some sort, that would help us out greatly in the future to make sure things remain stable.
Hey @janden, thank you for this. We will get back soon with updates.
@janden it seems to work well. Please do let us know when you make a release. I will make a PR in the future testing it.
Great, thank you! May I ask how you test it and if it is possible to write a unit test for it? I am worried about future changes breaking the code. |
I am in the process of writing the same feature in finufft and possibly adding some tests to it. Sadly, the tests are not very rigorous at the moment and need to be run from mri-nufft.
Hi @janden, thanks for drafting the 2.3.X branch. For Chaithyagr's commits to make sense, we'll also need to choose the following doc commits I made: I am a bit confused about the route, then, for what master is updated to in terms of version number. It can't be 2.3.anything, since that's a different track from 2.4.0. So is it 2.4.0beta?
Dear Chaithyagr, sorry, I had to also cherry-pick the docs to go with your changes. This also changed cufinufft/impl.h (just spacing/comment changes, but you should re-pull and check). We would only run a unit test for this option if it didn't have an mri-nufft dependency; the best would be a simple Python file to check that the pure spreading works as you intend it to. If you're OK with the above, I will make a tag for it, and we'll make a PyPI release 2.3.1. Thanks! Alex
Dear @janden and @chaithyagr - Between us, Joakim & I made …
Hi there,
Thank you again for the amazing work. We re-ran our MRI benchmark for different NUFFT suites, and we are pleased to announce that cufinufft is the fastest NUFFT in the West, and by a wide margin! Below are our benchmark results:
2D:
3D:
However, a major reason not to use it (as discussed in the past in #295) was the lack of density compensation estimation, which needed only the spreader and interpolator.
We understand that it takes a lot of effort to maintain API endpoints for the spreader and interpolator across Julia, Python, and MATLAB, so my earlier PR was not taken up.
However, we recently saw an internal option in cufinufft opts, gpu_spreadinterponly, and felt we could exploit it to allow density compensation estimators on our side. This pull request implements just that (without exposing internal interfaces, unlike #308).
This could help move the MR community forward in using cufinufft. We are open to discussing the implementation in detail and how we could further contribute to (cu)finufft.
To understand the advantage of using density compensation, please see our example (where we plan to add cufinufft eventually): https://mind-inria.github.io/mri-nufft/generated/autoexamples/GPU/example_density.html
Feel free to see our other examples and notice how we use density compensation extensively.
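For readers new to the topic, the density-compensation scheme discussed in this thread can be summarized as follows (a sketch in standard Pipe–Menon notation; the symbols are mine and not taken verbatim from the thread beyond S and I). With $\mathcal{S}$ the spreader and $\mathcal{I}$ the interpolator, the weights are the fixed point of

$$w_j^{(n+1)} = \frac{w_j^{(n)}}{\big[\mathcal{I}\big(\mathcal{S}(w^{(n)})\big)\big]_j}, \qquad w^{(0)} = \mathbf{1},$$

so that at convergence $\mathcal{I}(\mathcal{S}(w)) \approx \mathbf{1}$. The compensated adjoint reconstruction is then $\hat{x} = F^{\mathsf{H}} \operatorname{diag}(w)\, y$, where $F$ is the (type-2) NUFFT and $y$ the measured samples. Only $\mathcal{S}$ and $\mathcal{I}$ are needed, which is exactly what gpu_spreadinterponly exposes.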