ENH: Implement Gaussian Process #188

jhlegarreta · 2024-05-05T15:01:44Z

Implement Gaussian Process.

yibeichan · 2024-05-17T19:01:49Z

I made a draft PR on @jhlegarreta's fork here: jhlegarreta#1
I updated calculate_angle_matrix in src/eddymotion/model/utils.py
but I realize that my PR has some conflicts that need to be resolved due to the latest updates in nipreps/eddymotion. It's probably too much work to resolve such large conflicts just for one function.
I explained to Jon why I updated the function, so Jon can decide whether to adopt the changes (by copying them from the link above).

Regarding compute_multi_shell_covariance_matrix, I tried to work on it but had some problems:

the input parameters in the current version of the function are not complete; for example, bvals and bvecs need to be defined
also, the current calculate_angle_matrix only works for single-shell data, so we need to update this function first and then call it in compute_multi_shell_covariance_matrix
also, we should be consistent on whether to pass k as an input to those covariance matrix functions or to initialize k within the function
some parameter names are not consistent. For example, theta should be angle_mat, but we use different names across functions.
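
For reference, the angle computation discussed in this list could look roughly like the following. This is a minimal sketch, not the code in the PR: the name `pairwise_angles`, the `closest_polarity` flag, and the choice to treat a direction and its opposite as equivalent are assumptions made for illustration. Accepting two sets of directions is one possible way to later compare gradients across shells.

```python
import numpy as np


def pairwise_angles(bvecs_a, bvecs_b=None, closest_polarity=True):
    """Return the (N, M) matrix of angles, in radians, between two sets of b-vecs.

    Hypothetical helper: inputs are (N, 3) and (M, 3) arrays of non-zero
    gradient directions. If ``bvecs_b`` is None, angles are computed within
    ``bvecs_a``.
    """
    a = np.asarray(bvecs_a, dtype=float)
    b = a if bvecs_b is None else np.asarray(bvecs_b, dtype=float)

    # Normalize so that the dot product is the cosine of the angle.
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    cosines = np.clip(a @ b.T, -1.0, 1.0)

    # Assumption: v and -v encode the same diffusion direction.
    if closest_polarity:
        cosines = np.abs(cosines)
    return np.arccos(cosines)


bvecs = np.array([[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [-1.0, 0.0, 0.0]])
angle_mat = pairwise_angles(bvecs)
```

Using a single name such as `angle_mat` for the result everywhere would address the naming inconsistency mentioned in the last item.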

Thank you, @jhlegarreta, for explaining everything to me and getting me started on this work (which is different from what I've done before). I'm happy to contribute more in the future if needed.

jhlegarreta · 2024-05-27T01:06:07Z

To be rebased on main once PRs #198, #199 and #200 are merged (and probably others: parts of the features in this PR, e.g. compute_exponential_function, compute_spherical_function, compute_squared_exponential_function, are likely to be split out for easier, independent testing and development).

oesteban

I finally could have a look into this. I think we have generated some overlap with other efforts, so a first step would be to remove duplicate work (IMHO).

The only major issue I can see is that it looks like the model confuses responsibilities with the estimator at points.

Let's set the following goal for this PR:

Be able to produce one prediction (i.e., one missing orientation) for one voxel (having all the other b-vecs/vals). This should not bother about:

hyperparameter update
kernel
motion and EC parameters update
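
To make the scope of that goal concrete, here is a minimal sketch of a single-voxel, single-orientation prediction with fixed hyperparameters. Everything in it is an assumption for illustration: the function names, the constant mean, the noise variance, and the squared exponential covariance over angles, which is only a placeholder and not necessarily the kernel this PR will end up using.

```python
import numpy as np


def angles_between(a, b):
    """Angles (radians) between unit vectors in a (N, 3) and b (M, 3), ignoring sign."""
    return np.arccos(np.clip(np.abs(a @ b.T), 0.0, 1.0))


def squared_exponential(theta, length_scale=1.0):
    """Placeholder covariance as a function of the angle between directions."""
    return np.exp(-0.5 * (theta / length_scale) ** 2)


def predict_left_out(signal, bvecs, index, noise_var=1e-2):
    """Predict ``signal[index]`` for one voxel from all other orientations."""
    train = np.delete(np.arange(len(signal)), index)
    y_train = signal[train]
    offset = y_train.mean()  # simple constant mean function

    k_train = squared_exponential(angles_between(bvecs[train], bvecs[train]))
    k_train += noise_var * np.eye(len(train))
    k_query = squared_exponential(angles_between(bvecs[[index]], bvecs[train]))

    # GP posterior mean at the query orientation, hyperparameters held fixed.
    weights = np.linalg.solve(k_train, y_train - offset)
    return float(offset + (k_query @ weights)[0])


rng = np.random.default_rng(0)
bvecs = rng.normal(size=(12, 3))
bvecs /= np.linalg.norm(bvecs, axis=1, keepdims=True)
signal = np.full(12, 3.0)  # a flat signal should be predicted back unchanged
prediction = predict_left_out(signal, bvecs, index=4)
```

Nothing here updates hyperparameters or motion/EC parameters, and the kernel is a swappable function, which matches the three exclusions listed above.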

src/eddymotion/model/base.py

src/eddymotion/model/utils.py

test/test_model.py

jhlegarreta · 2024-06-08T22:25:15Z

Re #188 (review) agreed.

Have done a local copy of the old file to add a documentation file as time permits. For now,

This reduces the whole thing to making the prediction.
Have removed all duplicates that this had after having split the contents across multiple PRs (some already merged, others being reviewed, etc.)

Will improve things in subsequent commits (or separate PRs): actually use the DWI data, use bvec indexing, etc.

src/eddymotion/model/base.py

jhlegarreta · 2024-06-15T20:10:24Z

Have gone through the comments but I am hesitant about them:

Now I do see that we should not provide the diffusion signal to the __init__ method. At least not at this stage of the development.
Re #188 (comment) #188 (comment): I think here we are back to the discussion about our data: reconstructing the 3D volume in here means that the fit/predict methods get the entire data, which is unfeasible given the size. As I see things, reconstructing the 3D volume should be done in the caller. Best would be to discuss this over a call.

oesteban · 2024-06-16T16:03:26Z

Now I do see that we should not provide the diffusion signal to the __init__ method. At least not at this stage of the development.

👍

Re #188 (comment) #188 (comment) I think here we are back to the discussion about our data: reconstructing the 3D volume in here means that the fit/predict method get the entire data, which is unfeasible given the size. As I see things, reconstructing the 3D volume should be done in the caller. Best would be to discuss this over a call.

Happy to have a call, but IMHO, this is an 'eddymotion' model -- meaning, we define the interface. All other models generate volumes as the output of their predict(), so this needs to be another one. Perhaps what's missing here is an 'intermediate' model (which could eventually go into DIPY, cc/ @arokem) where the output would be consistent with the standard GP output (i.e., a mean and std functions).
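
A rough sketch of how the two layers could relate, assuming fixed hyperparameters shared by all voxels: the "intermediate" part computes the standard GP mean and standard deviation, and the outer part scatters them into volumes as `predict()` is expected to return. The function name, its arguments, and the constant per-voxel mean are hypothetical; the covariance matrices are taken as inputs so that no particular kernel is implied.

```python
import numpy as np


def predict_volume(data, mask, k_train, k_query, k_self, noise_var=1e-2):
    """Return (mean, std) 3D volumes for one query orientation.

    data: (X, Y, Z, N) training DWI signal; mask: (X, Y, Z) boolean.
    k_train: (N, N) covariance among training orientations.
    k_query: (N,) covariance between the query and training orientations.
    k_self: prior variance at the query orientation.
    """
    voxels = data[mask]  # (n_voxels, N)
    offsets = voxels.mean(axis=1)  # per-voxel constant mean
    k_noisy = k_train + noise_var * np.eye(k_train.shape[0])

    # Standard GP output: the weights depend only on the gradient scheme,
    # so one small (N, N) solve serves every voxel.
    alpha = np.linalg.solve(k_noisy, k_query)  # (N,)
    mean = offsets + (voxels - offsets[:, None]) @ alpha
    std = np.sqrt(max(k_self - k_query @ alpha, 0.0))

    # Scatter the per-voxel predictions back into 3D volumes.
    mean_vol = np.zeros(mask.shape)
    std_vol = np.zeros(mask.shape)
    mean_vol[mask] = mean
    std_vol[mask] = std
    return mean_vol, std_vol


data = np.full((2, 2, 1, 4), 5.0)
mask = np.ones((2, 2, 1), dtype=bool)
mask[0, 0, 0] = False
mean_vol, std_vol = predict_volume(data, mask, np.eye(4), np.full(4, 0.1), 1.0)
```

Under these assumptions the linear algebra scales with the number of orientations rather than the number of voxels, although the full masked data still has to be available to the model.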

jhlegarreta · 2024-06-17T12:23:24Z

Happy to have a call, but IMHO, this is an 'eddymotion' model -- meaning, we define the interface. All other models generate volumes as the output of their predict(), so this needs to be another one. Perhaps what's missing here is an 'intermediate' model (which could eventually go into DIPY, cc/ @arokem) where the output would be consistent with the standard GP output (i.e., a mean and std functions).

OK. I see that. So the input will need to be the entire DWI image then. We'll then need to figure out how this can be dealt with within the model. DIPY may give us clues, yes. For now, I do not see how, and I have not dug into it.

Whether the GP and the spherical kernel are giving us a reasonable prediction can still be tested in its current form (without additional strategies, running into memory issues, etc.) using a single voxel with https://github.com/nipreps/eddymotion/blob/a8742d955d9a43676d54ef194737a42db6f74015/docs/notebooks/dwi_simulated_gp.ipynb

arokem · 2024-06-17T12:40:54Z


If you are implementing a new DIPY-style model here it would be best to inherit from ‘ReconstModel’ and ‘ReconstFit’, which can take care of some of the dispatch and will take care of parallelization once dipy/dipy#2593 is merged.


Implement Gaussian Process.

Refactor the model so that it is compatible with DIPY's reconstruction interface. Bring utilities into a new `dipy` module so that every part required for the Gaussian process (gradient angle computation, covariance methods, etc.) is more localized without proliferating small modules. Co-authored-by: Ariel Rokem <[email protected]>

Fix GP model testing: `GaussianProcessModel` is now hosted in the `model.dipy` module. Fix bvals array dimensionality and make bvecs be unit vectors in angle computation test. Fix the pairwise angle computation doctests. Test the case where the second argument to the pairwise angle computation function is `None`. Simplify the test parameterization as the former corresponds to providing the same value for both gtab arguments. Adapt the test so that we avoid making computations if the second argument is `None`.

Fix Gaussian process:
- Use a default (`test`) kernel to instantiate the Gaussian process in the test, as the spherical kernel is not yet added.
- Use a regression problem to obtain 3 features (instead of 4 from the Friedman2 problem) in the test.
- Normalize the gradient vectors in the test.
- Swap the signal and the gradient table when fitting the GP in the test to conform to the expected parameters.
- Use the appropriate indexing and shape for the query vectors in the test.
- A `dipy.core.gradients.GradientTable` does not have a `len` property, so use the shape of its `gradients` attribute when checking that the dimensionality of the data and the features match when fitting the Gaussian process.
- Use named parameters when instantiating `GPFit` to avoid providing the gradient table twice.
- Get the first element in the `data` vector (whose dimensionality is changed by the `GaussianProcessModel` fitting method when checking for a mask) to enable fitting.
- The `model` variable in `gp_prediction` is a scikit-learn `GaussianProcessRegressor` instance, so it does not have a `_gpr` attribute.

jhlegarreta · 2024-07-04T18:20:42Z

@oesteban Please check the following in commit 5bdff10:

I just made it such that the test passes, but it may not fit into what you had envisioned (especially the latter point, which will probably fail when giving it real data/using a mask).

test/test_model.py

oesteban

Good stuff - merging as soon as tests get green, thanks for this!

jhlegarreta · 2024-09-28T14:20:52Z

src/eddymotion/model/dipy.py

+        raise RuntimeError("Model is not yet fitted.")
+
+    # Extract orientations from gtab, and highly likely, the b-value too.
+    return model.predict(gtab, return_std=False)


I am trying to debug this
#228 (comment)

While at it, I see that model_gtab is not used by this method. Although I can imagine what the underlying intention was (much like what we try to do in #228 (comment)), I am not sure how you intended to implement this. This is called from:
https://github.com/nipreps/eddymotion/pull/188/files#diff-fba3c6e90b496595a3cd3cdcbd5d562082cde0b4f21629d667d1a2260a91bd37R281

The mask is not yet used, and although we have discussed using the mask in the past, it should probably be related to the use of gtab and model_gtab.

@oesteban any thoughts about this?

jhlegarreta force-pushed the ImplementGaussianProcess branch 6 times, most recently from 2234d06 to c7c91fc Compare May 16, 2024 19:52

jhlegarreta force-pushed the ImplementGaussianProcess branch from c7c91fc to af28a0a Compare May 27, 2024 00:09

jhlegarreta mentioned this pull request May 27, 2024

ENH: Add gradient encoding direction angle computation utils #200

Merged

jhlegarreta force-pushed the ImplementGaussianProcess branch from af28a0a to beb0839 Compare May 27, 2024 22:33

This was referenced Jun 2, 2024

ENH: Add squared exponential covariance kernel #201

Closed

ENH: Add gaussian process DWI signal representation notebook #202

Closed

oesteban reviewed Jun 7, 2024


jhlegarreta force-pushed the ImplementGaussianProcess branch 3 times, most recently from 061f130 to 58efd09 Compare June 8, 2024 22:24

jhlegarreta force-pushed the ImplementGaussianProcess branch 12 times, most recently from 8ad9f7f to 0e6b51e Compare June 8, 2024 23:00

oesteban reviewed Jun 14, 2024
