Skip to content
This repository has been archived by the owner on Dec 20, 2024. It is now read-only.

ENH: Implement Gaussian Process #188

Merged
merged 9 commits into from
Jul 4, 2024

Conversation

jhlegarreta
Copy link
Collaborator

Implement Gaussian Process.

@jhlegarreta jhlegarreta force-pushed the ImplementGaussianProcess branch 6 times, most recently from 2234d06 to c7c91fc Compare May 16, 2024 19:52
@yibeichan
Copy link

I made a draft PR on @jhlegarreta 's fork here jhlegarreta#1
I updated calculate_angle_matrix in src/eddymotion/model/utils.py
but I realize that my PR has some conflicts that need to be merged due to the latest updates in nippers/eddymotion. It's probably too much work to merge such huge conflicts just for one function.
I explained to Jon why I updated the function so Jon can decide whether you want to adopt the changes (by copying it from the link above)

Regarding compute_multi_shell_covariance_matrix, I tried to work on it but had some problems

  1. the input parameters in the current version of the function is not complete, for example nvals and bvecs needs to be defined
  2. also, the current calculate_angle_matrix only works for single-shell data, so we need to update this function first and then call it in compute_multi_shell_covariance_matrix
  3. also, make sure we are consistent on whether to use k as input in those covariance matrices or initiate k within the function.
  4. Some parameter names are not consistent. For example, theta should be angle_mat, but we use different names across functions.

Thank you, @jhlegarreta, for explaining everything to me and getting me started on this work (which is different from what I've done before). I'm happy to contribute more in the future if needed.

@jhlegarreta
Copy link
Collaborator Author

jhlegarreta commented May 27, 2024

To be rebased on main once PRs #198, #199 and #200 (and probably others where part of the features in this PR, e.g. compute_exponential_function, compute_spherical_function, compute_squared_exponential_function) are likely to be spread for easier/independent testing/development) are merged.

Copy link
Member

@oesteban oesteban left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I finally could have a look into this. I think we have generated some overlap with other efforts, so a first step would be to remove duplicate work (IMHO).

The only major issue I can see is that it looks like the model confuses responsibilities with the estimator at points.

Let's set the following goal to this PR:

Be able to produce one prediction (i.e., one missing orientation) for one voxel (having all the other b-vecs/vals). This should not bother about:

  • hyperparameter update
  • kernel
  • motion and EC parameters update

src/eddymotion/model/base.py Outdated Show resolved Hide resolved
src/eddymotion/model/base.py Outdated Show resolved Hide resolved
src/eddymotion/model/base.py Outdated Show resolved Hide resolved
src/eddymotion/model/base.py Outdated Show resolved Hide resolved
src/eddymotion/model/base.py Outdated Show resolved Hide resolved
src/eddymotion/model/base.py Outdated Show resolved Hide resolved
src/eddymotion/model/base.py Outdated Show resolved Hide resolved
src/eddymotion/model/base.py Outdated Show resolved Hide resolved
src/eddymotion/model/utils.py Outdated Show resolved Hide resolved
test/test_model.py Outdated Show resolved Hide resolved
@jhlegarreta jhlegarreta force-pushed the ImplementGaussianProcess branch 3 times, most recently from 061f130 to 58efd09 Compare June 8, 2024 22:24
@jhlegarreta
Copy link
Collaborator Author

jhlegarreta commented Jun 8, 2024

Re #188 (review) agreed.

Have done a local copy of the old file to add a documentation file as time permits. For now,

  • This reduces the whole thing to making the prediction.
  • Have removed all duplicates that this had after having split the contents across multiple PRs (some already merged, others being reviewed, etc.)

Will improve things in subsequent commits (or separate PRs): actually use the DWI data, use bvec indexing, etc.

@jhlegarreta jhlegarreta force-pushed the ImplementGaussianProcess branch 12 times, most recently from 8ad9f7f to 0e6b51e Compare June 8, 2024 23:00
@jhlegarreta
Copy link
Collaborator Author

Have gone through the comments but I am hesitant about them:

  • Now I do see that we should not provide the diffusion signal to the __init__ method. At least not at this stage of the development.
  • Re ENH: Implement Gaussian Process #188 (comment) ENH: Implement Gaussian Process #188 (comment) I think here we are back to the discussion about our data: reconstructing the 3D volume in here means that the fit/predict method get the entire data, which is unfeasible given the size. As I see things, reconstructing the 3D volume should be done in the caller. Best would be to discuss this over a call.

@oesteban
Copy link
Member

Now I do see that we should not provide the diffusion signal to the __init__ method. At least not at this stage of the development.

👍

Re #188 (comment) #188 (comment) I think here we are back to the discussion about our data: reconstructing the 3D volume in here means that the fit/predict method get the entire data, which is unfeasible given the size. As I see things, reconstructing the 3D volume should be done in the caller. Best would be to discuss this over a call.

Happy to have a call, but IMHO, this is an 'eddymotion' model -- meaning, we define the interface. All other models generate volumes as the output of their predict(), so this needs to be another one. Perhaps what's missing here is an 'intermediate' model (which could eventually go into DIPY, cc/ @arokem) where the output would be consistent with the standard GP output (i.e., a mean and std functions).

@jhlegarreta
Copy link
Collaborator Author

jhlegarreta commented Jun 17, 2024

Happy to have a call, but IMHO, this is an 'eddymotion' model -- meaning, we define the interface. All other models generate volumes as the output of their predict(), so this needs to be another one. Perhaps what's missing here is an 'intermediate' model (which could eventually go into DIPY, cc/ @arokem) where the output would be consistent with the standard GP output (i.e., a mean and std functions).

OK. I see that. So the input will need to be the entire DWI image then. We'll then need to figure out how this can be dealt with within the model. DIPY may give us clues, yes. For now, I do not see how/have not digged into it.

Whether the GP and the spherical kernel are giving us a reasonable prediction can still be tested in its current form (without additional strategies, running into memory issues, etc.) using a single voxel with https://github.com/nipreps/eddymotion/blob/a8742d955d9a43676d54ef194737a42db6f74015/docs/notebooks/dwi_simulated_gp.ipynb

@arokem
Copy link
Collaborator

arokem commented Jun 17, 2024 via email

Implement Gaussian Process.
@oesteban oesteban force-pushed the ImplementGaussianProcess branch from d4f4e75 to e90ebae Compare July 4, 2024 07:21
oesteban and others added 5 commits July 4, 2024 09:49
Refactor the model so that it is compatible with DIPY's reconstruction interface.

Bring utilities into a new `dipy` module so that every part required for the Gaussian process (gradient angle computation, covariance methods, etc.) is more localized without proliferating small modules. 

Co-authored-by: Ariel Rokem <[email protected]>
@oesteban oesteban force-pushed the ImplementGaussianProcess branch from 671a0eb to e75b6e5 Compare July 4, 2024 14:37
@jhlegarreta jhlegarreta force-pushed the ImplementGaussianProcess branch 2 times, most recently from 51e55be to 83a360e Compare July 4, 2024 17:05
Fix GP model testing: `GaussianProcessModel` is now hosted in the
`model.dipy` module.

Fix bvals array dimensionality and make bvecs be unit vectors in angle
computation test.

Fix the pairwise angle computation doctests.

Test the case where the second argument to the pairwaise angle
computation function is `None`. Simplify the test parameterization as
the former corresponds to providing the same value for both gtab
arguments. Adapt the test so that we avoid making computations if the
second argument is `None`.
@jhlegarreta jhlegarreta force-pushed the ImplementGaussianProcess branch from 83a360e to 0c32def Compare July 4, 2024 17:20
Fix Gaussian process:
- Use a default (`test`) kernel to instantiate the Gaussian process in
  the test, as the spherical kernel is not yet added.
- Use a regression problem to obtain 3 features (instead of 4 from the
  Friedman2 problem) in the test.
- Normalize the gradient vectors in thest test.
- Swap the signal and the gradient table when fitting the GP in the test
  to conform to the expected parameters.
- Use the appropriate indexing and shape for the query vectors in the
  test.
- A `dipy.core.gradients.GradientTable` does not have a `len` property
  so use the shape of its `gradients` attribute when checking tha the
  dimensionality of the data and the features match when fitting the
  Gaussian process.
- Use named parameters when instantiating `GPFit` to avoid providing the
  gradient table twice.
- Get the first element in the `data` vector (whose dimensionality is
  changed by the `GaussianProcessModel` fitting method when checking for a mask) to enable fitting.
- The `model` variable in `gp_prediction` is a scikit-learn
  `GaussianProcessRegressor` instance do it does not have an `_gpr` attribute.
@jhlegarreta jhlegarreta force-pushed the ImplementGaussianProcess branch from 05670e6 to 5bdff10 Compare July 4, 2024 18:18
@jhlegarreta
Copy link
Collaborator Author

jhlegarreta commented Jul 4, 2024

@oesteban Please check the following in commit 5bdff10:

I just made it such that the test passes, but it may not fit into what you had envisioned (especially the latter point, which will probably fail when giving it real data/using a mask).

test/test_model.py Outdated Show resolved Hide resolved
Copy link
Member

@oesteban oesteban left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good stuff - merging as soon as tests get green, thanks for this!

@oesteban oesteban merged commit ba5d692 into nipreps:main Jul 4, 2024
6 checks passed
@jhlegarreta jhlegarreta deleted the ImplementGaussianProcess branch July 4, 2024 20:18
raise RuntimeError("Model is not yet fitted.")

# Extract orientations from gtab, and highly likely, the b-value too.
return model.predict(gtab, return_std=False)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am trying to debug this
#228 (comment)

While at it, I see that model_gtab is not used by this method. Although I may imagine what was the underlying intention (much what we try to do in #228 (comment)), not sure how you intended to implement this. This is called from:
https://github.com/nipreps/eddymotion/pull/188/files#diff-fba3c6e90b496595a3cd3cdcbd5d562082cde0b4f21629d667d1a2260a91bd37R281

The mask is not yet used, and although we have discussed using the mask in the past, it should probably be related to the use of gtab and model_gtab.

@oesteban any thoughts about this?

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants