Codecov Report

Attention: Patch coverage is

Additional details and impacted files:

```diff
@@            Coverage Diff             @@
##             main     #244      +/-   ##
==========================================
+ Coverage   66.18%   66.24%   +0.05%
==========================================
  Files          19       19
  Lines         905      942      +37
  Branches      112      119       +7
==========================================
+ Hits          599      624      +25
- Misses        264      274      +10
- Partials       42       44       +2
```

☔ View full report in Codecov by Sentry.
@jhlegarreta our simulated data may not be realistic enough for our tests. That said, convergence issues also happen when using the b=3000 shell of HCP data.
force-pushed from 28c371e to 22686a0
force-pushed from 22686a0 to 5fd0125
@jhlegarreta @yasseraleman -- I may have found the potential culprit that makes fitting so weird. By looking at the covariance vs. angle plots (at the beginning of the notebook), I ended up realizing that they massaged the data before calculating the covariance. They certainly removed the mean (I don't know whether the gradient-wise mean or the grand mean), and possibly scaled to unit standard deviation. Then I went back to the paper and found the little comment I included in the docstring of ``EddymotionGPR.fit()``. This doesn't completely resolve all problems, but I'm positive it's one little step we needed to take.
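A minimal numpy sketch of that massaging, for concreteness; the shapes, variable names, and the choice between gradient-wise and grand mean are assumptions on my side, not the notebook's actual code:

```python
# Hypothetical sketch: demean (and optionally standardize) the signals
# before computing the empirical covariance between gradient directions.
import numpy as np

rng = np.random.default_rng(1234)
n_grads, n_voxels = 60, 500
signals = rng.normal(loc=1000.0, scale=50.0, size=(n_grads, n_voxels))

# Option 1: remove the gradient-wise mean (one mean per diffusion direction)
centered = signals - signals.mean(axis=1, keepdims=True)
# Option 2: remove the grand mean instead
# centered = signals - signals.mean()

# Optionally scale each direction to unit standard deviation
standardized = centered / centered.std(axis=1, keepdims=True)

# Empirical covariance between pairs of gradient directions; after
# standardization the diagonal is 1, so off-diagonal entries can be
# plotted directly against the angle between the two directions.
cov = standardized @ standardized.T / n_voxels
print(cov.shape)
```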
@jhlegarreta @yasseraleman -- the new plot, accounting for the normalization of the data differently (see my commit message in 3136256): Perhaps the optimizer is moving $a$ towards
@oesteban 🎉 looks reasonable.
Maybe I'm being very naive, but they assume that
Maybe it is because they add it back to the predictions when writing the volumes, as they have kept track of the value. But I have not found the place where they do, unless it is this one:
I don't think that's naive, and I understand the same from the text. But then the question is how you can obtain
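For reference, scikit-learn's ``GaussianProcessRegressor(normalize_y=True)`` stores the training targets' mean and standard deviation at fit time and restores them at prediction time, so predictions come back in the original units. A tiny numpy-only sketch of that mechanism (the kernel and class are illustrative stand-ins, not eddymotion's code):

```python
# Minimal sketch of the normalize_y mechanism: the target mean/std are
# saved at fit time and undone at predict time.
import numpy as np

class TinyGPR:
    def __init__(self, length_scale=0.2):
        self.length_scale = length_scale

    def _kernel(self, a, b):
        # Squared-exponential kernel as a stand-in for the angular covariance
        d2 = (a[:, None] - b[None, :]) ** 2
        return np.exp(-0.5 * d2 / self.length_scale**2)

    def fit(self, x, y, noise=1e-6):
        # Store normalization constants, as normalize_y=True does
        self._y_mean, self._y_std = y.mean(), y.std()
        self._x = x
        y_norm = (y - self._y_mean) / self._y_std
        k = self._kernel(x, x) + noise * np.eye(len(x))
        self._alpha = np.linalg.solve(k, y_norm)
        return self

    def predict(self, x_new):
        y_norm = self._kernel(x_new, self._x) @ self._alpha
        # Undo the normalization: this is where the mean "comes back"
        return self._y_std * y_norm + self._y_mean

x = np.linspace(0.0, 1.0, 20)
y = 1000.0 + 50.0 * np.sin(2 * np.pi * x)  # dMRI-like signal, far from zero
pred = TinyGPR().fit(x, y).predict(x)
print(np.abs(pred - y).max())  # small: the mean is restored in predictions
```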
Addendum: This would invalidate our cross-validation unless we regress data to
It seems I was wrong about the need for standardizing: Scikit-learn's implementation is sufficient and follows what the paper specifies. A few additional details have surfaced as relevant:

* $\sigma^2$ should not be very large, and ~100 was VERY large. With small values, optimizers seem pacified and produce more plausible outcomes.
* $a$ may take values above $\frac{\pi}{2}$, even though $\theta$ will always be smaller than that. Indeed, for our real dataset the optimal $a$ is around $\frac{\pi}{2}$.
* I haven't cracked the manipulations of the covariance plot yet, but it is clear to me now that they are indeed considering $a$ and $\lambda$.

The most relevant inclusion in this commit is the repetition of the covariance plot with the covariance of predicted data, which is very encouraging. cc @jhlegarreta.
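To illustrate why $a > \frac{\pi}{2}$ is meaningful even though $\theta \leq \frac{\pi}{2}$: assuming a spherical covariance function of the angle, $C(\theta) = \sigma^2 \left(1 - \frac{3\theta}{2a} + \frac{\theta^3}{2a^3}\right)$ for $\theta < a$ and 0 beyond (the standard geostatistics form; I am assuming this is the one in play, this is not eddymotion's kernel code), the covariance only vanishes at $\theta = a$. So $a > \frac{\pi}{2}$ simply means all observable pairs of directions stay positively correlated:

```python
# Sketch of a spherical covariance function of the angle between
# gradient directions (assumed form, not eddymotion's actual kernel).
import numpy as np

def spherical_covariance(theta, a, sigma2=1.0):
    theta = np.asarray(theta, dtype=float)
    c = sigma2 * (1.0 - 1.5 * theta / a + 0.5 * (theta / a) ** 3)
    return np.where(theta < a, c, 0.0)

theta = np.linspace(0.0, np.pi / 2, 5)  # observable angles never exceed pi/2

# With a <= pi/2, the covariance reaches zero within the observable range...
print(spherical_covariance(theta, a=np.pi / 4))
# ...whereas a > pi/2 keeps every pair of directions positively correlated.
print(spherical_covariance(theta, a=np.pi))
```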