This repository has been archived by the owner on Dec 20, 2024. It is now read-only.

Implement Gaussian Process as a model #53

Closed
oesteban opened this issue Sep 14, 2021 · 3 comments

Comments

@oesteban
Member

In order to compare baseline performance vs. FSL eddy.

@oesteban oesteban added effort: high Estimated high-effort task impact: high Estimated high-impact task models labels Sep 14, 2021
@oesteban oesteban added this to the 0.2.0 milestone Sep 14, 2021
@oesteban oesteban self-assigned this Sep 14, 2021
@josephmje
Contributor

josephmje commented Dec 7, 2021

Adding notes from our 2021-12-07 meeting here:

A first stab at implementing the Gaussian process model is at #60.

At the moment, we haven't done any tuning or optimization of the model. If we are going to eventually compare it to FSL eddy, that will need to be done.

For testing, some suggestions are to use:

  • simulated data from FiberFox
  • IXI dataset
  • a high-quality dataset with certain voxels masked (perhaps near a white matter tract or some CSF in the middle of a ventricle); a masked-voxel check is sketched right after this list
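
One way to read the masked-voxel suggestion, as a minimal sketch (a plain sklearn GaussianProcessRegressor stands in for the eventual model, and the bvecs/signal arrays below are hypothetical synthetic stand-ins, not eddymotion code): for a voxel of interest, hide a subset of its volumes, fit on the rest, and score the prediction on the hidden ones.

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(0)

# Hypothetical inputs: unit gradient directions and one voxel's DWI signal
bvecs = rng.standard_normal((60, 3))
bvecs /= np.linalg.norm(bvecs, axis=1, keepdims=True)
signal = rng.standard_normal(60)

# Hide ("mask") a subset of volumes for this voxel
held_out = rng.choice(60, size=10, replace=False)
kept = np.setdiff1d(np.arange(60), held_out)

# Fit on the visible volumes only and predict the hidden ones
gpr = GaussianProcessRegressor().fit(bvecs[kept], signal[kept])
rmse = np.sqrt(np.mean((gpr.predict(bvecs[held_out]) - signal[held_out]) ** 2))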

We will eventually need to add better parallelization. Currently, the DTI model chunks the data before fitting the model; we can implement that approach here as well. Further down the road, we may need to dig into dipy and use joblib/dask to improve the parallelization further.
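
A minimal sketch of what that chunked, parallel fitting could look like with joblib (the (n_voxels, n_gradients) data layout, the chunk size, and the fit_chunk/fit_all helpers are all hypothetical, not existing eddymotion code):

import numpy as np
from joblib import Parallel, delayed
from sklearn.gaussian_process import GaussianProcessRegressor

def fit_chunk(chunk, bvecs):
    """Fit one GP per voxel in the chunk; bvecs has shape (n_gradients, 3)."""
    return [
        GaussianProcessRegressor().fit(bvecs, voxel_signal)
        for voxel_signal in chunk
    ]

def fit_all(data, bvecs, chunk_size=1000, n_jobs=-1):
    """Split the (n_voxels, n_gradients) array into chunks and fit them in parallel."""
    chunks = np.array_split(data, max(1, len(data) // chunk_size))
    fitted = Parallel(n_jobs=n_jobs)(
        delayed(fit_chunk)(chunk, bvecs) for chunk in chunks
    )
    return [gp for chunk_gps in fitted for gp in chunk_gps]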

See also comments: https://github.com/nipreps/eddymotion/pull/60/files#r764144040

@dPys
Contributor

dPys commented Aug 27, 2022

@josephmje - Not sure whether you or anyone else is still working on the GP model, but I figured I'd chime in with a few thoughts to get the conversation going again:

The GP model needs at least a proof-of-concept (PoC) design for the kernel before it can be tuned. Here's a first template for what that might look like. High-level requirements:

  • Isotropy: uniform affinity for all angular directions.
  • Symmetry around the origin.

The way most folks seem to have addressed this in the context of q-space is to factorize the covariance function into (a) an angular part and (b) a radial part, plus (c) an additive error term.
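
In symbols, a sketch of that factorization (my notation, not from any existing implementation): for two measurements with unit gradient directions $\mathbf{g}, \mathbf{g}'$ and b-values $b, b'$,

$k\big((\mathbf{g}, b), (\mathbf{g}', b')\big) = C_\theta\big(\theta(\mathbf{g}, \mathbf{g}')\big)\, C_b\big(|b - b'|\big) + \sigma^2\, \delta$

where $\theta(\mathbf{g}, \mathbf{g}')$ is the angle between the two directions, $C_\theta$ is the angular term, $C_b$ the radial term, and $\sigma^2 \delta$ the error term ($\delta = 1$ only when the two measurements are the same observation).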

The angular part can be constructed using Legendre polynomials, where symmetry about the origin is achieved by excluding the odd-order terms and ensuring that the coefficients of the remaining even-order terms are positive. Past work seems to have used an order <= 6, with the consensus being that keeping the first four (even) terms is best.

For the radial part, we could start with sklearn's standard RBF kernel and see how far it gets us. My biggest concern would be performance...

and then the final kernel would look something like the following, with ~6 hyperparameters that would still need to be tuned:

import numpy as np
from numpy.polynomial.legendre import Legendre
from sklearn.gaussian_process import GaussianProcessRegressor, kernels
from sklearn.gaussian_process.kernels import (
    GenericKernelMixin,
    Kernel,
    StationaryKernelMixin,
)


def even_legendre_basis(x, order=6):
    """Evaluate the even-order Legendre polynomials P_0, P_2, ..., P_order at x."""
    coeff = np.eye(order + 1)
    return np.stack([Legendre(coeff[n])(x) for n in range(0, order + 1, 2)])


class LegendreKernel(StationaryKernelMixin, GenericKernelMixin, Kernel):
    """Angular covariance over gradient directions.

    A positive sum of even-order Legendre polynomials of the cosine of the
    angle between two unit gradient vectors; odd terms are excluded so that
    k(g, -g) == k(g, g) (symmetry about the origin).
    """

    def __init__(self, order=6):
        self.order = order

    def __call__(self, X, Y=None, eval_gradient=False):
        X = np.atleast_2d(X)
        if Y is None:
            Y = X
        elif eval_gradient:
            raise ValueError("Gradient can only be evaluated when Y is None.")
        else:
            Y = np.atleast_2d(Y)

        # Cosine of the angle between every pair of (unit-normalized) gradients
        Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
        Yn = Y / np.linalg.norm(Y, axis=1, keepdims=True)
        cos_theta = np.clip(Xn @ Yn.T, -1.0, 1.0)

        K = even_legendre_basis(cos_theta, order=self.order).sum(axis=0)

        if eval_gradient:
            # No tunable hyperparameters on this kernel -> empty gradient
            return K, np.empty((X.shape[0], X.shape[0], 0))
        return K

    def diag(self, X):
        # P_n(1) == 1 for every term, so the diagonal is the number of even terms
        return np.full(np.atleast_2d(X).shape[0], self.order // 2 + 1, dtype=float)

    def is_stationary(self):
        return True


q_lengthscale = 1
noise_level = 1e-2

# Angular part x radial part, plus an additive error term
kernel = (
    LegendreKernel(order=6)
    * kernels.RBF(length_scale=q_lengthscale, length_scale_bounds=(1e-2, 1e3))
    + kernels.WhiteKernel(noise_level=noise_level, noise_level_bounds=(1e-10, 1e1))
)

param = {"solver": GaussianProcessRegressor(kernel=kernel, n_restarts_optimizer=10, alpha=0.1)}

The bulk of the work on this would be the initial round of tuning done during a pre-training stage, using a synthetic, FiberFox-generated sample as you originally proposed. The objective we'd want to minimize is the negative log marginal likelihood, and the data would be synthesized such that the training and validation sets have similar noise levels and orders of magnitude, but perhaps different numbers of directions than the evaluation data.
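
For concreteness, a minimal sketch of that objective using the kernel defined above (the bvecs_train/signal_train arrays are hypothetical stand-ins for a FiberFox-style simulation; sklearn's fit already maximizes the log marginal likelihood over the kernel hyperparameters, i.e. minimizes its negative):

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(0)

# Hypothetical training data: unit gradient directions and one voxel's signal
bvecs_train = rng.standard_normal((60, 3))
bvecs_train /= np.linalg.norm(bvecs_train, axis=1, keepdims=True)
signal_train = rng.standard_normal(60)

gpr = GaussianProcessRegressor(kernel=kernel, n_restarts_optimizer=10, alpha=0.1)
gpr.fit(bvecs_train, signal_train)  # maximizes the log marginal likelihood

# The quantity to track across training/validation splits
nlml = -gpr.log_marginal_likelihood(gpr.kernel_.theta)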

A ton of work is obviously involved in this, but I figured it couldn't hurt to discuss it further. WDYT?

@oesteban
Member Author

A base implementation of the GP using sklearn is now in the codebase. This has been addressed 🎉
