Nearest Neighbor Gaussian Processes (NNGP) models #371

Closed
paul-buerkner opened this issue Mar 9, 2018 · 6 comments

@paul-buerkner
Owner

Lu Zhang provides a very interesting case study about NNGPs in Stan (http://mc-stan.org/users/documentation/case-studies/nngp.html). It could be worth implementing them in brms at some point, as traditional GPs scale badly with the number of observations, while the situation seems to be better with NNGPs.

@paul-buerkner
Owner Author

Approximate GPs, which are implemented in brms as of version 2.7.0, serve a similar purpose as nearest neighbor GPs (NNGPs) in that they reduce the computational burden of exact GPs and so make GPs feasible for much larger data sets. NNGPs seem to be quite complex to code in Stan, and I am not sure how easy it is to scale them to higher dimensions or non-isotropic GPs.

As it currently stands, I don't think NNGPs will bring much improvement to what brms already offers and so I am closing this issue.

@mikoontz

mikoontz commented Mar 25, 2019

Do any of the other (relatively) recently developed methods for working with large geospatial datasets offer additional benefits beyond approximate GPs as implemented in brms?

There was a recent "friendly competition" amongst methods that does a nice job of laying out some of the particular features of each method: https://github.com/finnlindgren/heatoncomparison.

I'm including the link to the GitHub repo (which contains the link to the paper) as a jumping off point, but can add more details here (if appropriate) as I learn what other benefits these methods might bring.

(I'm adding it here because this is the first search hit for "nearest neighbor gaussian process brms", so I assume there'd be a good amount of overlapping interest in other kinds of methods for big geospatial datasets among folks finding this issue.)

@paul-buerkner
Owner Author

That looks very interesting, thanks! @gabriuma this could be of interest to you!

As far as I can tell, the basis function approach we use in brms was not among those compared (it is too new I guess). Or am I mistaken?

In any case, it would be nice to hear your insights about methods that you think may be worth looking at when it comes to implementing them in brms.

@danfosterfire

danfosterfire commented Nov 3, 2021

My understanding is that the basis function approach used in brms does not provide reliable information about the parameters of the "true" (full) Gaussian process. This doesn't work well when the GP parameters are the parameters of interest, because in setting the number of knots a priori, we're choosing to smooth out the finest scales of variation without any knowledge of the range parameter of the GP.* The nearest neighbor approach seems to offer a better approximation of the full GP's spatial parameters, because we're assuming 0 correlation (rather than very small correlation) at the largest scales, and not making assumptions about the (relatively strong) correlation at fine scales. For my research, having the nearest neighbor approximate GP implemented in brms would be very valuable.

*Your paper offers some insights towards setting the number of knots, but suggests that we need an unhelpfully large number of knots in situations (like mine) where the range of spatial correlation is very small relative to the bounding box of all the data points.
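
To make the short-range problem concrete, here is a minimal sketch rather than what brms actually runs. It assumes a 1D squared exponential kernel and the Hilbert-space reduced-rank formulation (sine basis functions on [-L, L], weighted by the kernel's spectral density), which is my understanding of what the brms approximation is based on. The function names and the values of `m`, `L`, and `ell` are made up for illustration.

```python
import math


def se_kernel(r, sigma=1.0, ell=1.0):
    """Exact squared exponential covariance at distance r."""
    return sigma**2 * math.exp(-0.5 * (r / ell) ** 2)


def se_spectral_density(omega, sigma=1.0, ell=1.0):
    """Spectral density of the 1D squared exponential kernel."""
    return sigma**2 * math.sqrt(2.0 * math.pi) * ell * math.exp(-0.5 * (ell * omega) ** 2)


def basis_kernel(x1, x2, m, L, sigma=1.0, ell=1.0):
    """Reduced-rank approximation of k(x1, x2) using m sine basis functions on [-L, L]."""
    total = 0.0
    for j in range(1, m + 1):
        sqrt_lam = j * math.pi / (2.0 * L)  # square root of the j-th eigenvalue
        phi1 = math.sin(sqrt_lam * (x1 + L)) / math.sqrt(L)
        phi2 = math.sin(sqrt_lam * (x2 + L)) / math.sqrt(L)
        total += se_spectral_density(sqrt_lam, sigma, ell) * phi1 * phi2
    return total


# Illustrative values only: same basis size and domain, two different length scales.
exact_variance = se_kernel(0.0)  # prior variance k(0, 0) = sigma^2 = 1
approx_long = basis_kernel(0.0, 0.0, m=20, L=5.0, ell=1.0)
approx_short = basis_kernel(0.0, 0.0, m=20, L=5.0, ell=0.05)

print(exact_variance, approx_long, approx_short)
```

With the same 20 basis functions, the approximation recovers the prior variance almost exactly when the length scale is 1, but only a fraction of it when the length scale is 0.05 on the same domain: the fine-scale variation is simply not representable until the number of basis functions grows substantially.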

@tillahoffmann

Piggy-backing on this thread, we implemented GPs on graphs in Stan here with more details in section 3 of this manuscript and examples here. Nearest-neighbor GPs are a special case of graph GPs in which the only "dependency edges" run from each node to its k nearest neighbors. Graph GPs can, for example, also be used for non-stationary kernels. The current version only supports isotropic kernels, but there is an open PR to support different scales in different dimensions (onnela-lab/gptools#23). Unfortunately, there was a bug in the Stan math library (stan-dev/math#3084), so we'll have to wait for the next release. It would be cool to integrate this functionality into brms.
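
As a rough illustration of the "dependency edges" idea, the sketch below builds nearest-neighbor conditioning sets in plain Python. It is not the gptools API: the function name `nearest_predecessors` is hypothetical, and it assumes the usual NNGP convention that points are given in a fixed order and each point may only depend on points that come earlier, which keeps the graph acyclic.

```python
import math


def nearest_predecessors(points, k):
    """For each point, return the indices of its (up to) k nearest earlier points.

    Brute force for clarity: every point is compared against all of its
    predecessors, so this is only meant for small examples.
    """
    neighbors = []
    for i, p in enumerate(points):
        # Candidate parents are restricted to indices 0..i-1 (the ordering assumption).
        earlier = sorted(range(i), key=lambda j: math.dist(p, points[j]))
        neighbors.append(earlier[:k])
    return neighbors


# Toy 2D locations, ordered along the x-axis for illustration.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (10.0, 0.0)]
edges = nearest_predecessors(points, k=2)

for node, parents in enumerate(edges):
    print(node, "depends on", parents)
```

Each point then conditions on at most k parents instead of on every other observation, which is where the computational savings over an exact GP come from.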

@paul-buerkner
Owner Author

Thank you for these details! Much appreciated!
