New model request: gpboost Tree-Boosting with Gaussian Process and Mixed Effects Models #47

schelhorn · 2022-10-20T07:53:16Z

The gpboost package on CRAN by @fabsig describes itself as follows:

Combining Tree-Boosting with Gaussian Process and Mixed Effects Models
An R package that allows for combining tree-boosting with Gaussian process and mixed effects models. It also allows for independently doing tree-boosting as well as inference and prediction for Gaussian process and mixed effects models. See https://github.com/fabsig/GPBoost for more information on the software and Sigrist (2020) <arXiv:2004.02653> and Sigrist (2021) <arXiv:2105.08966> for more information on the methodology.

I think it would make a nice extension to {multilevelmod}, given its ability to model non-linear relationships and to handle high-cardinality categorical data well.

From the abstract of the paper describing the approach:

We introduce a novel way to combine boosting with Gaussian process and mixed effects models. This allows for relaxing, first, the zero or linearity assumption for the prior mean function in Gaussian process and grouped random effects models in a flexible non-parametric way and, second, the independence assumption made in most boosting algorithms. The former is advantageous for prediction accuracy and for avoiding model misspecifications. The latter is important for efficient learning of the fixed effects predictor function and for obtaining probabilistic predictions. Our proposed algorithm is also a novel solution for handling high-cardinality categorical variables in tree-boosting. In addition, we present an extension that scales to large data using a Vecchia approximation for the Gaussian process model relying on novel results for covariance parameter inference. We obtain increased prediction accuracy compared to existing approaches on multiple simulated and real-world data sets.

And the main text of the paper:

In summary, both the linearity assumption in Gaussian process models and the independence assumption in boosting are often questionable. The goal of this article is to relax these restrictive assumptions by combining boosting with Gaussian process and mixed effects models. Specifically, we propose to model the mean function using an ensemble of base learners, such as regression trees (Breiman et al., 1984), learned in a stage-wise manner using boosting, and the second-order structure is modeled using a Gaussian process or mixed effects model. In doing so, the parameters of the covariance function are estimated jointly with the mean function; see Section 2 for more details.
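As a rough sketch of what the passage above describes, for the grouped random effects case: the notation here is my own reading of Sigrist (2020) and may differ from the paper's, so please check Section 2 of the paper before relying on it. $F$ is the tree ensemble for the mean, $Z$ the random effects design matrix, and $\theta$ the covariance parameters.

```latex
% Mixed effects model with a non-parametric mean function F (assumed notation)
y = F(X) + Z b + \epsilon, \qquad
b \sim \mathcal{N}(0, \Sigma_\theta), \qquad
\epsilon \sim \mathcal{N}(0, \sigma^2 I_n)

% Marginally, the response is Gaussian with mean F(X) and covariance \Psi_\theta
y \sim \mathcal{N}\bigl(F(X), \Psi_\theta\bigr), \qquad
\Psi_\theta = Z \Sigma_\theta Z^\top + \sigma^2 I_n

% F and \theta are estimated jointly by minimizing the negative log-likelihood
(\hat F, \hat\theta) = \arg\min_{F,\,\theta}\;
\tfrac{1}{2}\,(y - F(X))^\top \Psi_\theta^{-1} (y - F(X))
+ \tfrac{1}{2}\,\log\det \Psi_\theta
```

Setting $F(X) = X\beta$ recovers an ordinary linear mixed effects model, and dropping $Zb$ (so that $\Psi_\theta = \sigma^2 I_n$) recovers standard boosting with independent errors under a squared-error loss.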

The paper is very well written and the package is actively developed on GitHub, with the last commit from two months ago. Multiple usage examples are linked here, the most comprehensive being this one. Model hyperparameters are explained here.

From the documentation, I believe it can work with the following responses:
regression, regression_l1, huber, binary, lambdarank, multiclass


fabsig · 2022-10-20T09:05:20Z

@schelhorn: many thanks for this suggestion!

Just a small clarification: currently, GPBoost supports the following response distributions: gaussian, bernoulli_probit (= binary), bernoulli_logit, poisson, gamma; see here for a list of currently supported likelihoods.

hfrick · 2023-11-01T15:24:19Z

Thank you for the detailed issue with the references 🙌 It's sitting here until the next round of triaging/implementing new models but it hasn't fallen off the radar.

tdemarchin · 2024-02-29T14:02:10Z

Hi, upvoting this, as I would be very interested in having GPBoost included in the tidymodels panel.

hfrick added the "feature" label (a feature request or enhancement) on Dec 19, 2022
