'gbm3' gives much narrower predictions than 'gbm' pkg #165

Open

AMBarbosa opened this issue Aug 7, 2024 · 3 comments

Comments

@AMBarbosa

Hi,
I'm trying to transition to gbm3, as prompted by the message that's now displayed when loading the gbm package. However, I get visibly different predictions for the same data. Here's a simple reproducible example based on random data:

set.seed(1)
N <- 1000
data <- data.frame(Y=sample(c(0, 1), N, replace = TRUE), 
                   X1=runif(N), X2=2*runif(N), X3=3*runif(N))

gbm1 <-  gbm::gbm(Y~X1+X2+X3, data=data)
gbm2 <- gbm3::gbm(Y~X1+X2+X3, data=data)

pred1 <- predict(gbm1, data, type = "response", n.trees = 100)
pred2 <- predict(gbm2, data, type = "response", n.trees = 100)

range(pred1)
# 0.2253441 0.6708913

range(pred2)
# 0.4887668 0.5017359

In this and other cases I've tried, gbm3 predicts a much narrower and (for my ecological data) less plausible range of values. What are these differences due to? Do I need to do something different to get my expected results with gbm3?

@brandongreenwell-8451

I believe, depending on the version at least, that they have pretty different defaults, which would go a long way toward explaining such a difference. I'd go back and rerun with the interaction depth, learning rate, etc. fixed to the same values, and then check the difference again.
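
For example, something along these lines (a rough sketch; I'm assuming the legacy gbm() interface in gbm3 accepts the same argument names as gbm::gbm, so double-check against your installed versions):

# Fit both packages with the same hyperparameters
# (the values below are gbm::gbm's documented defaults)
gbm1 <- gbm::gbm(Y ~ X1 + X2 + X3, data = data, distribution = "bernoulli",
                 n.trees = 100, interaction.depth = 1, shrinkage = 0.1,
                 bag.fraction = 0.5, n.minobsinnode = 10)
gbm2 <- gbm3::gbm(Y ~ X1 + X2 + X3, data = data, distribution = "bernoulli",
                  n.trees = 100, interaction.depth = 1, shrinkage = 0.1,
                  bag.fraction = 0.5, n.minobsinnode = 10)

# compare the prediction ranges again with matched settings
range(predict(gbm1, data, type = "response", n.trees = 100))
range(predict(gbm2, data, type = "response", n.trees = 100))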

@AMBarbosa (Author)

Thanks. shrinkage (whose default changed from 0.1 to 0.001) seems to be the most influential parameter here: if I call gbm3::gbm with shrinkage=0.1 (the default in gbm::gbm), I get much more similar (though still not identical) results.
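
For reference, this is roughly the call I mean (only shrinkage changed; everything else left at gbm3's defaults):

# same model as before, but with gbm's default shrinkage
gbm2b <- gbm3::gbm(Y ~ X1 + X2 + X3, data = data, shrinkage = 0.1)
range(predict(gbm2b, data, type = "response", n.trees = 100))
# the range is now much closer to pred1, though still not identical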

Is there a reason for such a drastic change in the default shrinkage, especially given that it seems to produce (at least in my case) poorer default predictions?

Regards,

@gregridgeway (Contributor)

gregridgeway commented Aug 7, 2024 via email
