
xgboost default nrounds should not be set #261

Open
mb706 opened this issue Dec 14, 2022 · 4 comments
@mb706
Contributor

mb706 commented Dec 14, 2022

I think many users will assume that the "default" values given for a Learner are relatively safe values that lead to OK performance in many cases. However, nrounds for xgboost is set to 1 as a "nonsense default", with the recommendation (according to the documentation) to tune this value. In this case the value should not be set at all, and training without setting it should be an error (which is the default behaviour of xgboost!). Errors should be thrown whenever a user does something that likely leads to undesired behaviour.

@be-marc
Member

be-marc commented Jan 19, 2023

So your suggestion is to check nrounds first in $.train()? I think that's a good idea. @mllg Didn't we set 1 as the default because the parameter had a required tag? I don't see a required tag anymore, so construction would be possible without setting nrounds.
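The check suggested above could look roughly like the following. This is a minimal sketch in Python rather than the actual mlr3 R API; the class and method names are hypothetical and only illustrate the idea of erroring at train time when nrounds was never set:

```python
class XGBoostLearnerSketch:
    """Hypothetical illustration (not the mlr3 API): construction succeeds
    without nrounds, but training fails instead of silently using nrounds = 1."""

    def __init__(self, **params):
        self.params = params  # no required tag: nrounds may be absent here

    def train(self, task):
        # Mirror xgboost's own behaviour: the package provides no default
        # for nrounds, so refuse to train rather than pick a nonsense value.
        if "nrounds" not in self.params:
            raise ValueError(
                "nrounds is not set; xgboost provides no default for it. "
                "Set it explicitly or tune it."
            )
        return f"trained {task} with nrounds={self.params['nrounds']}"
```

With this design, constructing the learner without nrounds still works (so tuning setups are unaffected), and only an actual training call triggers the error.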

@mllg
Member

mllg commented Jan 19, 2023

You could argue the same for all hyperparameters that are sensitive to tuning but have some default. It is not clear to me where to draw the line.

@be-marc
Member

be-marc commented Jan 19, 2023

> It is not clear to me where to draw the line.

When the package authors provide no default? XGBoost throws an error when nrounds is not set.

@mb706
Contributor Author

mb706 commented Jan 19, 2023

If a hyperparameter (with reasonably strong impact on performance) does not have a default value in the algorithm / function being wrapped, then I would lean towards not setting it in the Learner. If there is no default but we know that some value leads to OK performance most of the time, then we could also set it to that value; I would not insist on this either way here.

The principle I am going by here is: If something is happening that the user likely does not want to happen, then an error should be thrown. We write in our documentation that the given value is "nonsense" and should be tuned, so we seem to be confident that the user does not want to train xgboost with nrounds = 1.

I would usually ignore it when an algorithm has nonsense defaults itself, in the same way that we usually do not put bugfixes for other packages in our Learner code. I am okay with overriding hyperparameters with little performance impact, and maybe things that we have explicit mlr3 interfaces for (e.g. number of threads, even though that does have an impact on performance).
