Support for eval(validation) data #189
The following problem arose where one of the preprocessing steps was `embed::step_lencode_glm()` (generalized target encoding) and the model was `xgboost`.

From the documentation of `parsnip::xgb_train()`, it appears that user-supplied evaluation data cannot be used for early stopping. While the argument `validation` sets aside some validation (eval) data for early stopping, it's not clear whether the recipe is applied before or after the train and validation parts are split. How does this work?

It might be a good idea to support something like this:

where …
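One hypothetical shape for the kind of API being proposed (the `eval_data` engine argument and `my_eval_df` below are invented for illustration; neither exists in parsnip):

```r
# Purely hypothetical -- `eval_data` is NOT a real parsnip argument.
# The idea: supply an explicit, already-preprocessed evaluation set to
# the engine, instead of having it re-split the training data internally.
boost_tree() %>%
  set_engine("xgboost", eval_data = my_eval_df, early_stop = 10)
```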
Thanks for the issue!

The functionality that (I believe) you're looking for is available via engine arguments:

```r
some_model <-
  boost_tree() %>%
  set_engine("xgboost", validation = .2, early_stop = 10)
```

While fitting that workflow on some training set, a fraction of that data (here 20%) is set aside and used as the evaluation data for early stopping.

If this doesn't cover your use case, could you please provide a minimal reprex (reproducible example) demonstrating the functionality you'd like to see? Could you also clarify your notation "eval(validation)" and "validation(eval)"?
Simon, thanks for your response. The problem is right here:

Current flow: the recipe is prepped on the full training set, and the validation (eval) split is made afterwards, inside the model fit.

Expected flow: the validation (eval) data is split off first, the recipe is prepped on the remaining training data only, and then applied to both parts.

In the current flow there is a data leak, as the recipe learns from the combined dataset.
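A sketch of why the ordering matters for target encoding, assuming a hypothetical data frame `train` with outcome `y` and a categorical predictor `cat_var`:

```r
library(tidymodels)
library(embed)

# Hypothetical: `train` has outcome `y` and categorical predictor `cat_var`.
rec <- recipe(y ~ ., data = train) %>%
  step_lencode_glm(cat_var, outcome = vars(y))

# Current flow: prep() estimates the per-level encodings on ALL of
# `train`, including the rows that xgb_train() will later hold out via
# `validation = .2` -- so the eval rows have already leaked into the
# encoding estimates before the eval split is made.
prepped <- prep(rec, training = train)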
So there is the possibility that the validation set used inside of `xgb_train()` was preprocessed with estimates that also used those same rows. If that is a potential issue for your recipe, I would use a separate validation set that you split off yourself before the recipe is estimated.

The api that you suggest is difficult to implement, since the engine's fitting function only receives data that has already been preprocessed and has no way to re-apply the recipe to an internal split.
Max, thanks for your response.

Would you mind sharing some example code for achieving this?

PS: tidymodels offers a great system to build and use models.
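A minimal sketch of the split-first approach, assuming a hypothetical data frame `dat` with numeric outcome `y` and one categorical predictor `cat_var`:

```r
library(tidymodels)
library(embed)
library(xgboost)

set.seed(123)

# 1. Split BEFORE any preprocessing; the "testing" part here serves as
#    the evaluation set for early stopping.
split    <- initial_split(dat, prop = 0.8)
train_df <- training(split)
val_df   <- testing(split)

# 2. Prep the target-encoding recipe on the training rows only, so the
#    encodings never see the evaluation rows.
rec <- recipe(y ~ ., data = train_df) %>%
  step_lencode_glm(cat_var, outcome = vars(y)) %>%
  prep(training = train_df)

train_baked <- bake(rec, new_data = train_df)
val_baked   <- bake(rec, new_data = val_df)

# 3. Fit xgboost directly, using the baked evaluation set for early
#    stopping. Assumes all baked predictors are numeric; note that
#    `watchlist` is named `evals` in recent xgboost releases.
to_dmat <- function(df) {
  xgb.DMatrix(data = as.matrix(dplyr::select(df, -y)), label = df$y)
}

fit <- xgb.train(
  params                = list(objective = "reg:squarederror"),
  data                  = to_dmat(train_baked),
  nrounds               = 500,
  watchlist             = list(eval = to_dmat(val_baked)),
  early_stopping_rounds = 10
)
```

The tradeoff in this sketch is that it bypasses parsnip and workflows, so their prediction conveniences are lost; the gain is that the encoding is estimated strictly on the training rows, so the evaluation set driving early stopping stays leak-free.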