Incorrect predictions from fit_resamples() when applied to lmer model #44

Open
a-difabio opened this issue Jun 20, 2022 · 2 comments
Labels: bug (an unexpected problem or unintended behavior)

Comments

@a-difabio
Contributor

a-difabio commented Jun 20, 2022

I am having trouble using tune::fit_resamples() on an lmer model (via the multilevelmod package).
In particular, when the predictions for the assessment set are calculated, the model does not seem to account for the different combinations of grouping levels.

I have included a reprex showing that the results of a predict() call on the lmer object differ from the predictions obtained from a fit_resamples() call (retrieved with collect_predictions()).

library(tidyverse)
library(tidymodels)
library(multilevelmod)

data(mpg, package = "ggplot2")

set.seed(123)

lmer_model = linear_reg() %>% 
  set_engine("lmer")

lmer_workflow = workflow() %>% 
  add_variables(outcomes = cty,
                predictors = c(year, manufacturer, model)) %>% 
  add_model(lmer_model, formula = cty ~ year + (1|manufacturer/model))

mpg_split = mpg %>% validation_split(prop = 3/4)

analysis = mpg_split$splits[[1]] %>% analysis()
assessment = mpg_split$splits[[1]] %>% assessment()

# using predict() on the assessment dataset works as expected
# (store the predictions first; plot() itself returns NULL)
predicted_via_workflow = lmer_workflow %>%
  fit(analysis) %>%
  extract_fit_engine() %>%
  predict(assessment)

plot(predicted_via_workflow)

# the predictions from the fit_resamples() function do not vary per group
# (store the predictions first; plot() itself returns NULL)
predicted_via_tune = lmer_workflow %>% 
  fit_resamples(mpg_split, control = control_resamples(allow_par = FALSE,
                                                       save_pred = TRUE)) %>% 
  collect_predictions() %>%
  pluck(".pred")

plot(predicted_via_tune)

Created on 2022-06-20 by the reprex package (v2.0.1)
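
A numeric, row-by-row comparison may make the discrepancy easier to inspect than two separate plots. The following is only a sketch built on the reprex above, not part of the original report: the names `engine_tbl` and `comparison` are made up for illustration, `allow.new.levels = TRUE` is added as a safeguard in case the assessment set contains grouping levels that are absent from the analysis set, and matching on `.row` assumes that `collect_predictions()` reports row indices of the original `mpg` data.

```r
library(tidymodels)
library(multilevelmod)

data(mpg, package = "ggplot2")
set.seed(123)

# Same workflow as in the reprex above
lmer_workflow <- workflow() %>%
  add_variables(outcomes = cty,
                predictors = c(year, manufacturer, model)) %>%
  add_model(linear_reg() %>% set_engine("lmer"),
            formula = cty ~ year + (1 | manufacturer / model))

mpg_split <- validation_split(mpg, prop = 3/4)
split <- mpg_split$splits[[1]]
analysis_set <- analysis(split)
assessment_set <- assessment(split)

# Predictions taken directly from the underlying lme4 fit
engine_fit <- lmer_workflow %>%
  fit(analysis_set) %>%
  extract_fit_engine()

engine_tbl <- tibble(
  # row indices of the assessment rows within the original data
  .row = as.integer(split, data = "assessment"),
  # allow.new.levels = TRUE is an added safeguard (not in the original reprex)
  .pred_engine = unname(predict(engine_fit, newdata = assessment_set,
                                allow.new.levels = TRUE))
)

# Predictions saved by fit_resamples()
tune_pred <- lmer_workflow %>%
  fit_resamples(mpg_split,
                control = control_resamples(allow_par = FALSE,
                                            save_pred = TRUE)) %>%
  collect_predictions()

# Align both sets of predictions on the original row index
comparison <- tune_pred %>%
  select(.row, .pred_tune = .pred) %>%
  inner_join(engine_tbl, by = ".row") %>%
  mutate(difference = .pred_engine - .pred_tune)

# Size of the discrepancy, and how many distinct values each approach yields
summary(comparison$difference)
n_distinct(round(comparison$.pred_tune, 6))
n_distinct(round(comparison$.pred_engine, 6))
```

If the two approaches agreed, `difference` would be zero (up to numerical noise) for every row, and both columns would show a similar number of distinct predicted values.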

@juliasilge juliasilge transferred this issue from tidymodels/tune Jun 20, 2022
@juliasilge
Member

Are you sure you have the latest version, which includes the fix #41 for #38? It went to CRAN on June 17.

@a-difabio
Contributor Author

I believe I am using the latest version of the package:

sessioninfo::package_info(pkgs = "multilevelmod", dependencies = FALSE)
#>  package       * version date (UTC) lib source
#>  multilevelmod   1.0.0   2022-06-17 [1] CRAN (R 4.2.0)

Created on 2022-06-21 by the reprex package (v2.0.1)

In fact, I think that before fix #41 this same code would have thrown an error without predicting anything, while now the fitted workflow can be used to predict new values.

@juliasilge juliasilge added the bug an unexpected problem or unintended behavior label Jun 21, 2022