[R] [QUESTION] I can't reproduce xgb.cv() with same specifications and seeds. #10732

Open
serkor1 opened this issue Aug 22, 2024 · 0 comments

serkor1 commented Aug 22, 2024

Hi,

Thank you for an amazing package. I have a question that may or may not be related to a bug somewhere in the source code (or it might just be me lacking knowledge about gradient boosting machines!).

I am running xgb.cv() twice with exactly the same specification, but the resulting evaluation_log differs between the two runs, even with the same folds, parameters and seed. My question is:

Shouldn't the evaluation_log be identical across the two runs?

Below is a reprex demonstrating the question (or possible bug).

reprex

# define data
data_input <- xgboost::xgb.DMatrix(
  data  = as.matrix(
    mtcars[,grep(pattern = "mpg",x = colnames(mtcars),invert = TRUE)]
  ),
  label = mtcars$mpg
)
# first run
set.seed(1903)
first_run <- xgboost::xgb.cv(
  params = list(
    booster = "gblinear"
  ),
  nrounds = 1,
  nfold   = 10,
  metrics = "rmse",
  data    = data_input,
  verbose = FALSE
)
# second run
set.seed(1903)
second_run <- xgboost::xgb.cv(
  params = list(
    booster = "gblinear"
  ),
  nrounds = 1,
  nfold   = 10,
  metrics = "rmse",
  data    = data_input,
  verbose = FALSE
)
# test for equality
setequal(
  first_run$folds,
  second_run$folds
)
#> [1] TRUE

setequal(
  first_run$params,
  second_run$params
)
#> [1] TRUE

setequal(
  first_run$evaluation_log,
  second_run$evaluation_log
)
#> [1] FALSE

Created on 2024-08-22 with reprex v2.1.1
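
A possible follow-up check, not part of the original reprex: the sketch below repeats the comparison with `updater = "coord_descent"` and `nthread = 1`. It rests on an unverified assumption, namely that the difference comes from the default `shotgun` updater of `gblinear`, which the XGBoost parameter documentation describes as nondeterministic. The helper name `run_cv` and the object names `det_params`, `first_det` and `second_det` are made up for illustration.

```r
# Hypothetical helper: run xgb.cv() with the same seed and settings as the
# reprex, optionally adding extra booster parameters.
run_cv <- function(data, extra_params = list()) {
  set.seed(1903)
  xgboost::xgb.cv(
    params  = c(list(booster = "gblinear"), extra_params),
    nrounds = 1,
    nfold   = 10,
    metrics = "rmse",
    data    = data,
    verbose = FALSE
  )
}

# Same data as in the reprex: all mtcars columns except mpg as features
data_input <- xgboost::xgb.DMatrix(
  data  = as.matrix(mtcars[, setdiff(colnames(mtcars), "mpg")]),
  label = mtcars$mpg
)

# Assumption: a deterministic updater on a single thread removes the
# run-to-run variation. This is a guess to test, not a confirmed cause.
det_params <- list(updater = "coord_descent", nthread = 1)
first_det  <- run_cv(data_input, det_params)
second_det <- run_cv(data_input, det_params)

# all.equal() returns TRUE or a description of the differences, so it also
# shows how large any remaining discrepancy is.
all.equal(
  as.data.frame(first_det$evaluation_log),
  as.data.frame(second_det$evaluation_log)
)
```

If the two logs agree under these settings but still differ with the defaults, that would point towards the updater or threading rather than the folds or the seed.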

Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.4.1 (2024-06-14)
#>  os       Zorin OS 17.1
#>  system   x86_64, linux-gnu
#>  ui       X11
#>  language en_US:en
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       Europe/Copenhagen
#>  date     2024-08-22
#>  pandoc   3.1.11 @ /usr/lib/rstudio/resources/app/bin/quarto/bin/tools/x86_64/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version date (UTC) lib source
#>  cli           3.6.3   2024-06-21 [1] CRAN (R 4.4.1)
#>  data.table    1.15.4  2024-03-30 [1] CRAN (R 4.4.0)
#>  digest        0.6.37  2024-08-19 [1] CRAN (R 4.4.1)
#>  evaluate      0.24.0  2024-06-10 [1] CRAN (R 4.4.0)
#>  fastmap       1.2.0   2024-05-15 [1] CRAN (R 4.4.0)
#>  fs            1.6.4   2024-04-25 [1] CRAN (R 4.4.0)
#>  glue          1.7.0   2024-01-09 [1] CRAN (R 4.4.0)
#>  htmltools     0.5.8.1 2024-04-04 [1] CRAN (R 4.4.0)
#>  jsonlite      1.8.8   2023-12-04 [1] CRAN (R 4.4.0)
#>  knitr         1.48    2024-07-07 [1] CRAN (R 4.4.1)
#>  lattice       0.22-6  2024-03-20 [4] CRAN (R 4.4.0)
#>  lifecycle     1.0.4   2023-11-07 [1] CRAN (R 4.4.0)
#>  Matrix        1.7-0   2024-04-26 [4] CRAN (R 4.4.0)
#>  reprex        2.1.1   2024-07-06 [1] CRAN (R 4.4.1)
#>  rlang         1.1.4   2024-06-04 [1] CRAN (R 4.4.0)
#>  rmarkdown     2.28    2024-08-17 [1] CRAN (R 4.4.1)
#>  rstudioapi    0.16.0  2024-03-24 [3] CRAN (R 4.4.0)
#>  sessioninfo   1.2.2   2021-12-06 [1] CRAN (R 4.4.0)
#>  withr         3.0.1   2024-07-31 [1] CRAN (R 4.4.1)
#>  xfun          0.47    2024-08-17 [1] CRAN (R 4.4.1)
#>  xgboost       1.7.8.1 2024-07-24 [1] CRAN (R 4.4.1)
#>  yaml          2.3.10  2024-07-26 [1] CRAN (R 4.4.1)
#> 
#>  [1] /home/serkan/R/x86_64-pc-linux-gnu-library/4.4
#>  [2] /usr/local/lib/R/site-library
#>  [3] /usr/lib/R/site-library
#>  [4] /usr/lib/R/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────

Best,
