Misalignment of Date Features when fiscal_year_start != 1 #157

Open
AndrewKostandy opened this issue Apr 4, 2024 · 1 comment
Labels: bug (Something isn't working)

AndrewKostandy commented Apr 4, 2024

Hi,

Thank you for your work on this package. I'm running into an issue with the generated Date features when fiscal_year_start is set to a value other than 1. For example, with a fiscal_year_start value of 11 below, I would expect the Date_half and Date_quarter values to be 1 for November. Date_month could be either 11 or 1 (understandable either way), but Date_month.lbl should be "November". Note that October and November (as identified in the Date column) should fall in different quarters here (Q4 and Q1 respectively). However, they're both in Q3.
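To make the expectation above concrete, here is a minimal base R sketch of the mapping I have in mind. `fiscal_features()` is a hypothetical helper written only for this illustration; it is not part of finnts and says nothing about how finnts builds these columns. It uses the "Date_month = 1 for the starting month" option and leaves out Date_year, since I haven't stated a convention for it.

```r
# Hypothetical helper (not part of finnts): the fiscal features I would
# expect for a given date and fiscal_year_start.
fiscal_features <- function(date, fiscal_year_start = 1) {
  cal_month <- as.integer(format(date, "%m"))
  # Position of the month within the fiscal year (1 = starting month)
  fiscal_month <- (cal_month - fiscal_year_start) %% 12 + 1
  data.frame(
    Date = date,
    Date_half = (fiscal_month - 1) %/% 6 + 1,
    Date_quarter = (fiscal_month - 1) %/% 3 + 1,
    Date_month = fiscal_month,
    # The label always follows the calendar month of the Date column
    Date_month.lbl = month.name[cal_month]
  )
}

dates <- as.Date(c("2020-10-01", "2020-11-01", "2021-01-01", "2021-02-01"))
expected <- fiscal_features(dates, fiscal_year_start = 11)
expected
# Date_quarter is 4, 1, 1, 2: October closes Q4 and November opens Q1.
```

With this mapping, October and November land in different quarters and the label stays tied to the calendar month.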

library(tidyverse)
library(finnts)
#> Loading required package: modeltime

df <- tibble(
  Date = seq.Date(from = ymd("2020-11-01"), to = ymd("2023-10-01"), by = "month"),
  y = rnorm(36, 100, 5),
  x = rnorm(36, 100, 5),
  id = "y"
)

run_info <- set_run_info(
  experiment_name = "Run 1",
  run_name = "Date Extraction Check"
)
#> Finn Submission Info
#> • Experiment Name: Run 1
#> • Run Name: Date Extraction Check-20240404T110928Z
#> 

prep_data(
  run_info = run_info,
  input_data = df,
  combo_variables = "id",
  target_variable = "y",
  date_type = "month",
  forecast_horizon = 1,
  external_regressors = "x",
  hist_start_date = min(df$Date),
  hist_end_date = max(df$Date),
  fiscal_year_start = 11
)
#> ℹ Prepping Data
#> ✔ Prepping Data [1.8s]
#> 

df_r1_fiscal_11 <- get_prepped_data(run_info = run_info, recipe = "R1") |>
  select(Date, Date_year, Date_half, Date_quarter, Date_month, Date_month.lbl)

df_r1_fiscal_11
#> # A tibble: 37 × 6
#>    Date       Date_year Date_half Date_quarter Date_month Date_month.lbl
#>    <date>         <dbl>     <dbl>        <dbl>      <dbl> <chr>         
#>  1 2020-11-01      2021         2            3          9 September     
#>  2 2020-12-01      2021         2            4         10 October       
#>  3 2021-01-01      2021         2            4         11 November      
#>  4 2021-02-01      2021         2            4         12 December      
#>  5 2021-03-01      2022         1            1          1 January       
#>  6 2021-04-01      2022         1            1          2 February      
#>  7 2021-05-01      2022         1            1          3 March         
#>  8 2021-06-01      2022         1            2          4 April         
#>  9 2021-07-01      2022         1            2          5 May           
#> 10 2021-08-01      2022         1            2          6 June          
#> # ℹ 27 more rows

df_r2_fiscal_11 <- get_prepped_data(run_info = run_info, recipe = "R2") |>
  select(Date, Date_year, Date_half, Date_quarter, Date_month, Date_month.lbl)

df_r2_fiscal_11
#> # A tibble: 37 × 6
#>    Date       Date_year Date_half Date_quarter Date_month Date_month.lbl
#>    <date>         <dbl>     <dbl>        <dbl>      <dbl> <chr>         
#>  1 2020-11-01      2021         2            3          9 September     
#>  2 2020-12-01      2021         2            4         10 October       
#>  3 2021-01-01      2021         2            4         11 November      
#>  4 2021-02-01      2021         2            4         12 December      
#>  5 2021-03-01      2022         1            1          1 January       
#>  6 2021-04-01      2022         1            1          2 February      
#>  7 2021-05-01      2022         1            1          3 March         
#>  8 2021-06-01      2022         1            2          4 April         
#>  9 2021-07-01      2022         1            2          5 May           
#> 10 2021-08-01      2022         1            2          6 June          
#> # ℹ 27 more rows

To get the Date_half and Date_quarter values I need when my fiscal year actually starts in November, I have to set fiscal_year_start to 3. With that setting, Date_half and Date_quarter are 1 for November, December, and January, which is correct for a fiscal year starting in November. However, Date_month.lbl for November is "January", which is still incorrect:

run_info <- set_run_info(
  experiment_name = "Run 1",
  run_name = "Date Extraction Check"
)
#> Finn Submission Info
#> • Experiment Name: Run 1
#> • Run Name: Date Extraction Check-20240404T110930Z
#> 

prep_data(
  run_info = run_info,
  input_data = df,
  combo_variables = "id",
  target_variable = "y",
  date_type = "month",
  forecast_horizon = 1,
  external_regressors = "x",
  hist_start_date = min(df$Date),
  hist_end_date = max(df$Date),
  fiscal_year_start = 3
)
#> ℹ Prepping Data
#> ✔ Prepping Data [793ms]
#> 

df_r1_fiscal_3 <- get_prepped_data(run_info = run_info, recipe = "R1") |>
  select(Date, Date_year, Date_half, Date_quarter, Date_month, Date_month.lbl)

df_r1_fiscal_3
#> # A tibble: 37 × 6
#>    Date       Date_year Date_half Date_quarter Date_month Date_month.lbl
#>    <date>         <dbl>     <dbl>        <dbl>      <dbl> <chr>         
#>  1 2020-11-01      2021         1            1          1 January       
#>  2 2020-12-01      2021         1            1          2 February      
#>  3 2021-01-01      2021         1            1          3 March         
#>  4 2021-02-01      2021         1            2          4 April         
#>  5 2021-03-01      2021         1            2          5 May           
#>  6 2021-04-01      2021         1            2          6 June          
#>  7 2021-05-01      2021         2            3          7 July          
#>  8 2021-06-01      2021         2            3          8 August        
#>  9 2021-07-01      2021         2            3          9 September     
#> 10 2021-08-01      2021         2            4         10 October       
#> # ℹ 27 more rows

df_r2_fiscal_3 <- get_prepped_data(run_info = run_info, recipe = "R2") |>
  select(Date, Date_year, Date_half, Date_quarter, Date_month, Date_month.lbl)

df_r2_fiscal_3
#> # A tibble: 37 × 6
#>    Date       Date_year Date_half Date_quarter Date_month Date_month.lbl
#>    <date>         <dbl>     <dbl>        <dbl>      <dbl> <chr>         
#>  1 2020-11-01      2021         1            1          1 January       
#>  2 2020-12-01      2021         1            1          2 February      
#>  3 2021-01-01      2021         1            1          3 March         
#>  4 2021-02-01      2021         1            2          4 April         
#>  5 2021-03-01      2021         1            2          5 May           
#>  6 2021-04-01      2021         1            2          6 June          
#>  7 2021-05-01      2021         2            3          7 July          
#>  8 2021-06-01      2021         2            3          8 August        
#>  9 2021-07-01      2021         2            3          9 September     
#> 10 2021-08-01      2021         2            4         10 October       
#> # ℹ 27 more rows
Created on 2024-04-04 with [reprex v2.1.0](https://reprex.tidyverse.org/)
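
One reading that is consistent with every row printed above: the features look like ordinary calendar features taken after moving each date forward by fiscal_year_start - 1 months, rather than backward. This is only an inference from the output, not a statement about finnts internals. `shifted_features()` and `workaround_start()` below are hypothetical helpers for checking that reading.

```r
# Hypothetical helper: calendar features after moving each date by `offset` months.
shifted_features <- function(date, offset) {
  cal_month <- as.integer(format(date, "%m"))
  cal_year <- as.integer(format(date, "%Y"))
  idx <- cal_year * 12 + (cal_month - 1) + offset # running month index
  m <- idx %% 12 + 1                              # shifted calendar month
  data.frame(
    Date = date,
    Date_year = idx %/% 12,
    Date_half = (m - 1) %/% 6 + 1,
    Date_quarter = (m - 1) %/% 3 + 1,
    Date_month = m,
    Date_month.lbl = month.name[m]
  )
}

# Rows 1, 4, 5 and 10 of the tibbles above
dates <- as.Date(c("2020-11-01", "2021-02-01", "2021-03-01", "2021-08-01"))

# Assumption: a forward shift of fiscal_year_start - 1 months
observed_11 <- shifted_features(dates, offset = 11 - 1)
observed_3 <- shifted_features(dates, offset = 3 - 1)

# Under the same assumption, the setting whose forward shift equals the
# intended backward shift (modulo 12 months)
workaround_start <- function(fiscal_year_start) (1 - fiscal_year_start) %% 12 + 1
workaround_start(11)
```

If this reading holds, it would explain why 3 gives the right halves and quarters for a November start (a 2-month forward shift and a 10-month backward shift agree modulo 12), and why Date_month.lbl is still wrong: the label is taken from the shifted date instead of the original one.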
Session Info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.3.3 (2024-02-29)
#>  os       macOS Sonoma 14.4.1
#>  system   x86_64, darwin20
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       America/Toronto
#>  date     2024-04-04
#>  pandoc   3.1.1 @ /Applications/RStudio.app/Contents/Resources/app/quarto/bin/tools/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package        * version    date (UTC) lib source
#>  anytime          0.3.9      2020-08-27 [1] CRAN (R 4.3.0)
#>  bit              4.0.5      2022-11-15 [1] CRAN (R 4.3.0)
#>  bit64            4.0.5      2020-08-30 [1] CRAN (R 4.3.0)
#>  class            7.3-22     2023-05-03 [1] CRAN (R 4.3.3)
#>  cli            * 3.6.2      2023-12-11 [1] CRAN (R 4.3.0)
#>  codetools        0.2-20     2024-03-31 [1] CRAN (R 4.3.2)
#>  colorspace       2.1-0      2023-01-23 [1] CRAN (R 4.3.0)
#>  crayon           1.5.2      2022-09-29 [1] CRAN (R 4.3.0)
#>  Cubist         * 0.4.2.1    2023-03-09 [1] CRAN (R 4.3.0)
#>  curl             5.2.1      2024-03-01 [1] CRAN (R 4.3.2)
#>  data.table       1.15.4     2024-03-30 [1] CRAN (R 4.3.2)
#>  dials          * 1.2.1      2024-02-22 [1] CRAN (R 4.3.2)
#>  DiceDesign       1.10       2023-12-07 [1] CRAN (R 4.3.0)
#>  digest         * 0.6.35     2024-03-11 [1] CRAN (R 4.3.2)
#>  distributional   0.4.0      2024-02-07 [1] CRAN (R 4.3.2)
#>  doParallel     * 1.0.17     2022-02-07 [1] CRAN (R 4.3.0)
#>  dplyr          * 1.1.4      2023-11-17 [1] CRAN (R 4.3.0)
#>  earth          * 5.3.3      2024-02-26 [1] CRAN (R 4.3.2)
#>  ellipsis         0.3.2      2021-04-29 [1] CRAN (R 4.3.0)
#>  evaluate         0.23       2023-11-01 [1] CRAN (R 4.3.1)
#>  fabletools       0.4.1      2024-03-02 [1] CRAN (R 4.3.2)
#>  fansi            1.0.6      2023-12-08 [1] CRAN (R 4.3.0)
#>  fastmap          1.1.1      2023-02-24 [1] CRAN (R 4.3.0)
#>  feasts           0.3.2      2024-03-15 [1] CRAN (R 4.3.2)
#>  finnts         * 0.4.0      2023-12-01 [1] CRAN (R 4.3.0)
#>  forcats        * 1.0.0      2023-01-29 [1] CRAN (R 4.3.0)
#>  foreach        * 1.5.2      2022-02-02 [1] CRAN (R 4.3.0)
#>  forecast       * 8.22.0     2024-03-04 [1] CRAN (R 4.3.2)
#>  Formula        * 1.2-5      2023-02-24 [1] CRAN (R 4.3.0)
#>  fracdiff         1.5-3      2024-02-01 [1] CRAN (R 4.3.2)
#>  fs             * 1.6.3      2023-07-20 [1] CRAN (R 4.3.0)
#>  furrr            0.3.1      2022-08-15 [1] CRAN (R 4.3.0)
#>  future           1.33.2     2024-03-26 [1] CRAN (R 4.3.2)
#>  future.apply     1.11.2     2024-03-28 [1] CRAN (R 4.3.2)
#>  generics         0.1.3      2022-07-05 [1] CRAN (R 4.3.0)
#>  ggplot2        * 3.5.0      2024-02-23 [1] CRAN (R 4.3.2)
#>  glmnet         * 4.1-8      2023-08-22 [1] CRAN (R 4.3.0)
#>  globals          0.16.3     2024-03-08 [1] CRAN (R 4.3.2)
#>  glue             1.7.0      2024-01-09 [1] CRAN (R 4.3.0)
#>  gower            1.0.1      2022-12-22 [1] CRAN (R 4.3.0)
#>  GPfit            1.0-8      2019-02-08 [1] CRAN (R 4.3.0)
#>  gtable           0.3.4      2023-08-21 [1] CRAN (R 4.3.0)
#>  hardhat          1.3.1      2024-02-02 [1] CRAN (R 4.3.2)
#>  hms              1.1.3      2023-03-21 [1] CRAN (R 4.3.0)
#>  htmltools        0.5.8      2024-03-25 [1] CRAN (R 4.3.2)
#>  hts            * 6.0.2      2021-05-30 [1] CRAN (R 4.3.0)
#>  ipred            0.9-14     2023-03-09 [1] CRAN (R 4.3.0)
#>  iterators      * 1.0.14     2022-02-05 [1] CRAN (R 4.3.0)
#>  kernlab        * 0.9-32     2023-01-31 [1] CRAN (R 4.3.0)
#>  knitr            1.45       2023-10-30 [1] CRAN (R 4.3.1)
#>  lattice        * 0.22-6     2024-03-20 [1] CRAN (R 4.3.2)
#>  lava             1.8.0      2024-03-05 [1] CRAN (R 4.3.2)
#>  lhs              1.1.6      2022-12-17 [1] CRAN (R 4.3.0)
#>  lifecycle        1.0.4      2023-11-07 [1] CRAN (R 4.3.1)
#>  listenv          0.9.1      2024-01-29 [1] CRAN (R 4.3.2)
#>  lmtest           0.9-40     2022-03-21 [1] CRAN (R 4.3.0)
#>  lubridate      * 1.9.3      2023-09-27 [1] CRAN (R 4.3.0)
#>  magrittr         2.0.3      2022-03-30 [1] CRAN (R 4.3.0)
#>  MASS             7.3-60.0.1 2024-01-13 [1] CRAN (R 4.3.3)
#>  Matrix         * 1.6-5      2024-01-11 [1] CRAN (R 4.3.3)
#>  modeltime      * 1.2.8      2023-09-02 [1] CRAN (R 4.3.0)
#>  munsell          0.5.1      2024-04-01 [1] CRAN (R 4.3.2)
#>  nlme             3.1-164    2023-11-27 [1] CRAN (R 4.3.3)
#>  nnet             7.3-19     2023-05-03 [1] CRAN (R 4.3.3)
#>  padr             0.6.2      2022-11-23 [1] CRAN (R 4.3.0)
#>  parallelly       1.37.1     2024-02-29 [1] CRAN (R 4.3.2)
#>  parsnip        * 1.2.1      2024-03-22 [1] CRAN (R 4.3.2)
#>  pillar           1.9.0      2023-03-22 [1] CRAN (R 4.3.0)
#>  pkgconfig        2.0.3      2019-09-22 [1] CRAN (R 4.3.0)
#>  plotmo         * 3.6.3      2024-02-26 [1] CRAN (R 4.3.2)
#>  plotrix        * 3.8-4      2023-11-10 [1] CRAN (R 4.3.1)
#>  plyr             1.8.9      2023-10-02 [1] CRAN (R 4.3.0)
#>  prodlim          2023.08.28 2023-08-28 [1] CRAN (R 4.3.0)
#>  purrr          * 1.0.2      2023-08-10 [1] CRAN (R 4.3.0)
#>  quadprog         1.5-8      2019-11-20 [1] CRAN (R 4.3.0)
#>  quantmod         0.4.26     2024-02-14 [1] CRAN (R 4.3.2)
#>  R.cache          0.16.0     2022-07-21 [1] CRAN (R 4.3.0)
#>  R.methodsS3      1.8.2      2022-06-13 [1] CRAN (R 4.3.0)
#>  R.oo             1.26.0     2024-01-24 [1] CRAN (R 4.3.2)
#>  R.utils          2.12.3     2023-11-18 [1] CRAN (R 4.3.0)
#>  R6               2.5.1      2021-08-19 [1] CRAN (R 4.3.0)
#>  Rcpp             1.0.12     2024-01-09 [1] CRAN (R 4.3.2)
#>  RcppParallel     5.1.7      2023-02-27 [1] CRAN (R 4.3.0)
#>  readr          * 2.1.5      2024-01-10 [1] CRAN (R 4.3.0)
#>  recipes        * 1.0.10     2024-02-18 [1] CRAN (R 4.3.2)
#>  reprex           2.1.0      2024-01-11 [1] CRAN (R 4.3.0)
#>  reshape2         1.4.4      2020-04-09 [1] CRAN (R 4.3.0)
#>  rlang            1.1.3      2024-01-10 [1] CRAN (R 4.3.0)
#>  rmarkdown        2.26       2024-03-05 [1] CRAN (R 4.3.2)
#>  rpart            4.1.23     2023-12-05 [1] CRAN (R 4.3.3)
#>  rsample          1.2.1      2024-03-25 [1] CRAN (R 4.3.2)
#>  rstudioapi       0.16.0     2024-03-24 [1] CRAN (R 4.3.2)
#>  rules          * 1.0.2      2023-03-08 [1] CRAN (R 4.3.0)
#>  scales         * 1.3.0      2023-11-28 [1] CRAN (R 4.3.1)
#>  sessioninfo      1.2.2      2021-12-06 [1] CRAN (R 4.3.0)
#>  shape            1.4.6.1    2024-02-23 [1] CRAN (R 4.3.2)
#>  slider           0.3.1      2023-10-12 [1] CRAN (R 4.3.1)
#>  SparseM          1.81       2021-02-18 [1] CRAN (R 4.3.0)
#>  StanHeaders      2.32.6     2024-03-01 [1] CRAN (R 4.3.2)
#>  stringi          1.8.3      2023-12-11 [1] CRAN (R 4.3.1)
#>  stringr        * 1.5.1      2023-11-14 [1] CRAN (R 4.3.1)
#>  styler           1.10.2     2023-08-29 [1] CRAN (R 4.3.0)
#>  survival         3.5-8      2024-02-14 [1] CRAN (R 4.3.3)
#>  tibble         * 3.2.1      2023-03-20 [1] CRAN (R 4.3.0)
#>  tidyr          * 1.3.1      2024-01-24 [1] CRAN (R 4.3.2)
#>  tidyselect     * 1.2.1      2024-03-11 [1] CRAN (R 4.3.2)
#>  tidyverse      * 2.0.0      2023-02-22 [1] CRAN (R 4.3.0)
#>  timechange       0.3.0      2024-01-18 [1] CRAN (R 4.3.0)
#>  timeDate         4032.109   2023-12-14 [1] CRAN (R 4.3.1)
#>  timetk         * 2.9.0      2023-10-31 [1] CRAN (R 4.3.0)
#>  tseries          0.10-55    2023-12-06 [1] CRAN (R 4.3.0)
#>  tsibble          1.1.4      2024-01-29 [1] CRAN (R 4.3.2)
#>  TTR              0.24.4     2023-11-28 [1] CRAN (R 4.3.1)
#>  tune           * 1.2.0      2024-03-20 [1] CRAN (R 4.3.2)
#>  tzdb             0.4.0      2023-05-12 [1] CRAN (R 4.3.0)
#>  urca             1.3-3      2022-08-29 [1] CRAN (R 4.3.0)
#>  utf8             1.2.4      2023-10-22 [1] CRAN (R 4.3.1)
#>  vctrs            0.6.5      2023-12-01 [1] CRAN (R 4.3.1)
#>  vroom          * 1.6.5      2023-12-05 [1] CRAN (R 4.3.0)
#>  warp             0.2.1      2023-11-02 [1] CRAN (R 4.3.1)
#>  withr            3.0.0      2024-01-16 [1] CRAN (R 4.3.0)
#>  workflows      * 1.1.4      2024-02-19 [1] CRAN (R 4.3.2)
#>  xfun             0.43       2024-03-25 [1] CRAN (R 4.3.2)
#>  xts              0.13.2     2024-01-21 [1] CRAN (R 4.3.0)
#>  yaml             2.3.8      2023-12-11 [1] CRAN (R 4.3.1)
#>  yardstick        1.3.1      2024-03-21 [1] CRAN (R 4.3.3)
#>  zoo              1.8-12     2023-04-13 [1] CRAN (R 4.3.0)
#> 
#>  [1] /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────

mitokic (Collaborator) commented Apr 10, 2024

Hey @AndrewKostandy, thanks for logging this. Let me take a deeper look and get back to you.

mitokic self-assigned this Apr 10, 2024
mitokic added the bug label Apr 10, 2024