asmodee() is not (always) deterministic #35

stephaneghozzi · 2020-08-15T15:18:57Z

While looking at a special case, I found that outlier detection is not deterministic. In the code below, asmodee() is applied 100 times to the same time series, but only a fraction of the trials find outliers. (Actually, none should; see #36.) This fraction itself varies substantially between runs.

This is likely due to the way model selection is performed.

Code:

library(trendbreaker)

models <- list(
  poisson_constant = glm_model(count ~ 1, family='poisson'),
  regression = lm_model(count ~ date),
  negbin_time = glm_nb_model(count ~ date)
)

ts <- data.frame(
  date=1:42,
  count=c(2, 2, 2, 2, 1, 1, 2, 2, 2, 0, 1, 0, 0, 0, 1, 0, 1, 1, 2, 1, 1, 0, 1, 2, 1, 2, 2, 2, 0, 0,
    2, 3, 2, 1, 0, 1, 1, 0, 2, 3, 0, 7)
)

i_outlier <- c()
n_trials <- 100
for (j in 1:n_trials) {
  asmodee_res <- asmodee(
    ts,
    models = models,
    alpha = 0,
    max_k = 12,
    method = evaluate_aic
  )
  if (any(asmodee_res$results$outlier)) {
    i_outlier <- c(i_outlier, j)
  }
}
print('Proportion of trials with outliers:')
print(paste0(round(100*length(i_outlier)/n_trials),'%'))

Output:

[1] "Proportion of trials with outliers:"
[1] "32%"

(The percentage will vary from run to run.)
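
One possible way to narrow this down is to check whether resetting the random seed before each call makes the result reproducible. The following is only a minimal sketch in base R: `check_reproducible` and `noisy_flag` are hypothetical names that are not part of trendbreaker, and whether asmodee() draws on the random number generator at all is an assumption to be tested, not something established here.

```r
# Hypothetical helper (not part of trendbreaker): call `f` repeatedly and
# report whether every run returns the same value as the first run.
check_reproducible <- function(f, n_trials = 20, seed = NULL) {
  runs <- lapply(seq_len(n_trials), function(j) {
    if (!is.null(seed)) set.seed(seed)  # reset the RNG before each run
    f()
  })
  same <- vapply(
    runs,
    function(r) isTRUE(all.equal(r, runs[[1]])),
    logical(1)
  )
  all(same)
}

# Stand-in for a stochastic procedure. To test the reported behaviour,
# replace it with a function that wraps the asmodee() call shown above
# and returns its outlier flags.
noisy_flag <- function() rnorm(1) > 0

check_reproducible(noisy_flag, n_trials = 50)            # without a fixed seed
check_reproducible(noisy_flag, n_trials = 50, seed = 1)  # TRUE
```

If the wrapped call becomes reproducible once the seed is fixed, the variation would point to a random step somewhere in the procedure; if it still varies, the cause would have to lie elsewhere.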

stephaneghozzi mentioned this issue Aug 15, 2020

The upper and lower bounds are not (always) properly computed for alpha = 0 #36

Open
