Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

loglike_to_density is wrong #11

Closed
annecori opened this issue May 29, 2019 · 10 comments
Closed

loglike_to_density is wrong #11

annecori opened this issue May 29, 2019 · 10 comments

Comments

@annecori
Copy link
Collaborator

line 60:
out <- x / abs(max(x, na.rm = TRUE))
should not be there I think (you can't renormalise like this on the log scale as this is equivalent to raising to a power on the natural scale - this distorts your likelihood profile massively and gives a wrong impression of the likelihood landscape - this will be important to fix if we want to use this to compute confidence intervals on R as suggested in issue #10

@zkamvar
Copy link
Member

zkamvar commented May 29, 2019

Note: related to #7

@thibautjombart
Copy link
Contributor

In the current version I see something different, but I suspect the problem remains:

loglike_to_density <- function(x) {
  out <- x
  x_max <- max(x, na.rm = TRUE)
  if (x_max != 0) {
    out <- out / abs(x_max)
  }
  out <- exp(out)
  out <- out / sum(out)
  out
}

https://github.com/reconhub/earlyR/blob/master/R/internals.R#L56-L68

Any suggestion welcome.

@annecori
Copy link
Collaborator Author

loglike_to_density <- function(x) {
  out <- exp(x)
  out <- out / sum(out)
  out
}

@thibautjombart
Copy link
Contributor

Thanks for getting involved - really appreciate :)
I think this was the original version. I seem to remember that outright exp(x) was giving 0 for really low log-like, but that was a numeric issue.

@zkamvar
Copy link
Member

zkamvar commented May 29, 2019

A bit of an update and reproducible example that @annecori gave me:

This is what the plot should look like:

This is what it currently looks like:

library(incidence)
library(earlyR)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

data  <- outbreaks::ebola_sim
params <- list(shape = 1.88822148063956, scale = 4.5613778865727)
si     <- distcrete::distcrete("gamma", 
                               shape = params$shape, 
                               scale = params$scale, 
                               interval = 1L, w = 0L)
threshold_date <- as.Date("2014-07-01")

linelist <- data$linelist[data$linelist$date_of_hospitalisation <= threshold_date, ]
# remove any outcome dates that are after present
linelist$date_of_outcome[linelist$date_of_outcome > threshold_date] <- NA


# clean dates
mistakes <- which(linelist$date_of_onset <= linelist$date_of_infection)
linelist_clean <- linelist[-mistakes, ]

# compute incidence
i <- incidence(linelist_clean$date_of_onset,
               last_date = as.Date(max(linelist_clean$date_of_hospitalisation, na.rm = TRUE)))
# truncate to avoid right censoring issue
min_date <- min(i$dates)
max_date <- max(i$dates)
i_trunc <- subset(i,
                  from = min_date,
                  to = max_date - 2*7) # remove last two weeks of data
i_trunc
#> <incidence object>
#> [127 cases from days 2014-04-07 to 2014-06-17]
#> 
#> $counts: matrix with 72 rows and 1 columns
#> $n: 127 cases in total
#> $dates: 72 dates marking the left-side of bins
#> $interval: 1 day
#> $timespan: 72 days
#> $cumulative: FALSE

r <- earlyR::get_R(i_trunc, si = si)
#> Registered S3 methods overwritten by 'ggplot2':
#>   method         from 
#>   [.quosures     rlang
#>   c.quosures     rlang
#>   print.quosures rlang
r
#> 
#> /// Early estimate of reproduction number (R) //
#>  // class: earlyR, list
#> 
#>  // Maximum-Likelihood estimate of R ($R_ml):
#> [1] 1.241241
#> 
#> 
#>  // $lambda:
#>   NA 0.08460694 0.09182106 0.09229611 0.08865351 0.08263049...
#> 
#>  // $dates:
#> [1] "2014-04-07" "2014-04-08" "2014-04-09" "2014-04-10" "2014-04-11"
#> [6] "2014-04-12"
#> ...
#> 
#>  // $si (serial interval):
#> A discrete distribution
#>   name: gamma
#>   parameters:
#>     shape: 1.88822148063956
#>     scale: 4.5613778865727
plot(r)

Created on 2019-05-29 by the reprex package (v0.3.0)

Session info
devtools::session_info()
#> ─ Session info ──────────────────────────────────────────────────────────
#>  setting  value                       
#>  version  R version 3.6.0 (2019-04-26)
#>  os       Ubuntu 18.04.2 LTS          
#>  system   x86_64, linux-gnu           
#>  ui       X11                         
#>  language en_US:en                    
#>  collate  en_US.UTF-8                 
#>  ctype    en_US.UTF-8                 
#>  tz       Europe/London               
#>  date     2019-05-29                  
#> 
#> ─ Packages ──────────────────────────────────────────────────────────────
#>  package         * version  date       lib
#>  assertthat        0.2.1    2019-03-21 [1]
#>  backports         1.1.4    2019-04-10 [1]
#>  callr             3.2.0    2019-03-15 [1]
#>  cli               1.1.0    2019-03-19 [1]
#>  coarseDataTools   0.6-3    2019-05-29 [1]
#>  coda              0.19-2   2018-10-08 [1]
#>  colorspace        1.4-1    2019-03-18 [1]
#>  crayon            1.3.4    2017-09-16 [1]
#>  curl              3.3      2019-01-10 [1]
#>  data.table        1.12.2   2019-04-07 [1]
#>  desc              1.2.0    2019-04-12 [1]
#>  devtools          2.0.2    2019-04-08 [1]
#>  digest            0.6.19   2019-05-20 [1]
#>  distcrete         1.0.3    2018-10-16 [1]
#>  dplyr           * 0.8.1    2019-05-14 [1]
#>  earlyR          * 0.0.1    2019-05-29 [1]
#>  EpiEstim          2.1-0    2019-05-29 [1]
#>  evaluate          0.13     2019-02-12 [1]
#>  fitdistrplus      1.0-14   2019-01-23 [1]
#>  fs                1.3.1    2019-05-06 [1]
#>  ggplot2           3.1.1    2019-04-07 [1]
#>  glue              1.3.1    2019-03-12 [1]
#>  gridExtra         2.3      2017-09-09 [1]
#>  gtable            0.3.0    2019-03-25 [1]
#>  highr             0.8      2019-03-20 [1]
#>  htmltools         0.3.6    2017-04-28 [1]
#>  htmlwidgets       1.3      2018-09-30 [1]
#>  httr              1.4.0    2018-12-11 [1]
#>  incidence       * 1.7.0    2019-05-28 [1]
#>  jsonlite          1.6      2018-12-07 [1]
#>  knitr             1.23     2019-05-18 [1]
#>  labeling          0.3      2014-08-23 [1]
#>  lattice           0.20-38  2018-11-04 [1]
#>  lazyeval          0.2.2    2019-03-15 [1]
#>  lsei              1.2-0    2017-10-23 [1]
#>  magrittr          1.5      2014-11-22 [1]
#>  MASS              7.3-51.4 2019-04-26 [1]
#>  Matrix            1.2-17   2019-03-22 [1]
#>  MatrixModels      0.4-1    2015-08-22 [1]
#>  mcmc              0.9-6    2019-03-10 [1]
#>  MCMCpack          1.4-4    2018-09-14 [1]
#>  memoise           1.1.0    2017-04-21 [1]
#>  mime              0.6      2018-10-05 [1]
#>  munsell           0.5.0    2018-06-12 [1]
#>  npsurv            0.4-0    2017-10-14 [1]
#>  nvimcom         * 0.9-82   2019-05-28 [1]
#>  outbreaks         1.5.0    2019-01-21 [1]
#>  pillar            1.4.1    2019-05-28 [1]
#>  pkgbuild          1.0.3    2019-03-20 [1]
#>  pkgconfig         2.0.2    2018-08-16 [1]
#>  pkgload           1.0.2    2018-10-29 [1]
#>  plotly            4.9.0    2019-04-10 [1]
#>  plyr              1.8.4    2016-06-08 [1]
#>  prettyunits       1.0.2    2015-07-13 [1]
#>  processx          3.3.1    2019-05-08 [1]
#>  ps                1.3.0    2018-12-21 [1]
#>  purrr             0.3.2    2019-03-15 [1]
#>  quantreg          5.38     2018-12-18 [1]
#>  R6                2.4.0    2019-02-14 [1]
#>  Rcpp              1.0.1    2019-03-17 [1]
#>  remotes           2.0.4    2019-04-10 [1]
#>  reshape2          1.4.3    2017-12-11 [1]
#>  rlang             0.3.4    2019-04-07 [1]
#>  rmarkdown         1.13.1   2019-05-24 [1]
#>  rprojroot         1.3-2    2018-01-03 [1]
#>  scales            1.0.0    2018-08-09 [1]
#>  sessioninfo       1.1.1    2018-11-05 [1]
#>  SparseM           1.77     2017-04-23 [1]
#>  stringi           1.4.3    2019-03-12 [1]
#>  stringr           1.4.0    2019-02-10 [1]
#>  survival          2.44-1.1 2019-04-01 [1]
#>  testthat          2.1.1    2019-04-23 [1]
#>  tibble            2.1.1    2019-03-16 [1]
#>  tidyr             0.8.3    2019-03-01 [1]
#>  tidyselect        0.2.5    2018-10-11 [1]
#>  usethis           1.5.0    2019-04-07 [1]
#>  viridisLite       0.3.0    2018-02-01 [1]
#>  withr             2.1.2    2018-03-15 [1]
#>  xfun              0.7      2019-05-14 [1]
#>  xml2              1.2.0    2018-01-24 [1]
#>  yaml              2.2.0    2018-07-25 [1]
#>  source                                    
#>  CRAN (R 3.5.3)                            
#>  CRAN (R 3.5.3)                            
#>  CRAN (R 3.5.3)                            
#>  CRAN (R 3.5.3)                            
#>  Github (nickreich/coarseDataTools@bd41515)
#>  CRAN (R 3.5.1)                            
#>  CRAN (R 3.5.3)                            
#>  CRAN (R 3.5.1)                            
#>  CRAN (R 3.5.2)                            
#>  CRAN (R 3.5.3)                            
#>  Github (r-lib/desc@c860e7b)               
#>  CRAN (R 3.5.3)                            
#>  CRAN (R 3.6.0)                            
#>  Github (reconhub/distcrete@a35ea1c)       
#>  CRAN (R 3.6.0)                            
#>  local                                     
#>  Github (annecori/EpiEstim@7d40276)        
#>  CRAN (R 3.5.2)                            
#>  CRAN (R 3.6.0)                            
#>  CRAN (R 3.6.0)                            
#>  CRAN (R 3.5.3)                            
#>  CRAN (R 3.5.3)                            
#>  CRAN (R 3.5.1)                            
#>  CRAN (R 3.5.3)                            
#>  CRAN (R 3.5.3)                            
#>  CRAN (R 3.5.1)                            
#>  CRAN (R 3.5.1)                            
#>  CRAN (R 3.5.2)                            
#>  Github (reconhub/incidence@51269f1)       
#>  CRAN (R 3.5.1)                            
#>  CRAN (R 3.6.0)                            
#>  CRAN (R 3.5.1)                            
#>  CRAN (R 3.5.1)                            
#>  CRAN (R 3.5.3)                            
#>  CRAN (R 3.6.0)                            
#>  CRAN (R 3.5.1)                            
#>  CRAN (R 3.6.0)                            
#>  CRAN (R 3.5.3)                            
#>  CRAN (R 3.6.0)                            
#>  CRAN (R 3.6.0)                            
#>  CRAN (R 3.6.0)                            
#>  CRAN (R 3.5.1)                            
#>  CRAN (R 3.5.1)                            
#>  CRAN (R 3.5.1)                            
#>  CRAN (R 3.6.0)                            
#>  local                                     
#>  CRAN (R 3.5.2)                            
#>  CRAN (R 3.6.0)                            
#>  CRAN (R 3.5.3)                            
#>  CRAN (R 3.5.1)                            
#>  CRAN (R 3.5.1)                            
#>  CRAN (R 3.6.0)                            
#>  CRAN (R 3.5.1)                            
#>  CRAN (R 3.5.1)                            
#>  CRAN (R 3.6.0)                            
#>  CRAN (R 3.5.2)                            
#>  CRAN (R 3.5.3)                            
#>  CRAN (R 3.6.0)                            
#>  CRAN (R 3.5.2)                            
#>  CRAN (R 3.5.3)                            
#>  CRAN (R 3.5.3)                            
#>  CRAN (R 3.5.1)                            
#>  CRAN (R 3.5.3)                            
#>  Github (rstudio/rmarkdown@5409172)        
#>  CRAN (R 3.5.1)                            
#>  CRAN (R 3.5.1)                            
#>  CRAN (R 3.5.1)                            
#>  CRAN (R 3.6.0)                            
#>  CRAN (R 3.5.3)                            
#>  CRAN (R 3.5.2)                            
#>  CRAN (R 3.5.3)                            
#>  CRAN (R 3.6.0)                            
#>  CRAN (R 3.5.3)                            
#>  CRAN (R 3.5.2)                            
#>  CRAN (R 3.5.1)                            
#>  CRAN (R 3.5.3)                            
#>  CRAN (R 3.5.1)                            
#>  CRAN (R 3.5.1)                            
#>  CRAN (R 3.6.0)                            
#>  CRAN (R 3.5.1)                            
#>  CRAN (R 3.5.1)                            
#> 
#> [1] /home/zkamvar/Documents/RLibrary
#> [2] /usr/local/lib/R/site-library
#> [3] /usr/lib/R/site-library
#> [4] /usr/lib/R/library

@zkamvar
Copy link
Member

zkamvar commented May 29, 2019

I was able to reproduce the correct version by updating the internal function loglike_to_density to @annecori's version

@zkamvar
Copy link
Member

zkamvar commented May 29, 2019

A few more facts:

This formulation originated with #4. If we switch to #11 (comment), then that issue pops up again.

Of course, one of the two tests implemented specifically for #4 is currently skipped because of #9, so this is an opportunity to rework it and get things right.

@zkamvar zkamvar added the bug label May 29, 2019
@thibautjombart
Copy link
Contributor

I think the issue raised by Anne is more problematic than the original issue #4 so more urgent to use Anne's fix as a priority, and leave a hot issue for the numerical approximation issue. Any tips on this most welcome. ;)

@zkamvar
Copy link
Member

zkamvar commented May 29, 2019

Skipped tests need to be addressed when addressing this.

@zkamvar
Copy link
Member

zkamvar commented May 30, 2019

this was fixed in #12. Thanks @annecori for alerting us to it!

@zkamvar zkamvar closed this as completed May 30, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants