From bd6d033b54a0e3935f47426bec57e7cc1a3448b0 Mon Sep 17 00:00:00 2001 From: Zachary Susswein <46581799+zsusswein@users.noreply.github.com> Date: Thu, 12 Sep 2024 11:44:41 -0400 Subject: [PATCH] Add simulated data from Gostic, 2020 for benchmarking (#37) * Add simulated data from Gostic, 2020 for benchmarking This commit re-uses the data processing and documentation from cdcgov/cfa-epinow2-pipeline#17. That repo is open source and public domain by nature of being USG property. I think it's convenient to re-use, but @athowes and @kgostic there's a little text in there that you both suggested, so I'd appreciate if you could give your permission for the re-use here! I've made sure to credit you both in the commit as it's partially your writing. It's probably best practice to make the data prep and processing here fully reproducible in `data-raw/` but it's quite a pain to do, with a mixture of shell scripting and R scripting needed. I skipped it out of convenience, but if you have a really strong feeling that it's required @seabbs, let me know and we can revisit. 
Closes #9 Co-authored-by: Adam Howes Co-authored-by: Katie Gostic * Bump NEWS * Coerce new `obs_incidence` col to integer from double * Generate GI PMF in `data-raw/` * Point to specific CFAEpiNow2Pipeline commit * Remove copied & pasted unrelated text * Re-render roxygen docs * Add primarycensoreddist to remotes * Typo * Rename synthetic dataset and GI dist * Pin primarycensoreddist to R-universe not github --------- Co-authored-by: Adam Howes Co-authored-by: Katie Gostic --- .Rbuildignore | 1 + .github/workflows/R-CMD-check.yaml | 1 + .github/workflows/pkgdown.yaml | 1 + .github/workflows/test-coverage.yaml | 1 + .pre-commit-config.yaml | 1 - DESCRIPTION | 11 +++- NEWS.md | 1 + R/data.R | 76 +++++++++++++++++++++++++++ data-raw/sir_gt_pmf.R | 21 ++++++++ data/sir_gt_pmf.rda | Bin 0 -> 387 bytes data/stochastic_sir_rt.rda | Bin 0 -> 10678 bytes man/sir_gt_pmf.Rd | 30 +++++++++++ man/stochastic_sir_rt.Rd | 64 ++++++++++++++++++++++ 13 files changed, 205 insertions(+), 3 deletions(-) create mode 100644 R/data.R create mode 100644 data-raw/sir_gt_pmf.R create mode 100644 data/sir_gt_pmf.rda create mode 100644 data/stochastic_sir_rt.rda create mode 100644 man/sir_gt_pmf.Rd create mode 100644 man/stochastic_sir_rt.Rd diff --git a/.Rbuildignore b/.Rbuildignore index fbf4d1a..377cd5b 100644 --- a/.Rbuildignore +++ b/.Rbuildignore @@ -5,5 +5,6 @@ ^\.pre-commit-config\.yaml$ ^_pkgdown\.yml$ ^codecov\.yml$ +^data-raw$ ^docs$ ^pkgdown$ diff --git a/.github/workflows/R-CMD-check.yaml b/.github/workflows/R-CMD-check.yaml index e8f99b6..f7ad93e 100644 --- a/.github/workflows/R-CMD-check.yaml +++ b/.github/workflows/R-CMD-check.yaml @@ -37,6 +37,7 @@ jobs: r-version: ${{ matrix.config.r }} http-user-agent: ${{ matrix.config.http-user-agent }} use-public-rspm: true + extra-repositories: 'https://epinowcast.r-universe.dev' - uses: r-lib/actions/setup-r-dependencies@v2 with: diff --git a/.github/workflows/pkgdown.yaml b/.github/workflows/pkgdown.yaml index a7276e8..0d8fc9b 100644 
--- a/.github/workflows/pkgdown.yaml +++ b/.github/workflows/pkgdown.yaml @@ -29,6 +29,7 @@ jobs: - uses: r-lib/actions/setup-r@v2 with: use-public-rspm: true + extra-repositories: 'https://epinowcast.r-universe.dev' - uses: r-lib/actions/setup-r-dependencies@v2 with: diff --git a/.github/workflows/test-coverage.yaml b/.github/workflows/test-coverage.yaml index e8a8471..21dbfa2 100644 --- a/.github/workflows/test-coverage.yaml +++ b/.github/workflows/test-coverage.yaml @@ -20,6 +20,7 @@ jobs: - uses: r-lib/actions/setup-r@v2 with: use-public-rspm: true + extra-repositories: 'https://epinowcast.r-universe.dev' - uses: r-lib/actions/setup-r-dependencies@v2 with: diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml index 76ffbab..6f2db2c 100644 --- a/.pre-commit-config.yaml +++ b/.pre-commit-config.yaml @@ -8,7 +8,6 @@ repos: - id: style-files args: [--style_pkg=styler, --style_fun=tidyverse_style, --cache-root=styler-perm] - - id: roxygenize - id: use-tidy-description - id: lintr - id: readme-rmd-rendered diff --git a/DESCRIPTION b/DESCRIPTION index d340d3e..99238e1 100644 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -25,13 +25,20 @@ BugReports: https://github.com/cdcgov/cfa-gam-rt/issues Suggests: testthat (>= 3.0.0), pkgdown, - withr + withr, + usethis, + primarycensoreddist Config/testthat/edition: 3 Encoding: UTF-8 Roxygen: list(markdown = TRUE) -RoxygenNote: 7.3.1 +RoxygenNote: 7.3.2 Imports: cli, glue, mgcv, rlang +Depends: + R (>= 2.10) +LazyData: true +Additional_repositories: + https://epinowcast.r-universe.dev diff --git a/NEWS.md b/NEWS.md index 8253468..df13e70 100644 --- a/NEWS.md +++ b/NEWS.md @@ -1,5 +1,6 @@ # RtGam (development version) +* Add data from Gostic, 2020 for testing and benchmarking (#37) * Fix math rendering in pkgdown (#36) * CI status in README badges (#27) * Update pre-commit hooks from template repo to newest version (#25) diff --git a/R/data.R b/R/data.R new file mode 100644 index 0000000..50901eb --- /dev/null +++ b/R/data.R @@ 
-0,0 +1,76 @@ +#' Synthetic dataset of stochastic SIR system with known Rt +#' +#' A dataset from Gostic, Katelyn M., et al. "Practical considerations for +#' measuring the effective reproductive number, Rt." PLoS Computational Biology +#' 16.12 (2020): e1008409. The data are simulated from a stochastic SEIR +#' compartmental model. +#' +#' This synthetic dataset has a number of desirable properties: +#' +#' 1. The force of infection changes depending on the Rt, allowing for sudden +#' changes in the Rt. This allows for modeling of sudden changes in infection +#' dynamics, which might otherwise be difficult for an Rt estimation +#' framework to capture. +#' +#' 2. The realized Rt is known at each timepoint +#' +#' 3. The dataset incorporates a simple generation interval and a reporting +#' delay. +#' +#' Gostic et al. benchmark the performance of a number of Rt estimation +#' frameworks, providing practical guidance on how to use this dataset to +#' evaluate Rt estimates. +#' +#' In practice, we've found that the amount of observation noise in the +#' incidence and/or observed cases is often undesirably low for testing. Many +#' empirical datasets are much noisier. As a result, models built with these +#' settings in mind can perform poorly on this dataset or fail to converge. We +#' manually add observation noise with `rnbinom(299, mu = +#' stochastic_sir_rt[["obs_cases"]], size = 10)` and the random seed 123456 and +#' store it in the `obs_incidence` column. +#' +#' @name stochastic_sir_rt +#' @format `stochastic_sir_rt` A data frame with 301 rows and 12 columns: +#' \describe{ +#' \item{time}{Timestep of the discrete-time stochastic SEIR simulation} +#' \item{date}{Added to the original Gostic, 2020 dataset. A date +#' corresponding to the assigned `time`. 
Arbitrarily starts on January 1st, +#' 2023.} +#' \item{S, E, I, R}{The realized state of the stochastic SEIR system} +#' \item{dS, dEI, DIR}{The stochastic transition between compartments} +#' \item{incidence}{The true incidence in the `I` compartment at time t} +#' \item{obs_cases}{The observed number of cases at time t from +#' forward-convolved incidence.} +#' \item{obs_incidence}{Added to the original Gostic, 2020 dataset. The +#' `incidence` column with added negative-binomial observation noise. +#' Created with `set.seed(123456)` and the call +#' `rnbinom(299, mu = [["incidence"]], size = 10)`. Useful for +#' testing.} +#' \item{true_r0}{The initial R0 of the system (i.e., 2)} +#' \item{true_rt}{The known, true Rt of the epidemic system} +#' } +#' @source +#' <https://github.com/cobeylab/Rt_estimation/tree/d9d8977ba8492ac1a3b8287d2f470b313bfb9f1d> # nolint +#' <https://github.com/CDCgov/cfa-epinow2-pipeline/pull/17> # nolint +"stochastic_sir_rt" + +#' Generation interval corresponding to the sample `stochastic_sir_rt` dataset +#' +#' Gostic et al., 2020 simulates data from a stochastic SEIR model. Residence +#' time in both the E and the I compartments is exponentially distributed, with +#' a mean of 4 days (or a rate/inverse-scale of 1/4). These residence times +#' imply a gamma-distributed generation time distribution with a shape of 2 and +#' a rate of 1/4. The distribution can be regenerated in +#' `data-raw/sir_gt_pmf.R`. +#' +#' From this parametric specification, we produce a double-censored, +#' left-truncated probability mass function of the generation interval +#' distribution. We produce the PMF using +#' [primarycensoreddist::dpcens()] with version 0.4.0. See +#' https://doi.org/10.1101/2024.01.12.24301247 for more information on +#' double-censoring biases and corrections. 
+#' +#' @name sir_gt_pmf +#' @format `sir_gt_pmf` A numeric vector of length 26 that sums to one within +#' numerical tolerance +"sir_gt_pmf" diff --git a/data-raw/sir_gt_pmf.R b/data-raw/sir_gt_pmf.R new file mode 100644 index 0000000..37d0ca0 --- /dev/null +++ b/data-raw/sir_gt_pmf.R @@ -0,0 +1,21 @@ +# E and I compartments both with exponentially distributed residence times +# with a mean of 4 days. +shape <- 2 +rate <- 1 / 4 + +sir_gt_pmf <- primarycensoreddist::dpcens(0:26, + pgamma, + shape = shape, + rate = rate, + D = 27 +) # v0.4.0 + +# Drop first element because GI can't have same-day transmission +sir_gt_pmf <- sir_gt_pmf[2:27] + +# Renormalize to a proper PMF +while (abs(sum(sir_gt_pmf) - 1) > 1e-10) { + sir_gt_pmf <- sir_gt_pmf / sum(sir_gt_pmf) +} + +usethis::use_data(sir_gt_pmf, overwrite = TRUE) diff --git a/data/sir_gt_pmf.rda b/data/sir_gt_pmf.rda new file mode 100644 index 0000000000000000000000000000000000000000..1fa8bf265ed926eb15f02eb3b11e542653eb71cd GIT binary patch literal 387 zcmV-}0et>KT4*^jL0KkKS$~$3djJ4D|Ns6u4){IY85O22k#Ci={mpKikIwyFKf*9I zO^wiq>De#<+hA2s8i+K|27u5S4FJiJlRz2`27^OGL8A%k4G+`_>OCXW%@O*h^-uLs zUPBS8s48pz-YK?lirB>NIKi1^yZC?eTmuwrel zRQ*rr5S`@w16NvwgfsAah6Uj_g!p9f-nWBuG@bf5?B%$J;|L&DbNFl!@6yipeZ$F5 zx=Y))9aqpnOh__m%?$WZ6@?x_jEs*%^P@>jYZJHA9h66M>Q*BTf{7}%n=VXI%F4q+ zzG>u`LkCG=Xrz?b;cpNfEf*JViTJdnv;dV}$JC2p|0T^3HfT7nEp_t*|I@ hdzIJVV22q)^cZLxfHMN-m?jhaUC9*TLP7poP3%6At#<$b literal 0 HcmV?d00001 diff --git a/data/stochastic_sir_rt.rda b/data/stochastic_sir_rt.rda new file mode 100644 index 0000000000000000000000000000000000000000..526660facb829b7b8335b251ef5d07b1517de289 GIT binary patch literal 10678 zcmb_?MNB0O%q=#!%i!)h$i>~A3tSuq8{A=VcXxMpcXxM(!C{b#yTbtgw|JXBdCA+n zrpd`^+D&^lZ7QZ?!OJG9O#{+S5i~@CAiDMY`hVFT=@#*ISDp~`gatuyGr(rP70nw0 zf|aBn!kNrpO#-dQM?J8ccxv1dElWRvHW1Lchv#P6^q={&@s}AQ};=+)_m#kSrZGpyIGBp%fuW3O0NqX}l~+ zK`~zuh*40gNOw=QQJSl$c$S3)R1^m++bR;O{VOZZwaPf7$t{ZjJXM_8t8Ex5EW(!1 
zKp0yV089n|3*i=nmZcf&1OOV@Hw7%U57rC?PLhaH12F$;RMb=WYDY#`_bYP0Cy0(V)7$^1Ee7O zN@sCrR_IF;pfDgxQ=Uv)k+cA$%C(>tOcBgXFH2e;EvqQUZkdo?y?hX(KS}1=M_^;SAefd$qDwqCL zu;N7JRklhr=S z0#%~y|7X8@P1nEf`Es~r^Jw3?@)Ivw{Mz{%SUPj@-1b;k(B`mMHEmw<=omx&sWb=d z$}ez#ynmy^cCaK@9g0x=&M7FV;91#$lhU$w(!NU6FUz?HllGgxBr1X_06G@YafPv< zy_zb84JKS?k@|a$94E7GgaWk;0;VBmO|BjnqU8KBZ)#^n_jwj1$}A>!JL-zKvaFA z_BkdvS_Bv|3i1TBLJY{0P>S%(1ylBNzslwO*fpw-WHoHvirZSGiJ#CIQFT}d2p;?^ zX(O5t5ah^8B?W~dbf;Je8Uz%MO!SduEPp$kw$(VdS0j545>_msHG^Z}H%aPoY6%zg zQV`$n(~O3-2!0bDtkQ8+GF5NGelD)*qp7$SIA^w`q*u4%R|S-(1Z16$l?D3ax58Hr|+UmqUB7G;+itSvS#oqG7`e77_d5Q z2$fQLE2@@t7UVPu@QMI_hfE17a_r`%`QvRY&5K6mB_`}lsa&yVJN5)dA#C^XqdT=s0L6vpo9 z^n^2P=>2gVn`*%m%4bI5ze%!xrEQoacbH#fX$^=TS+kYWrH&r{j2ypJi+?6D{1xit zh-N#Q{%jIOzq7lG#FDaNn~T6U+YK88eP}m|3l3J)-ao69X9AInIo#9 zZ48x9o>j}1zH-3msabapS$Q359|Ib%WK~xn#`?x^;$<5V`E4OTmxo*elsS8=EW7dj zR&YZG@0aAjd=yr7`lt%r5Y^pVAWnsif zFKR4teUX6@?0%Wc7gR`}5ly;Hb|wg2f7CFD1V2&u-U+-wybi>1F7WmK)lUwuDWoEZ zSSZVHk6kyNJ!8h#!T1QqGm(o1PknGvLnTz~g;Aa&%J8V_jptRY@`puE5y;5LKDcM0 z^yW{!fwlT3@qa7pQMK?%3?Qb*P(=0gw<}cK5A*Eg6o4c+Up-wGi7>ox^_(1?4lEff zZ}1Z#x2giVe6X;J9J-VYPHXaoxy5M4)KeQG7rF~`q{`C;J}Yz~bek1X^({RxMT>ue zV*A|LNK?ck?_q6zhcHQ!?dEH3WMbVztXhcmtj)q5b?(O-#VWr=%{m$g$K!#Tz*irN zr*SO96%&ois_U(6vE8~=ki4q&-1_w&Hf&{QWkDQEly-QrFu7}$2eFsve8P&MNns|S zI2%LvB>gr*;4nA5K|@`UR`P9iBCqk&#<(O$%Sb@g(7l|1(NK-!18)|7p?9$H2gRpa z^qkv9%jkQ$uX@AKt)aIx2aL>EcUqhVs@c_Vf8`7w1X`!mE*&q}on%Mi3)Rs?Z{PUL z$h+7ihzHBBfPv=bTk)8K{6ux@WDpKXKr|!nu-0duBq+HwD!iB!;R`+s&sHs{?5wQ*(Bx*Sq(07FS2ugPjx3t^%O^K6==CViwxehfPjs`WE00fFVRr(< zL3Yx3J}L>H^pHHC`BxNY5*r?FQNBr%1rj5J+pYej*hRae1dArjxGX&~Q_3#&`_>OH19TyDR(4 z`WF2S>?Uv6C6Jm-Cp5G}H*(Ye5r}$|SvQRzh9m9@Z-Z0h3NHXDGGQW5s8DkY`^N4^ z*w;{|cBo>7#xwVxx;@LAm~Z$al%GClt~!gx8s~ntqlA|ql+D`P!)e++B0I`SJ+Uqn zQZzqn0i^kY%iC2)VS$A=BRJ18p_68>|4k6$OA0^FCl&b>)nuvW5Yce6hBRW^Jc?{^ z8oN8lU$|&+7XO_Bf)1If2g84gymu>Rhm|y$P@ymhbYpbW%>MDS}q+P(#5uq z6c6uNv%2zGYBC`e2h%P`NpQQ;*0Mp!RtNqg&RELB-Yw~6v&b4MB&MNaZr%>XBoFUG 
zhL~z#9>g~k&n?LdsKY{Eab4KcT=fA2!Se)bLYSX>3#x~eq{IffJLopC2`ZN8=AJ@B zgzqC;Mc3=AtWudHPfSmUUW+vcQ-qZCq@20i<$H-Iw>sZh3(r(k>osTl&S-azi;lUhat}SGD`x38B29;(xv_ z&>CNG)(2{U)2)A3ZVJ#PvH%O&D{bxG7iiyOHutrIG{c{CnMho?HNjPZD6EEwL<7QgK ziLf8gz7rK90UN9jyl=3kxs<}Z)Wv*MD1mUnNy8=r#r(;<&ZvZfFj>w#%MTd>$sf!7 zdUj0}8_MmT|?W!X_O<5E8HRwZ>FWL^Mj zN65i${%@OgG9Oqsn+q(OA&HDO$lw)r4}=!;E_bHai+1rYM`hMEpTi6(v-wB#T_?3L zo1{NJTzcdkUWZTf*GoP@vf~%7F3<}``y7Yl&GK!*LyhL$vZE*EcJU|L;HGXRB>pEQ zqcDdL?46z`@Lg#`o+vc)i@T6%rW!Z$BgpPT`9yU!G0`q*Z9{qehW_nz@X!2m6XDU4 za+c2}T5sqhm;I+*r@~57j~C1p`aLMcSJeT_JL1LHVlw-r)a=K~#&uJAKZ;sjhg`PuFh3{1PnsUROV6MT98WiOH1V$95 zj4ZP7!k;_czN8tOzNYN1F1>JYW9H(a_~A|+^OX3&b}3hV<54LD|7WaA*wI~_gtEI` zD+~DugHU@;b5!PkV`G+=Ty594$G~u8CFAHU-Kl2F#MsZ_zE;d>j^rc0U|HK``?1^< zYr{0XK$%&o?ahvZRGjl!uB=o^hx=;i7(}0u`^dG0Fb;|oR)`WC>2u~dQ}K>=IGvY^ z?QQ+FtQR-h5N`{!M(sP#;t*S%{>9d2&(+})K~>J%GxnasnN9rD%vmMl zS)b$67KtNYM-W$r>CIrspmP)82IH9);l(gi$01P{&>oEb$s&XXu%O3BqG!aB_yZ^V znPeXU=8HN>aVCQ2Wa{wp$pOe!C`AaGBO>5Ei8o7VYe!i=D(2{TA$F0*HIRzW@|3Wj z^)?}YbePKO^9qDOFxBcEHnix|r!a=G7|hcRgUD?>HJ&vlwwU~rY|ps`=BNzbMPeLj zV1|~}jri!;XSNbrNp5#W@oNN>A3yZ+<-g87)e_aUD^haC5iT88(Nv%oRW~+TtdAEgX;)T%QW|rvlmj1r3I)@VB~Z;*h-e z(9KyTF0u*PFdbgf2uly+Nil>9usYl??IKk9)RZ_~Pcg|R>=d)mlfn`2F|!RYpomW~ zb$Zz6z?qicuA@G)v4t7yDh><{p{ZdQT*Nz3r|wR}Z@v;vJnS>75occVNJ|AmW=fr= zavTd}@wMg22Lr16Otk}Y=ZH+)%r(sUNqU!&BZ;!WWqtBQuJ-el(5)qpPmX;6xXmy9 z=cbm>8C~Q=QS)zr3Wk{>9-Atk+~Li2=!T{FNfY`GFBpcopnuCLjyL57zd#LQM{3&Q z6*-13@s0`iCsoj5n@G?UrgbEd!y<$533O6+;DP-V-?wn>w}*$j6M}@qU1j7ve4@V{$k0|d%lj^FGzjiIEizB+nX82vdg3LB4ht#> zQW2&qFwKhRF*sLZw%4=McvfVRK9=(UPy{P1mbzj)T@ue-vzUMmh`~|F_=uWKChst?8E+r*2Pab&iqPUGKp*) zD9f91`b`4To}NJZK95jA7#K*b{Z{gDtK8n6f*V&9ox#i2X+1WcRv*9t^L+j-k;}lo z(%GAak00VHDoB3p1urs?<RRAyI1$-4Ggb z0YkSR2NpLQ-~y#KEE!v#qD`$vbiDqhw-j2cjHv#ft5nK^?qcI8gTFN(HSm*hti`sH9~AXcIE z?M-m3!VeujG{^2sg7Pwi++p22c} zW#ZATCl;%Bd4JcxcCVk}jfjQtWZotxTn&z0e?;pGVjD~84zIj^L=BpbT?;;cGOw|y zDEjDv%O2XWZS=TwEq3%!srvpZt;V^P1rc=Eg6b{!Dwm3gX{ 
z6`Pt~JZX7^%9>)LUwAsx_OGk*A!1RXV)qXm{(LE5~N9R2*fio-MPYSTiFp0V%~b z*Kb~1?UyZ0ZVxLwil*BYcDKbE&y3R{KRKn0Or5$M~(B5zM0+(3Ti z_h26|O>NJ;BL@c~r*vO&P*zw|<>;1*0FUnt7p)u}wkjogSg6t8Ogba;-yJ~V!Oj(Z zuT5t*yawZusP9Z#{|kCd|F7Zyo%Da0cH!LHV>o&sk@T{fSg1!UDyhs>neuoYxcs-) zOZ&3gX`~*mWEO`0GOJ{eJnF`sY%o5er;wJppGVCIZ&x%6{J zLG^ysbFS3rWP@abrWlyZ4FHc7{4ptVK5%SPyh_*kjPPr`gMSDW{iv;9UFbQ9?v+CE zr;w2mWRn;zm%)%?!$#+y)wPTn)+r5E-lwPWpp} z2$+ht6KOZ*ym!k^kUsNm{55V+BK>yQkMoUdfduwdfJSX>v>P*5dHs`BLjSq_+;w$T7GF z;_@zdT=+i3(=seTE%!@tpccueix4%b8Kza?Tpz?4hnqL1&DqS^Cn`o_CqJ#7`}WUTPJpMVR$aBnorF zgmT;>hu3M^_1rZIS`?ZPp}Li$_H!G4^+q!xuVsmiQW(XE6-?Hh^NqUpU_yhxvGbPyIG0M%m->q}WfH9Kd!H`k#X~b9g zL2&sFx7`}NYr;pl2X zh~XKEetoR)BIG9O9G6o(@FEnvuVeub(VG`Ik!1}IUt#m*X>dJZS4{NRQ4=une zevrsb(~?Xj;K%Q{zjY7TAxzVE#Ky--M-9egFpXRtkFGfn#@u_ApQ4VtLcsnCLv>Y(05Js~7xqi{?_N(I7oXBUsoH##;`B%LzV$Qf zR__&5jAlJq+r@qRyke#>G&xIj`q$o6US93`WZweu7dDi;fa`WO#j4*q{es~lZ~5p# z`Ni@UeBroHb?rqrP~IrfU8HwEgG1m5NUm3rsc7*a>tZ!1V&|3;1KM=VSztzI4{1OhA7>t6uRLY|ars=XAOf+t)n%bX z8YrC1zQ+;Im2w1KHrXXVfD(C&O;d`>=2s5dw$GYy%crpShhm!?)X%j~zjy{GVOS~M zVtSVpRJ89Z^j0qZQ=t)_m$;>39{AaOYsGL+mmb6;882RctnW85b@DQUG5h1{+vCHQ zco*UdbI|Z36NOh;Dc1PEl6LveJ~{M;t>_yNDuDS{Cyobe6hkDdiY2RmKef!a^3ajv z->c0L1Ix`XZ*{TSDnWhGJu^Q8WX&C(+P9cT`6EPIPmKS0!`uDT+L+V)1LSv$Pk4KX z&mHrt50bxN`%f#HQW7jbFILqh>y55jjCyaiEM}+oURs6d;BK1ZKat?1B_cSmJ(8Ew zJhYZy0}e)085+2>1sg5(H_q3T6D058pyc2pHE8J#wd9oF{7?k)QrC)cCj^f>3XG<3 zp>F|1%C4isCOo(w(KPXnSY7RN)466xaH@PoDp6dH-GUi+`toO~V#*Sb9!Gy+wk2kP z5m&pTW%QkUozs0yG4-Eaez|c4R#V=ZeUml6YP6&e{V==rbeCRz4BpRYdgRoosW=o} zy?!;%v3cmy1^FfvRyN2Pw67Y;r;zG@-NOZr^WI*cTIO- zDs)H}P(WBt_Eu5ZcoXJMZAgh|v3L1i{2uM8R9KJ7!O~;-%e@Oejw}#bHH@lsr!3(4 z_>q+M7`qgPi3^#9r$(3xGlf%UQchT71FSc{En-LubEuow)0=kP^M}WFt~{iXM~Hlk z5~{3t_GeR*C8c|F@8X{|xyK69#jsaeTC!(q$B8Robzgd<2;>bCbTQPBJe>E|?8zMf zS7A}3bqWvHzkB(wzV((xtfInEQzf4%a~G~T9iqf0M{qM*+&PfFVC_Z#DHkqakz$bA zmXAcrtN=`WvQ(g-OaXJdygR{REl#sEpLa2FPRS`_B^3h?olrde1%6A;M6{(M9qWA3 
zv~k`Ob^8cwJV5K;t+^ak&Ue-r15viy#$<_*i`NJfv-(T2!YBRFYwO-$LCvp-J&ibK*pO1dbrGn&^BCp_Q~MKZ!{g6m?pb zbepEVd7!rujz6_Gep$v~hf??l@x`fAjR40Fiim5~=8I~P!@J!YU$BOq@0})|*uc~T zQj>)-GGZ=b5)*T;YOuM1U$M)e$Z2l*oq54!#?R8ZG58uAtmw@SdgBq zQ;-G+eT0#nN%f7wY$0gw*lHgAo4;=6bM3+B?}dI~9Nu36 zBne7oqunE~YF`^fl0R0D9{ozYlhOnIDM1OwF`kc@2YsMB_097Snh1yOQP=5}n9;sZ z@7C^!9JlKYE2_^>b)QF*-GUlxzodOrKLI%gIZ>sqEozwA~G-FZQ*B(yHq&+;5frIp9zU^x)tKmjqy->de zRsTU6G&vECr;_Ob0z-leMib%e*zels*QptJSrROG%H|i)R z9519vutkKLK;3;v1><+~awThBUa-E5xtANb9_MPKqhmdmRCyp1uCivx}4G=;BQf`y@j^JG2_=2LYbl z{lWq>SPQAqy$4S|X3zgR!hQ=;_XzRNR*m>tN_FFT7@~{6gPYhvaG-iGiwzE|EHxZi0WVfN%(34e7VS1O86C2`f?*+YQ%A68+oi1H3 zNzrMKg#^iwqW#KXE_J- zz7ka7`&GLWiVB0YA)!tOO|b>c~!Nh1et$a)Llwz!0vB}S`+sGEhWSFIY~Z!Pm@ z9D`S0U`y~VY~w!L*TXVCI^;axq#SFq_c z{f!m+g%L6ovNPqA%k8YE*&n}k7z$Ggbbo9*q7f?eYWo;JzmqaD5~Kb99vEQ1l?;Ub zKxDU~_)sP#EXMDMF@L)^$h6R>B!l50LLxfWb3W3>QlsmPG4l8O=Xl7;>MVv|~nG^BC z*6m}jD(PM~r>)cH)@f#jORdS!%?C|c`+w3bEy?|&2)AYR@x>KT{*pySX(^1wbm^g% zNS72tk?0&^TM2MvcDs0=X54ErCNSe0R@ii z1W%yQIW*%4mC6E{**(7<9x>J%BGrljIKW6@A@|VYSy8ZxmYcn;IE7pa06vVTqrTV zD09r#><-oil0~Ah^SW)tjOM_${_J+TKY&lf4o|P%wXOcuF!#{wrhVR#YUXeK_Xs1> zKCL+f4BL*??hfJpcY~~`0A;DI_S4H#o0Q(8|s;>zJWMB zi&hF(Y%%M5_+LMAy2JK>wpml#F=_%Zs>hQdjo%8}!{MsiZ;Jxd;ICj>jBX9-Op<=W z^wahZGHlaiK+|D|v#OM-pD%%m3OG=Ps8{H{6w+`*f@uatMk1pa`ZrBcI81jng7TeaAckAO1v`bs!g?Ow zwZ5sI9dXnQYR^3$Nrz-)Fd~~!%J@+Rr7Pr{_xO2RW;O~UuX+V`*h|elKAqjXZNeWZ45Uobtb7Feicg{ zMu^efaAxFrl1CLj>CA4wGDeu_?TP3D25UVAU`d2sL~D0jlyhU6tS5@K8mvvJQSSpj zdZMgyYe;?%FG=Pdj2V<;c_vc9Go%4nAi?gf+9yA4@P1yJYLy!caw5aR7+%6HS(Q$t z+jqN@u+KTD5QJEAD{&F?o?yJ{(iBs2Ae5)J9z@4`aiV)JKdrr@-xFy$b{Wr|(r$rk zR`{jceuK4-4Hx=9eKr-QOs+YoJvc~(EK+9xa8kU^P?2*%w3II!VV604v)e7Tk7(I) gtRct@s(!@D&_zbn;Y7f%s1Ay3IvKg#`fvXK17f_o$^ZZW literal 0 HcmV?d00001 diff --git a/man/sir_gt_pmf.Rd b/man/sir_gt_pmf.Rd new file mode 100644 index 0000000..8df3041 --- /dev/null +++ b/man/sir_gt_pmf.Rd @@ -0,0 +1,30 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/data.R 
+\docType{data} +\name{sir_gt_pmf} +\alias{sir_gt_pmf} +\title{Generation interval corresponding to the sample \code{stochastic_sir_rt} dataset} +\format{ +\code{sir_gt_pmf} A numeric vector of length 26 that sums to one within +numerical tolerance +} +\usage{ +sir_gt_pmf +} +\description{ +Gostic et al., 2020 simulates data from a stochastic SEIR model. Residence +time in both the E and the I compartments is exponentially distributed, with +a mean of 4 days (or a rate/inverse-scale of 1/4). These residence times +imply a gamma-distributed generation time distribution with a shape of 2 and +a rate of 1/4. The distribution can be regenerated in +\code{data-raw/sir_gt_pmf.R}. +} +\details{ +From this parametric specification, we produce a double-censored, +left-truncated probability mass function of the generation interval +distribution. We produce the PMF using +\code{\link[primarycensoreddist:dprimarycensoreddist]{primarycensoreddist::dpcens()}} with version 0.4.0. See +https://doi.org/10.1101/2024.01.12.24301247 for more information on +double-censoring biases and corrections. +} +\keyword{datasets} diff --git a/man/stochastic_sir_rt.Rd b/man/stochastic_sir_rt.Rd new file mode 100644 index 0000000..0f49cfb --- /dev/null +++ b/man/stochastic_sir_rt.Rd @@ -0,0 +1,64 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/data.R +\docType{data} +\name{stochastic_sir_rt} +\alias{stochastic_sir_rt} +\title{Synthetic dataset of stochastic SIR system with known Rt} +\format{ +\code{stochastic_sir_rt} A data frame with 301 rows and 12 columns: +\describe{ +\item{time}{Timestep of the discrete-time stochastic SEIR simulation} +\item{date}{Added to the original Gostic, 2020 dataset. A date +corresponding to the assigned \code{time}. 
Arbitrarily starts on January 1st, +2023.} +\item{S, E, I, R}{The realized state of the stochastic SEIR system} +\item{dS, dEI, DIR}{The stochastic transition between compartments} +\item{incidence}{The true incidence in the \code{I} compartment at time t} +\item{obs_cases}{The observed number of cases at time t from +forward-convolved incidence.} +\item{obs_incidence}{Added to the original Gostic, 2020 dataset. The +\code{incidence} column with added negative-binomial observation noise. +Created with \code{set.seed(123456)} and the call +\verb{rnbinom(299, mu = [["incidence"]], size = 10)}. Useful for +testing.} +\item{true_r0}{The initial R0 of the system (i.e., 2)} +\item{true_rt}{The known, true Rt of the epidemic system} +} +} +\source{ +\url{https://github.com/cobeylab/Rt_estimation/tree/d9d8977ba8492ac1a3b8287d2f470b313bfb9f1d} # nolint +\url{https://github.com/CDCgov/cfa-epinow2-pipeline/pull/17} # nolint +} +\usage{ +stochastic_sir_rt +} +\description{ +A dataset from Gostic, Katelyn M., et al. "Practical considerations for +measuring the effective reproductive number, Rt." PLoS Computational Biology +16.12 (2020): e1008409. The data are simulated from a stochastic SEIR +compartmental model. +} +\details{ +This synthetic dataset has a number of desirable properties: +\enumerate{ +\item The force of infection changes depending on the Rt, allowing for sudden +changes in the Rt. This allows for modeling of sudden changes in infection +dynamics, which might otherwise be difficult for an Rt estimation +framework to capture. +\item The realized Rt is known at each timepoint +\item The dataset incorporates a simple generation interval and a reporting +delay. +} + +Gostic et al. benchmark the performance of a number of Rt estimation +frameworks, providing practical guidance on how to use this dataset to +evaluate Rt estimates. + +In practice, we've found that the amount of observation noise in the +incidence and/or observed cases is often undesirably low for testing. 
Many +empirical datasets are much noisier. As a result, models built with these +settings in mind can perform poorly on this dataset or fail to converge. We +manually add observation noise with \code{rnbinom(299, mu = stochastic_sir_rt[["obs_cases"]], size = 10)} and the random seed 123456 and +store it in the \code{obs_incidence} column. +} +\keyword{datasets}