R functions for estimating excess mortality, using ARIMA models.
- forecast: Used for the ARIMA models.
- ggplot2: Used to plot the results.
- estimate_monthly_excess(): Function for estimating monthly excess.
- estimate_weekly_excess(): Function for estimating weekly excess.
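The general idea of ARIMA-based excess estimation can be sketched in base R as follows. This is only an illustration under stated assumptions, not the code behind the functions above: the data are synthetic, the model order is fixed by hand, and stats::arima() stands in for the forecast package.

```r
# Illustrative sketch only: synthetic data, hand-picked model order.
set.seed(1)

# Hypothetical monthly death counts: 60 baseline months, then 12 months of interest.
n_base <- 60
h <- 12
baseline <- 1000 + 50 * sin(2 * pi * (1:n_base) / 12) + rnorm(n_base, sd = 20)
observed <- 1150 + 50 * sin(2 * pi * ((n_base + 1):(n_base + h)) / 12) + rnorm(h, sd = 20)

# Fit a seasonal ARIMA model to the baseline period only.
y <- ts(baseline, start = c(2015, 1), frequency = 12)
fit <- arima(y, order = c(1, 0, 0),
             seasonal = list(order = c(0, 1, 0), period = 12))

# Forecast the expected counts over the period of interest.
fc <- predict(fit, n.ahead = h)
expected <- as.numeric(fc$pred)

# Excess is observed minus expected, by month and in total.
excess_by_month <- observed - expected
total_excess <- sum(excess_by_month)
```

The model is fitted only to the baseline months, so the forecast represents what would have been expected had the baseline pattern continued.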
- yy: The name of the variable of interest, in data.
- forecast_window: The length of the forecast window. This is the total length of the time period of interest. In other words, if performing analysis for pandemic-related excess mortality, the forecast_window is the duration of the pandemic period of interest.
- forecast_start: The start date of the forecast.
- forecast_periods: Optional. A vector that has unique values for each time period of interest. The length of the vector should be equal to forecast_window.
- data_start: The start date of the data, in vector form. Consult the default value and the documentation for ts() for further information.
- data: The data object should be a data.frame, with a Date variable named month (if using monthly data) or week (if using weekly data). In the latter scenario, the expectation is that week is the date of the last day of the week (a Saturday, using US convention). The other variables in data should each represent the total number of deaths within some group of interest.
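As a rough illustration of the expected layout, the sketch below builds a small weekly data.frame and shows the c(year, period) start convention used by ts(). The object names dd_example and monthly_counts and all counts are made up for the example, and it is an assumption that data_start follows the same convention as the start argument of ts().

```r
# Hypothetical weekly data in the layout described above.
# `week` holds the Saturday that ends each week; other columns are death counts.
first_saturday <- as.Date("2015-01-03")  # 3 January 2015 was a Saturday
n_weeks <- 8
dd_example <- data.frame(
  week = seq(first_saturday, by = "week", length.out = n_weeks)
)
# Column names may contain spaces, so assign with [[ ]] to keep the name intact.
dd_example[["dpw.Inland Empire"]] <- c(510, 498, 523, 531, 507, 495, 502, 519)

# Check that every date is a Saturday (POSIXlt wday: 0 = Sunday, 6 = Saturday).
all_saturdays <- all(as.POSIXlt(dd_example$week)$wday == 6)

# For ts(), a start is given as c(year, period), e.g. January 2015 for monthly data.
monthly_counts <- ts(c(2100, 1980, 2050, 1990), start = c(2015, 1), frequency = 12)
```

Checking the weekday up front is a cheap way to catch data keyed to the first day of the week rather than the last.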
- results_by_month or results_by_week: The date-specific results.
- results: The overall results, with summation over the entire time period of interest.
- simulations: Simulated sums. These can be used to obtain a prediction interval for the sum over the time period of interest. The bounds in results (above) are derived from these simulations. However, the entirety of the simulations may be useful in some cases.
- plot: A visualization of the results.
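One way simulated sums could be turned into an interval is sketched below. It assumes the simulations are a plain numeric vector of simulated expected totals; the actual structure of the returned simulations is not documented here, so the vector simulated_expected and the value observed_total are stand-ins generated for the example.

```r
set.seed(42)

# Stand-in for simulated sums of expected deaths (ASSUMPTION: a numeric vector).
simulated_expected <- rnorm(10000, mean = 25000, sd = 400)

# Hypothetical observed total over the same period.
observed_total <- 27500

# 95% prediction interval for the expected total.
expected_interval <- quantile(simulated_expected, probs = c(0.025, 0.975))

# Matching interval for total excess (observed minus each simulated expected total).
excess_interval <- quantile(observed_total - simulated_expected,
                            probs = c(0.025, 0.975))
```

Keeping the full vector of simulations makes it easy to compute other summaries, such as a 90% interval or the share of simulations in which the observed total exceeds the simulated one.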
First, load your data object:
dd<-readRDS('weekly data.rds')
We have named the object dd, since this is the default value of data in the estimation functions.
To estimate excess mortality for the variable dpw.Inland Empire:
rr<-estimate_weekly_excess('dpw.Inland Empire')
To view the plot:
rr$plot
A possible framework for analyzing multiple variables is as follows:
r1<-estimate_weekly_excess('dpw.Los Angeles County')
r2<-estimate_weekly_excess('dpw.San Francisco Bay Area')
r3<-estimate_weekly_excess('dpw.Inland Empire')
RR<-rbind(r1$results,r2$results,r3$results)
This stacks the results from each analysis into a single data.frame, named RR.
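With more than a handful of variables, the same pattern could be written as a loop. The sketch below uses a stand-in function, estimate_stub(), so that it runs on its own; the columns of its results are invented for illustration. In practice estimate_weekly_excess() would take its place.

```r
# Stand-in for estimate_weekly_excess(), so the sketch is self-contained.
# ASSUMPTION: the result columns here are hypothetical placeholders.
estimate_stub <- function(yy) {
  list(results = data.frame(variable = yy, excess = NA_real_))
}

vars <- c('dpw.Los Angeles County',
          'dpw.San Francisco Bay Area',
          'dpw.Inland Empire')

# Run one analysis per variable and keep the full output of each.
fits <- lapply(vars, estimate_stub)
names(fits) <- vars

# Stack the `results` element of every analysis into one data.frame.
RR <- do.call(rbind, lapply(fits, function(r) r$results))
```

Keeping the named list fits alongside RR leaves the per-variable plots and simulations available for later inspection.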