Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add an article about fitting models with seroreversion #205

Open
ntorresd opened this issue Sep 9, 2024 · 0 comments
Open

Add an article about fitting models with seroreversion #205

ntorresd opened this issue Sep 9, 2024 · 0 comments

Comments

@ntorresd
Copy link
Member

ntorresd commented Sep 9, 2024

In #200 @ben18785 suggested we should add an article about fitting models with seroreversion using the example wrote in the PR description:

To illustrate the current pipeline, consider a disease with constant FOI λ = 0.02 and seroreversion rate μ = 0.01 :

foi <- data.frame(
  year = 1951:2000,
  foi = rep(0.01, 50)
)
seroreversion_rate <- 0.01

and a serological survey with the following features:

survey_features <- data.frame(
  age_min = 1:50,
  age_max = 1:50,
  sample_size = runif(0,100)
)

$ age_min     <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26,…
$ age_max     <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26,…
$ sample_size <dbl> 51, 35, 14, 86, 70, 34, 50, 98, 51, 93, 78, 25, 36, 40, 15, 43, 79, 96, 96, 64, 56, 31, 84, 45

We can use simulate_serosurvey(), introduced in #199, to simulate statistically consistent seropositive counts n_seropositive for each age group, according to our set of serological models. In this case, I use the time-varying model:

serosurvey <- simulate_serosurvey(
  model = "time",
  foi = foi,
  survey_features = survey_features,
  seroreversion_rate = seroreversion_rate
) %>% mutate(
  survey_year = max(foi$year)
)

$ age_min        <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, …
$ age_max        <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, …
$ sample_size    <dbl> 51, 35, 14, 86, 70, 34, 50, 98, 51, 93, 78, 25, 36, 40, 15, 43, 79, 96, 96, 64, 56, 31, 84,…
$ n_seropositive <int> 4, 1, 2, 12, 14, 2, 9, 19, 12, 27, 18, 6, 14, 12, 5, 19, 28, 31, 45, 26, 25, 13, 37, 30, 42$ survey_year     <int> 2000, 2000, 2000, 2000, 2000, 2000, 2000, 2000, 2000, 2000, 2000, 2000, 2000, 2000, 2000, 2

which we can visualise using plot_serosurvey:

plot_serosurvey(serosurvey = serosurvey)

image

Implementing the constant model with and without seroreversion by means of fit_seromodel():

# initialization function for sampling
init <- function() {
  list(foi_vector = rep(0.1, nrow(survey_features)))
}

# constant model without seroreversion
seromodel_constant_no_serorev <- fit_seromodel(
  serosurvey = serosurvey,
  model = "constant",
  foi_prior = sf_uniform(0, 10),
  init = init
)

# constant model with seroreversion
seromodel_constant_serorev <- fit_seromodel(
  serosurvey = serosurvey,
  model = "constant",
  foi_prior = sf_uniform(0, 10),
  is_seroreversion = TRUE,
  seroreversion_prior = sf_uniform(0, 1),
  init = init
)

Visualizing the results:

plot_no_serorev <- plot_seromodel(
  seromodel = seromodel_constant_no_serorev,
  serosurvey = serosurvey
)

plot_serorev <- plot_seromodel(
  seromodel = seromodel_constant_serorev,
  serosurvey = serosurvey
)

cowplot::plot_grid(plot_no_serorev, plot_serorev)

image

Note that the model estimating the seroreversion rate (right panel in the image), no only is a better fit for this serological survey according to the elpd value, but it also accurately estimates both the FOI and seroreversion rate we used to simulate the serosurvey.

Another difference with previous versions lies in how we visualize the estimated parameters. Since the constant model is estimating a single FOI value, we only show the estimated value with its corresponding credible interval (we use to plot it instead). However, for the time and age varying models we estimate several values of the FOI in the time/age span of the survey, which we visualize graphically. Take for instance the time varying model with seroreversion:

# implementing the time-varying model with seroreversion
seromodel_time_serorev <- fit_seromodel(
  serosurvey = serosurvey,
  model = "time",
  foi_prior = sf_uniform(0, 10),
  is_seroreversion = TRUE,
  seroreversion_prior = sf_uniform(0, 1),
  init = init,
  iter = 4000
)
# plotting
plot_time_serorev <- plot_seromodel(
  seromodel = seromodel_time_serorev,
  serosurvey = serosurvey
)

image

Here we plot the estimated values of the FOI as a function of time (blue line and shadow) and the R-hat estimates for each estimated value (black dots). Note that, even though we ran the model for a large number of iterations - 4000, the model did not converged (since there are R-hat values over 1.01). We can improve the convergence of the model by estimating less values of the FOI. To specify the time intervals for which FOI values will be estimated, we index them by means of get_foi_index():

foi_index <- get_foi_index(
  serosurvey = serosurvey,
  group_size = 5
  )

> foi_index
 [1]  1  1  1  1  1  2  2  2  2  2  3  3  3  3  3  4  4  4  4  4  5  5  5  5  5  6  6  6  6  6  7  7  7  7  7  8  8
[38]  8  8  8  9  9  9  9  9 10 10 10 10 10

This function returns a list of indexes that labels each year (or age, for age-varying models). The idea is that a single FOI value will be estimated for each index, meaning that 10 FOI values will be estimated in this particular case:

init <- function() {
  list(foi_vector = rep(0.1, max(foi_index)))
}

seromodel_time_serorev <- fit_seromodel(
  serosurvey = serosurvey,
  model = "time",
  foi_prior = sf_uniform(0, 10),
  is_seroreversion = TRUE,
  seroreversion_prior = sf_uniform(0, 1),
  foi_index = foi_index,
  init = init,
  iter = 4000
)

plot_seromodel(
  seromodel_time_serorev,
  serosurvey = serosurvey
)

image

Now the model converges. As long as foi_index length covers the whole time/age span of the serological survey, less regular indexations foi_index can be specified.

This example shows how to use the seroreversion rate estimation features of the package using a simulated dataset as benchmark, which allows to validate the results.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant