Add functions to simulate datasets to the seroprevalence_data module.… #83

ntorresd · 2023-07-04T21:20:16Z

This PR partly closes #57

fix: add an exception to plot_foi() so that a FOI trend can still be plotted along with the data when their lengths do not match
clean test_visualisation
feature: add three functions to simulate datasets. get_sim_counts() generates a list with simulated counts by age following a binomial distribution. generate_sim_data() uses the counts generated by get_sim_counts() to create a dataframe with the necessary structure to use other functions of the package. group_sim_data() serves to group the previously generated dataset by age group; right now it groups the data by periods of 5 years.
add test_simulate_data to test the data simulation functions in the seroprevalence_data module

* fix R/stanmodels folders
* Final functions, pending modification
* Delete Funciones eliminadas.docx
* Delete Explicación del código del paquete SEROFOI.docx
* Documentation in English and Spanish
* Documentation ready
* modify package structure
* modify package structure
* modify names and folder test
* Creation of general modules and revision of functions
* refactoring visualization functions and define some general pck structure
* update readme file
* update readme file
* general settings
* modify dependencies, document and name functions

Co-authored-by: Zulma M. Cucunubá <[email protected]>
Co-authored-by: megamezl <[email protected]>
Co-authored-by: megamezl <[email protected]>

* remove serodata .Rdata and .Rd files
* remove R/serodata.R
* doc: update functions documentation replacing for in examples
* doc: update serofoi logo
* doc: update README and vignettes
  This commit replaces the removed preloaded dataset `serodata` with the identical `chagas2012`.
* fix: minor correction to test_modelling


ben18785

Thanks @ntorresd -- looking better. We still need much more extensive tests of the functions though as these functions are really key to lots of other functionality (including how we test that the Stan models are themselves working):

The get_sim_probability function needs value-based-testing. I.e. do the probabilities it outputs match theoretically derived ones?
Same for get_sim_data.
There are a variety of internal functions that also need checking, which I think are currently lacking unit tests.
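A value-based test of this kind could look like the sketch below. It does not call the package's `get_sim_probability`, whose signature is not shown in this thread; `sim_probability()` is a hypothetical stand-in that assumes a yearly FOI ordered from the oldest year to the survey year and no seroreversion.

```r
# Hypothetical stand-in (NOT the package's get_sim_probability()):
# seroprevalence by age under a yearly, time-varying FOI without seroreversion.
# `foi` is assumed to be ordered from the oldest year to the survey year.
sim_probability <- function(foi) {
  # A person aged `a` at the survey was exposed during the last `a` years
  cumulative_foi <- cumsum(rev(foi))
  1 - exp(-cumulative_foi)
}

# Constant FOI: the closed form is 1 - exp(-foi * age)
foi_constant <- rep(0.02, 50)
ages <- seq_along(foi_constant)
stopifnot(isTRUE(all.equal(sim_probability(foi_constant),
                           1 - exp(-0.02 * ages))))

# Step FOI: no transmission until the last 10 years before the survey
foi_step <- c(rep(0, 40), rep(0.1, 10))
expected_step <- 1 - exp(-0.1 * pmin(ages, 10))
stopifnot(isTRUE(all.equal(sim_probability(foi_step), expected_step)))
```

The step case is useful because everyone older than 10 must share the same probability, which a purely structural test would not catch.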

ben18785 · 2023-09-08T10:19:42Z

R/seroprevalence_data.R

+                          step = 5) {
+    age <- sim_data[[col_age]]
+    sim_data$age_group <-  get_age_group(age = age, step = step)
+    sim_data_grouped <- sim_data %>% group_by(age_group) %>%


Minor thing but can we move the group_by to the next line for consistency when using the pipe?
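For illustration, the suggested layout puts one verb per line after the data. The snippet is only a sketch: the toy data, the `cut()` call standing in for `get_age_group()`, and the `summarise()` step are assumptions, since the rest of the original pipeline is not visible in the diff.

```r
library(dplyr)

# Toy data; column names and the summary step are illustrative only
sim_data <- data.frame(
  age = 1:10,
  n_seropositive = c(0, 1, 1, 2, 2, 3, 3, 4, 4, 5)
)

# Stand-in for get_age_group(): 5-year bins labelled by interval
sim_data$age_group <- cut(sim_data$age, breaks = seq(0, 10, by = 5))

# group_by() starts on the line after the data, like every other step
sim_data_grouped <- sim_data %>%
  group_by(age_group) %>%
  summarise(n_seropositive = sum(n_seropositive))
```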

tests/testthat/test_get_sim.R

ben18785 · 2023-09-08T10:21:30Z

tests/testthat/test_sim_data.R

+    sim_data <- generate_sim_data(foi = foi_sim,
+                                  sample_size_by_age = sample_size_by_age,
+                                  tsur = 2050,
+                                  birth_year_min = 2000,
+                                  survey_label = 'foi_sim',
+                                  seed = seed)


I know this function is stochastic but we can still test it works by varying the FOIs and checking that in limits it behaves as we'd like it to.

Right now I'm relying on the model implementation to test this functionality. My idea is to use a suitable model for each FOI trend. A constant `foi_sim` should be correctly approximated by the "constant" model at all times, so I check that it lies within the confidence interval obtained by fitting this model:

    # Define constant FoI for simulations
    case_label <- "constant_foi_"
    foi_model <- "constant"
    foi_sim <- rep(0.02, tsur - birth_year_min)
    max_lambda <- 0.035

    # Generate simulated data and run the "constant" model
    sim_data <- generate_sim_data(foi = foi_sim,
                                  sample_size_by_age = sample_size_by_age,
                                  tsur = tsur,
                                  birth_year_min = birth_year_min,
                                  survey_label = 'foi_sim',
                                  seed = seed)
    sim_seromodel <- run_seromodel(sim_data,
                                   foi_model = foi_model,
                                   n_iters = n_iters)

    # Check consistency between foi_sim and the fitted foi
    foi <- rstan::extract(sim_seromodel$seromodel_fit, "foi", inc_warmup = FALSE)[[1]]
    foi_lower <- apply(foi, 2, function(x) quantile(x, 0.05))
    foi_upper <- apply(foi, 2, function(x) quantile(x, 0.95))
    expect_true(all((foi_sim >= foi_lower) & (foi_sim <= foi_upper)))

I implemented this for a constant FOI and for the following smooth-decreasing FOI (red lines in the images):

    # Smooth-descending FoI
    case_label <- "smth_dec_foi_"
    foi_model <- "tv_normal"
    foi_max <- 0.2
    stretch <- 0.15
    x <- 1:(tsur - birth_year_min)
    foi_sim <- (-foi_max * (atan(stretch * (x - 25))) / (0.5 * pi) + foi_max) / 2
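As a quick sanity check on the shape of this trend, the sketch below evaluates it with `tsur = 2050` and `birth_year_min = 2000`. Those two values are assumptions borrowed from the test call in the diff above; the reply itself leaves them as variables.

```r
# Assumed values, taken from the generate_sim_data() call in the diff
tsur <- 2050
birth_year_min <- 2000

foi_max <- 0.2
stretch <- 0.15
x <- 1:(tsur - birth_year_min)
foi_sim <- (-foi_max * atan(stretch * (x - 25)) / (0.5 * pi) + foi_max) / 2

# The curve is an arctangent step: it tends to foi_max far before x = 25,
# equals foi_max / 2 at x = 25, and tends to 0 far after it
stopifnot(isTRUE(all.equal(foi_sim[25], foi_max / 2)))
stopifnot(all(diff(foi_sim) < 0))                 # strictly decreasing
stopifnot(all(foi_sim > 0 & foi_sim < foi_max))   # stays inside (0, foi_max)
```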

I think it's worth opening a separate issue to discuss this in more detail, in case you're not convinced by my approach.

We should be able to test whether data we obtain from simulating from these models is as expected without resorting to solving the inverse problem (as this leaves us liable to a number of issues, e.g. the FOIs aren't identifiable given the data). We have analytical results for all of the models which should allow us to test these directly. E.g. for a constant FOI model then we will know approximately the probability that someone aged x is seropositive; if we use a large sample size, we can check that the simulated proportion is near to that value.
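A minimal sketch of such a direct test for the constant-FOI case, in base R only. It does not call `generate_sim_data()`; the binomial draw below is an assumed stand-in for the simulation step, and `prob_seropositive()` is a hypothetical helper that encodes the analytical result 1 - exp(-foi * age).

```r
# Hypothetical helper: under a constant FOI and no seroreversion, a person
# aged `age` is seropositive with probability 1 - exp(-foi * age)
prob_seropositive <- function(foi, age) {
  1 - exp(-foi * age)
}

set.seed(1234)
foi <- 0.02
ages <- 1:50
sample_size <- 1e5  # large, so that sampling noise is small

# Assumed stand-in for the simulation step: binomial counts by age
n_seropositive <- rbinom(length(ages), size = sample_size,
                         prob = prob_seropositive(foi, ages))
prop_simulated <- n_seropositive / sample_size
prop_expected <- prob_seropositive(foi, ages)

# Allow about 5 binomial standard errors of slack at each age
tolerance <- 5 * sqrt(prop_expected * (1 - prop_expected) / sample_size)
stopifnot(all(abs(prop_simulated - prop_expected) <= tolerance))
```

In a testthat file the final check would typically become an `expect_true()` or an `expect_equal()` with an explicit tolerance. The same pattern extends to the limits mentioned earlier: a FOI of zero should give no seropositives, and a very large FOI should give almost all seropositives.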

tracelac and others added 30 commits August 8, 2022 18:45

Initial commit

fee9be3

Adding package skeleton

d157dce

plot for logo

1665553

basic model

0f9a2a8

redme

7345609

comments

f048640

typo

c82f4ef

typo

5684116

Merge pull request #1 from TRACE-LAC/dev-feature-basicmodel

babb4f8

creating base line of package

Update README.md

de08085

fix R/stanmodels folders

21f6a6b

Final functions, pending modification

226ad64

Delete Funciones eliminadas.docx

02395e3

Delete Explicación del código del paquete SEROFOI.docx

d373f1e

Documentation in English and Spanish

8dde532

Merge branch 'dev-feature-strmodel' of github.com:TRACE-LAC/serofoi into dev-feature-strmodel

43320fe

Documentation ready

f8295c4

modify package structure

8aee8eb

modify package structure

4749504

modify names and folder test

a74b3d3

Creation of general modules and revision of functions

4621e97

Merge branch 'dev-feature-strmodel-miguel' of github.com:TRACE-LAC/serofoi into dev-feature-strmodel-miguel

3293b70

refactoring visualization functions and define some general pck structure

5522dba

update readme file

d67b2d4

update readme file

69b41d0

general settings

6f6d804

Merge branch 'dev' into dev-feature-strmodel-miguel

26be6ca

variable renaming, module names, programming syntax, and starting function separation by model.

d8a0db0

renaming variables, inputs and outputs

df169bd

doc: minor correction to function documentation

72ea70b

minor correction to function `get_age_group` documentation

ntorresd changed the base branch from main to dev August 22, 2023 22:54

ntorresd and others added 4 commits August 22, 2023 18:09

doc: remove function group_sim_data from export

77e7730

remove function `group_sim_data` from export and the corresponding example

refac: simplify function get_age_group

c2c8661

Remove unnecessary parameters from `get_age_group`. Now the function takes an age vector as input rather than a dataframe containing a specific age column.

Merge branch 'dev' into refac-fit_seromodel

6443a8c

ntorresd mentioned this pull request Aug 23, 2023

Add test suite #93

Closed

jpavlich and others added 5 commits August 24, 2023 10:09

Merge pull request #109 from epiverse-trace/refac-fit_seromodel

5154176

Simplify fit_seromodel() output and related refactorizations

test: remove data writing from test

e3341a6

Remove data writing from `test_simulate_data.R` as well as the corresponding .csv files.

clean: remove unused data paths from test

a336dc4

Remove unused data paths from `test_simulate_data`

test: add tests for get_sim_ functionalities

0676497

Add tests for `get_sim_probability` and `get_sim_n_seropositive`

test(refac): modify test for data sim functions

8d0ebba

Modify test for `generate_sim_data` and `group_sim_data` functions. Remove unnecessary model running. Remove `expect_doppelganger` tests.

ntorresd force-pushed the main-fix-simdata branch from 6eff129 to 8d0ebba Compare September 4, 2023 17:38

ntorresd requested a review from ben18785 September 4, 2023 17:41

Fixes #115 (#116)

308c0bd

ben18785 requested changes Sep 8, 2023

ntorresd requested review from ben18785 and removed request for ben18785 September 11, 2023 17:11

jpavlich and others added 6 commits September 15, 2023 03:52

82 tidyverse dependency (#120)

70e4a81

* Enabling CMD check when doing pull request on `dev`
* Removed tidyverse dependency

Merge branch 'dev' into main-fix-simdata

7ef8f00

add test for values from get_sim_probability

2bc678d

Add `get_sim_probability` to export to enable testing. Test the values of the probabilities.

test: add statistical test for generate_sim_data

c1469dc

The idea of the test is to make sure that the foi trend used to simulate the data is in the confidence interval of a suitable model. Add possibility to test for:
- constant FoI
- smooth-descendent FoI

Minor clean up of `plot_foi`.

add pracma to dependencies

1700b04

add parameter descriptions in generate_sim_data

e54abe7

ntorresd requested a review from ben18785 September 27, 2023 21:50

Bisaloo closed this Oct 10, 2023

Bisaloo force-pushed the dev branch from 47dae2b to cb19769 Compare October 10, 2023 16:15

ntorresd deleted the main-fix-simdata branch October 11, 2023 13:10

Bisaloo mentioned this pull request Oct 25, 2023

Add data simulation functions #126

Merged

ntorresd mentioned this pull request Jul 31, 2024

Simulation functions and vignette plus other things #199

Merged
