Modelling workflow #61

Closed
ben18785 opened this issue Jun 16, 2023 · 6 comments

@ben18785
Collaborator

At the moment, a user does the following to fit their model:

  1. They prepare their data into the format required by the prepare_serodata function
  2. They run prepare_serodata to get data in a form required by the modelling (essentially some additional columns are added to the dataset)
  3. They run run_seromodel to fit the model.

I'd suggest that users don't really need or want to see prepare_serodata, so they could pass the raw data directly to run_seromodel.
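To make the two call patterns concrete, here is a minimal sketch in Python, for illustration only. The functions below are hypothetical stand-ins that borrow the names used in this thread; their bodies, the `prepared` flag, and the returned dictionary are assumptions and do not reflect the package's actual implementation.

```python
# Illustrative stand-ins only: these are NOT the package's real functions.
# The "prepared" flag and the returned summary are placeholders.

def prepare_serodata(rows):
    """Return copies of the rows with a placeholder derived column added."""
    return [{**row, "prepared": True} for row in rows]


def run_seromodel(rows):
    """Fit a placeholder model, preparing the data first if needed."""
    if not all(row.get("prepared") for row in rows):
        rows = prepare_serodata(rows)  # preparation happens internally
    return {"n_rows": len(rows), "prepared": all(r["prepared"] for r in rows)}


raw = [{"age_min": 1, "age_max": 5, "number_seropositive": 2, "number_tested": 10}]

# Current workflow: two explicit steps.
fit_two_step = run_seromodel(prepare_serodata(raw))

# Suggested workflow: raw data goes straight in.
fit_one_step = run_seromodel(raw)
```

In this sketch both routes give the same result, which is the point of the suggestion: the preparation step still runs, but the user no longer has to call it.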

@zmcucunuba
Member

Hi Ben, I think prepare_serodata is helpful because it gives the user a step to think about the data, perhaps accompanied by warnings or errors flagging any issues with the data before running the models.

@ben18785
Collaborator Author

Hi @zmcucunuba -- thanks. I agree that users certainly need warnings if their data aren't in the right form. But (sorry, playing Devil's advocate), couldn't that just be done when they call run_seromodel?

Is there another reason a user would want to have the object returned by prepare_serodata?

@zmcucunuba
Member

zmcucunuba commented Jun 16, 2023

Haha, I guess you're right @ben18785! Perhaps it's just me being extremely step-by-step-oriented.

@ekamau
Collaborator

ekamau commented Jun 21, 2023

Just to add: what is the bare minimum information required for the different models to run, i.e. the minimum that should be supplied in the user's input data?
Functions like run_seromodel could then indicate that.
In most instances, I think one needs at a minimum: age, years of survey, number_seropositive, number_tested. I could be wrong!

But I guess it also depends on how much user interaction the workflow requires, or on how complex the models are, in which case the user would need to give more information.
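A minimum-columns check of the kind discussed above could look roughly like the following Python sketch. It is illustrative only: the column names follow the tentative list in this comment, `survey_year` and `check_minimum_columns` are made-up names, and the count check is an assumed example of a warning rather than anything the package is known to do.

```python
import warnings

# Column names are assumptions taken from the tentative list above;
# "survey_year" is a hypothetical name for the year of survey.
REQUIRED_COLUMNS = ("age", "survey_year", "number_seropositive", "number_tested")


def check_minimum_columns(rows):
    """Raise if a required column is absent; warn on implausible counts."""
    for i, row in enumerate(rows):
        missing = [c for c in REQUIRED_COLUMNS if c not in row]
        if missing:
            raise ValueError(f"row {i} is missing required columns: {missing}")
        if row["number_seropositive"] > row["number_tested"]:
            warnings.warn(f"row {i}: more seropositive than tested")
    return rows
```

Running such a check at the start of the model-fitting call would surface data problems without requiring a separate user-facing preparation step.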

@ben18785
Collaborator Author

@ntorresd is going to look at letting run_seromodel optionally run the models without the preprocessing step.

@ntorresd
Member

Closed by #200. From v1.0.1 on, the only preprocessing needed for modelling is to add the age group marker age_group, which is built from age_min and age_max whenever it's missing in the survey. See the discussions in #191 and #193 for further details.
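As a rough illustration of that remaining step, the Python sketch below fills in `age_group` only where it is missing. The rule used here (floored midpoint of `age_min` and `age_max`) is an assumption made for the example, and `add_age_group` is a hypothetical name; the discussions referenced above are the place to check how the package actually builds it.

```python
def add_age_group(rows):
    """Fill in age_group where absent, leaving existing values untouched.

    ASSUMPTION: age_group is taken as the floored midpoint of age_min and
    age_max. This rule is illustrative, not confirmed by the thread.
    """
    out = []
    for row in rows:
        row = dict(row)  # copy so the caller's data is not modified
        if row.get("age_group") is None:
            row["age_group"] = (row["age_min"] + row["age_max"]) // 2
        out.append(row)
    return out
```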
