From 486c14d43b9f8093173ecc03645c25ebe90406dc Mon Sep 17 00:00:00 2001
From: Zachary Susswein <46581799+zsusswein@users.noreply.github.com>
Date: Mon, 26 Aug 2024 14:46:46 -0400
Subject: [PATCH] Document package scope in README (#11)

* Document package scope
* Bump NEWS
* Apply suggestions from code review

Co-authored-by: Adam Howes

* Define wrappers
* Add explicit references to intended package functionality
* Add @athowes clarifying suggestion
* Add specificity on pipeline contribution
* %s/ EpiNow2 / `{EpiNow2}`
* Tweak metadata description

---------

Co-authored-by: Adam Howes
---
 NEWS.md   |  1 +
 README.md | 35 ++++++++++++++++++++++++++++++++++-
 2 files changed, 35 insertions(+), 1 deletion(-)

diff --git a/NEWS.md b/NEWS.md
index 2c1159dc..250c6f66 100644
--- a/NEWS.md
+++ b/NEWS.md
@@ -4,3 +4,4 @@
 * CI running on Ubuntu only & working pkgdown deploy to Github Pages
 * Initial R package with checks running in CI
 * Updated DESCRIPTION and added guidelines for package authorship
+* Set up README with explanation of purpose and scope
diff --git a/README.md b/README.md
index c1b78e01..5539c84a 100644
--- a/README.md
+++ b/README.md
@@ -1,8 +1,41 @@
-# CFA EpiNow2 Pipeline
+# CFA `{EpiNow2}` Pipeline
 
 ## Overview
 
 A lightweight wrapper around [{EpiNow2}](https://github.com/epiforecasts/EpiNow2) to add functionality for deployment in Azure Batch.
+It holds some helper functions to interface with Azure services, convert input data to EpiNow2's expected input format, and save expected outputs.
+It also adds metadata and logging.
+
+This package is meant to enhance the `{EpiNow2}` package to support deployment in CFA's computational environment.
+The code is open source as part of CFA's goals around development, but it may not be possible to support extensions to additional environments.
+
+## Structure
+
+This repository holds an R package, `{CFAEpiNow2Pipeline}`.
+The repository is structured as a standard R package.
+All PRs must pass `R CMD check` in the CI suite before they can be merged to main.
+If you are interested in contributing, see `CONTRIBUTING.md` and open an issue or a PR.
+
+The package contains some adapters and wrappers to run many independent `{EpiNow2}` models in parallel with cloud resources.
+The adapters read from datasets with standardized formats and produce outputs as flat files with standard names.
+The wrapper functions enhance `{EpiNow2}` functionality to support cloud deployments, adding more logging and standardizing the R environment.
+
+This package standardizes the interface to `{EpiNow2}` for purposes of deployment in a pipeline as part of a suite of models.
+This package does _not_ manage pipeline deployment or kickoff, data extraction and transformation, or model output visualization.
+
+## Components
+
+This package implements functions for:
+
+1. **Configuration**: Loads parameters such as prior distributions, generation intervals, and right-truncation from a config in a standard schema, with the path to this config passed at runtime.
+   - The config is validated at runtime, but config generation is specified at pipeline runtime and is not part of this package.
+1. **Data load**: Loads data from the CFA data lake or from a local environment and translates it from CFA's schema to the expected `{EpiNow2}` format.
+   - Paths are specified via the config.
+1. **Parameters**: Loads pre-specified and -validated generation interval, delay interval, and right-truncation distributions from the CFA data lake or from a local environment and formats them for use in `{EpiNow2}`.
+1. **Model run**: Manages R environment to run `{EpiNow2}` from a fixed random seed, both for `{EpiNow2}` initialization and Stan sampling.
+1. **Outputs**: Provides functionality to process `{EpiNow2}` model fits to a standardized flat output format (as described in forthcoming link). Within the pipeline, model fits are saved both in their entirety as `.rds` files and in this flat output format.
+1. **Logging**: Steps in the pipeline have comprehensive R-style logging via the [cli](https://github.com/r-lib/cli) package.
+1. **Metadata**: Extracts comprehensive metadata on the model run and stores it alongside outputs.
 
 ## Project Admin
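
Reviewer note on the **Model run** item in the README changes above: the idea of running from a fixed random seed can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's implementation. `run_with_seed()` and `toy_fit()` are hypothetical names that do not exist in `{CFAEpiNow2Pipeline}` or `{EpiNow2}`, and `toy_fit()` merely stands in for a model-fitting call that accepts a `seed` argument.

```r
# Hypothetical helper (not part of {CFAEpiNow2Pipeline} or {EpiNow2}):
# seed R's RNG, then call `fit_fn`, forwarding the same seed so the
# fitting function can hand it on to its own sampler.
# Note: this changes the global RNG state of the calling session.
run_with_seed <- function(seed, fit_fn, ...) {
  stopifnot(is.numeric(seed), length(seed) == 1L)
  set.seed(seed)
  fit_fn(..., seed = seed)
}

# Stand-in for a model fit (assumption for illustration only):
# draws random numbers from R's RNG and records the seed it was given.
toy_fit <- function(n, seed) {
  list(draws = rnorm(n), sampler_seed = seed)
}

fit_a <- run_with_seed(42L, toy_fit, n = 5)
fit_b <- run_with_seed(42L, toy_fit, n = 5)

# Both runs start from the same seed, so the results match exactly.
identical(fit_a, fit_b)
```

In a real run, `fit_fn` would presumably wrap the `{EpiNow2}` call and pass the seed on to Stan; how the package wires this up is not specified in this patch.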