Here you will find all the different scripts and tools that we use to generate the data.
Several metrics are reported, each with its own independent pipeline, so the overall data pipeline can seem complex. This file explains the most relevant processes in use. We believe that transparency is a must and that it helps developers contribute to the project.
We are currently working to consolidate all the different pipelines in our cowidev library.
| Folder | Description |
|---|---|
| grapher | Contains output files that power our grapher visualizations. |
| input | External files used to compute derived metrics, such as X-per capita, and aggregate groups, such as 'Asia', etc. |
| notebook | Notebooks used for development purposes (not maintained). |
| src | cowidev library. It contains the code for almost all of the project's pipelines. |
| scripts | Legacy folder. Contains some parts of the code, such as the COVID-19 testing collection scripts. The code is a mixture of R and Python scripts. |
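
As a rough illustration of what "derived metrics" and "aggregate groups" mean here, the sketch below computes a per-million value and a regional total in plain Python. It is not the cowidev implementation: the function names, the location names, and all numbers are hypothetical and made up for the example.

```python
from collections import defaultdict

# Made-up reference data standing in for the external files in `input`.
POPULATION = {"Country A": 1_000_000, "Country B": 4_000_000}
GROUP = {"Country A": "Asia", "Country B": "Asia"}


def per_million(value, population):
    """Scale a raw count to a rate per one million people."""
    return value * 1_000_000 / population


def aggregate_by_group(values, groups):
    """Sum raw counts over every location that belongs to the same group."""
    totals = defaultdict(int)
    for location, value in values.items():
        totals[groups[location]] += value
    return dict(totals)


# Made-up raw counts for a single metric.
cases = {"Country A": 500, "Country B": 1000}

cases_per_million = {
    location: per_million(value, POPULATION[location])
    for location, value in cases.items()
}
group_totals = aggregate_by_group(cases, GROUP)
```

Note that a group's per-capita value should be derived from the group's summed count and summed population, not by averaging the per-capita values of its members.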
Note that the folder public/data must not be modified by hand, as it contains output files generated by this pipeline. Exceptions may include refactoring the output folder, among others.
📁 Find it at scripts/vaccinations/
It is probably the most mature and complex process. It has a submodule in the cowidev library that provides the tools needed to run all country importers.
More info:
📁 Find it at scripts/testing/
Its architecture closely resembles that of the vaccination pipeline, but it differs in some key points. The most noticeable difference is that it contains both R and Python code. We currently prefer contributions in Python.
More info:
📁 Find it at cowidev.excess_mortality
Collects excess mortality data and exports it in a human-readable format.
More info:
Run it:
python -m cowidev.excess_mortality etl
📁 Find it at cowidev.yougov
Run it:
python -m cowidev.yougov
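
If you want to trigger these module commands from a Python script instead of the shell, a minimal sketch could look like the following. It assumes the cowidev library is installed in the current environment. The `PIPELINES` mapping and the helper names `build_command` and `run_pipeline` are hypothetical and not part of cowidev; only the two module invocations come from the "Run it" commands in this file.

```python
import subprocess
import sys

# Module invocations copied from the "Run it" commands in this README.
PIPELINES = {
    "excess_mortality": ["cowidev.excess_mortality", "etl"],
    "yougov": ["cowidev.yougov"],
}


def build_command(name):
    """Return the argument list equivalent to `python -m <module> [args]`."""
    # sys.executable ensures the same interpreter (and environment) is used.
    return [sys.executable, "-m", *PIPELINES[name]]


def run_pipeline(name):
    """Run one pipeline and raise CalledProcessError if it exits non-zero."""
    return subprocess.run(build_command(name), check=True)
```

For example, `run_pipeline("excess_mortality")` would be the scripted counterpart of `python -m cowidev.excess_mortality etl`.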
📁 Find them at scripts/
Old README content can be found at README-old.md