Authors: Lianna Hovhannisyan, John Lee, Vadim Taskaev, Vanessa Yuen
The British Columbia Centre for Disease Control (BCCDC) manages a range of provincial programs and clinics that contribute to public health and help control the spread of disease in BC. It publishes daily COVID-19 data for British Columbia via the COVID-19 webpage (under "Download the data"), both in CSV format, organized by case-, lab- and region-specific features, and in a comprehensive ArcGIS format.

This package uses the daily case-specific COVID-19 data. It lets users conveniently download the latest case data and, for a specified date range, compute several key statistics, visualize time series progression along age-related and regional parameters, and generate histogram figures for on-demand exploratory data analysis.

COVID-19 case detail parameters extracted using this package:
- Reported_Date (in YYYY-MM-DD format)
- HA (provincial health authority, e.g., "Vancouver Coastal Health")
- Sex (M or F)
- Age_Group (reported in 10-year age group bins, e.g., "60-69")
- Classification_Reported (diagnosis origin, e.g., "Lab-diagnosed")
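As a rough illustration of the record layout listed above, the following sketch parses a few rows with those five columns using only the Python standard library. The sample rows are made up for illustration and are not real BCCDC records, and `parse_cases` is a hypothetical helper, not part of bccovideda.

```python
import csv
import io
from datetime import date

# Made-up sample rows for illustration only; not real BCCDC records.
SAMPLE_CSV = """Reported_Date,HA,Sex,Age_Group,Classification_Reported
2021-01-05,Fraser,F,60-69,Lab-diagnosed
2021-01-05,Interior,M,20-29,Lab-diagnosed
2021-01-06,Fraser,M,60-69,Lab-diagnosed
"""

EXPECTED_COLUMNS = [
    "Reported_Date",
    "HA",
    "Sex",
    "Age_Group",
    "Classification_Reported",
]


def parse_cases(text):
    """Parse case-detail CSV text into a list of dicts with typed dates.

    Hypothetical helper for illustration; not a bccovideda function.
    """
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    missing = [col for col in EXPECTED_COLUMNS if col not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    rows = []
    for row in reader:
        # fromisoformat raises ValueError on strings that are not valid ISO dates
        row["Reported_Date"] = date.fromisoformat(row["Reported_Date"])
        rows.append(row)
    return rows


cases = parse_cases(SAMPLE_CSV)
print(len(cases), cases[0]["Reported_Date"], cases[0]["Age_Group"])
```

Converting Reported_Date to a date object up front makes later date-range comparisons straightforward.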
bccovideda can be installed from PyPI using the following terminal command:
$ pip install bccovideda
- get_data()
  - This function downloads the latest detailed daily case-specific COVID-19 data from BCCDC's dedicated COVID-19 homepage. It returns a dataframe containing the extracted raw data.
- show_summary_stat()
  - This function computes summary statistics from the available case-specific parameters, such as age-related and regional aggregate metrics. It returns a dataframe listing key summary statistics for the time interval queried.
- plot_line_by_date()
  - This function returns a line chart of daily case counts, based on the parameters and grouping selected by the user, for the time interval queried.
- plot_hist_by_cond()
  - This function returns a histogram based on the parameters and grouping selected by the user, for the time interval queried, allowing for on-demand exploratory data analysis.
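The functions above all work on a queried time interval and on age-related or regional groupings. The sketch below shows what that kind of filtering and grouping could look like in plain Python. It is not the package's implementation: `filter_by_date` and `count_by` are hypothetical helpers, and the records are made up for illustration.

```python
from collections import Counter
from datetime import date

# Made-up records for illustration only; not real BCCDC data.
CASES = [
    {"Reported_Date": date(2021, 1, 5), "HA": "Fraser", "Age_Group": "60-69"},
    {"Reported_Date": date(2021, 1, 5), "HA": "Interior", "Age_Group": "20-29"},
    {"Reported_Date": date(2021, 1, 6), "HA": "Fraser", "Age_Group": "60-69"},
    {"Reported_Date": date(2021, 2, 1), "HA": "Northern", "Age_Group": "30-39"},
]


def filter_by_date(cases, start, end):
    """Keep records whose Reported_Date lies in [start, end] (YYYY-MM-DD strings)."""
    start_d = date.fromisoformat(start)
    end_d = date.fromisoformat(end)
    if start_d > end_d:
        raise ValueError("start date must not be after end date")
    return [c for c in cases if start_d <= c["Reported_Date"] <= end_d]


def count_by(cases, column):
    """Count records per distinct value of the given column."""
    return Counter(c[column] for c in cases)


january = filter_by_date(CASES, "2021-01-01", "2021-01-30")
by_age = count_by(january, "Age_Group")
by_region = count_by(january, "HA")

print(by_age.most_common(1))  # age group with the most cases in the interval
print(dict(by_region))        # case counts per health authority
```

Counts like these are the raw material for both a summary table and a histogram; the package itself returns dataframes and charts rather than plain counters.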
bccovideda can be used to download case data, compute summary statistics, generate exploratory data analysis histogram plots, and plot time series charts as follows:
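# Import the four package functions
from bccovideda.get_data import get_data
from bccovideda.show_summary_stat import show_summary_stat
from bccovideda.plot_hist_by_cond import plot_hist_by_cond
from bccovideda.plot_line_by_date import plot_line_by_date
# Download the latest case data as a dataframe
get_data()
# Summary statistics for the queried date range
show_summary_stat("2022-01-01", "2022-01-13")
# Histogram for the queried date range, using the "Age" condition
plot_hist_by_cond("2021-01-01", "2021-01-30", "Age")
# Line chart of daily case counts for the queried date range
plot_line_by_date("2021-01-01", "2021-01-30")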
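To make the idea of a daily case-count series concrete, here is a minimal sketch of how such a series could be built from a list of report dates. It is an illustration only: `daily_counts` is a hypothetical helper, the dates are made up, and filling days without cases with zero is an assumption rather than documented package behaviour.

```python
from collections import Counter
from datetime import date, timedelta


def daily_counts(report_dates, start, end):
    """Return (day, count) pairs for every day in [start, end].

    Days with no reported cases get a count of zero (an assumption made
    for this sketch, not confirmed bccovideda behaviour).
    """
    start_d = date.fromisoformat(start)
    end_d = date.fromisoformat(end)
    counts = Counter(d for d in report_dates if start_d <= d <= end_d)
    n_days = (end_d - start_d).days + 1
    days = [start_d + timedelta(days=i) for i in range(n_days)]
    return [(day, counts[day]) for day in days]


# Made-up report dates for illustration only.
dates = [date(2021, 1, 1), date(2021, 1, 1), date(2021, 1, 3), date(2021, 2, 9)]
series = daily_counts(dates, "2021-01-01", "2021-01-04")

for day, count in series:
    print(day.isoformat(), count)
```

A series in this shape, one row per day, is what a line chart of daily counts would typically be drawn from.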
Because up-to-date aggregate COVID-19 data is readily accessible, and because the pandemic has had a persistent socio-economic impact since early 2020, several comprehensive Python packages already perform similar data extraction and exploratory data analysis, such as covid, covid19pyclient, and covid19pandas. In contrast to these packages, bccovideda provides a simple user interface focused on the provincial context of British Columbia. It uses features specific to BCCDC's data conventions to give a quick overview and on-demand analysis of trends and statistics for age-related and regional case characteristics.
- Python 3.9 and Python packages:
  - pandas==1.3.5
  - requests==2.27.1
  - altair==4.2.0
  - altair-saver==0.5.0
Documentation for bccovideda can be found at Read the Docs.
Interested in contributing? Check out the contributing guidelines. Please note that this project is released with a Code of Conduct. By contributing to this project, you agree to abide by its terms.
Group 25 Contributors:
- Lianna Hovhannisyan: @liannah
- John Lee: @johnwslee
- Vadim Taskaev: @vtaskaev1
- Vanessa Yuen: @imtvwy
The bccovideda project was created by DSCI 524 (Collaborative Software Development) Group 25 within the Master of Data Science program at the University of British Columbia (2021-2022). It is licensed under the terms of the MIT license.
bccovideda was created with cookiecutter and the py-pkgs-cookiecutter template.