Data Resources & Project Ideas
Rini-Veeravalli edited this page Jun 24, 2021 · 1 revision
You can use whatever software, languages, or technology you like (e.g. Python, R, D3, wearables, Raspberry Pi, pen and paper).
Each team is responsible for finding its own data; alternatively, you can use our suggested databases and project ideas.
Any data used must be publicly available.
Examples of public and open data sources that may be used during the Hackathon:
- 2019 Coronavirus dataset (January - February 2020).
- https://www.ncbi.nlm.nih.gov/genbank/sars-cov-2-seqs/
- https://github.com/CSSEGISandData/COVID-19
- https://openprescribing.net/
- https://cloud.google.com/blog/products/data-analytics/free-public-datasets-for-covid19
- Daily Google Covid Trends: daily Google Trends data for cities in several countries across the world, for certain days during the coronavirus period
- https://github.com/GoogleCloudPlatform/covid-19-open-data
- https://data.england.nhs.uk/
- https://coronavirus.data.gov.uk/details/download
- https://www.cqc.org.uk/about-us/transparency/using-cqc-data
- https://fingertips.phe.org.uk/profile/atlas-of-variation
- MIMIC-III (Medical Information Mart for Intensive Care III): a large, freely available database comprising deidentified health-related data associated with over forty thousand patients who stayed in critical care units of the Beth Israel Deaconess Medical Center between 2001 and 2012
- Oxford COVID-19 Government Response Tracker (OxCGRT): collects publicly available information on 17 indicators of government responses
- YouGov COVID-19 polling data: polling on governments' handling of COVID-19
- nCov19: patient-level data for Chinese patients with COVID-19
- ONS COVID-19 data: all the government-published data on COVID-19, including care home deaths, price increases, and deaths by race and income
- UK Data Service: UK social, economic, and population data
- data.gov.uk
- Open Data Institute (ODI) certified datasets
- European Union Open Data portal
- Office for National Statistics
- https://zenodo.org/
- https://www.re3data.org/
- https://yoda.yale.edu/
- https://figshare.com/
- https://datadryad.org/
- https://www.kaggle.com/datasets
- https://data.mendeley.com/
- Google Trends
- ImageNet: image database
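Several of the sources above publish time series as CSV files. As a minimal sketch, assuming a wide layout with one row per region and one column per date, the following reshapes such a file into per-country totals and day-on-day increases using only the Python standard library. The column names (`Province/State`, `Country/Region`, `Lat`, `Long`), the function names, and the sample values are assumptions made for illustration; check the layout of whichever file you actually download.

```python
import csv
import io
from collections import defaultdict

# Columns that describe the region rather than a date (assumed layout).
ID_COLUMNS = {"Province/State", "Country/Region", "Lat", "Long"}


def country_totals(csv_text):
    """Sum cumulative counts per country for every date column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    date_columns = [c for c in reader.fieldnames if c not in ID_COLUMNS]
    totals = defaultdict(lambda: [0] * len(date_columns))
    for row in reader:
        country = row["Country/Region"]
        for i, col in enumerate(date_columns):
            # Treat empty cells as zero.
            totals[country][i] += int(row[col] or 0)
    return date_columns, dict(totals)


def daily_increase(cumulative):
    """Turn a cumulative series into day-on-day increases."""
    if not cumulative:
        return []
    return [cumulative[0]] + [b - a for a, b in zip(cumulative, cumulative[1:])]


# Made-up sample in the assumed layout; not real figures.
SAMPLE = """Province/State,Country/Region,Lat,Long,1/22/20,1/23/20,1/24/20
,Examplia,0.0,0.0,1,3,6
North,Otherland,1.0,1.0,0,2,2
South,Otherland,2.0,2.0,1,1,4
"""

if __name__ == "__main__":
    dates, totals = country_totals(SAMPLE)
    for country, series in totals.items():
        print(country, dict(zip(dates, daily_increase(series))))
```

Keeping the parsing separate from the download step means the same functions can be tried on a small hand-written sample before being pointed at a full file.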
Examples of project ideas:
- Prediction model for ICU using MIMIC-III data
- Data mining to collect disease (phenotype and signs & symptoms) data from rarediseases.org or raredisease.info.nih.gov text, and produce an output that automatically updates as the site information changes over time
- Compare the most-searched Google keywords across the world during COVID-19 (or a particular time period within it). You could also compare a specific country's or city's trends with current affairs.
- Correlation between COVID-19 symptoms and comorbidities, or clustering, using nCov19 data
- Analysis of effective government policies using YouGov polling data
- COVID-19 deaths by demographic group, or economic impact, using ONS data
- Bias in health data: build a tool to help visualise bias in datasets, e.g. how your cohort (and its representativeness) changes when you change a value.
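The bias idea can be prototyped in a few lines. The sketch below is one possible starting point, not a prescribed method: it recomputes the share of each group in a cohort as an inclusion threshold changes. The function name `cohort_composition`, the fields `age` and `sex`, and the records themselves are hypothetical and made up for illustration.

```python
from collections import Counter


def cohort_composition(records, group_key, min_age):
    """Share of each group among records with age >= min_age."""
    cohort = [r for r in records if r["age"] >= min_age]
    counts = Counter(r[group_key] for r in cohort)
    size = len(cohort)
    # An empty cohort has no composition to report.
    return {g: n / size for g, n in counts.items()} if size else {}


# Made-up records for illustration only; not real patient data.
RECORDS = [
    {"age": 25, "sex": "F"},
    {"age": 34, "sex": "M"},
    {"age": 47, "sex": "F"},
    {"age": 52, "sex": "M"},
    {"age": 68, "sex": "M"},
    {"age": 71, "sex": "M"},
]

if __name__ == "__main__":
    # Raising the minimum age shifts the sex balance of the cohort.
    for min_age in (18, 40, 60):
        print(min_age, cohort_composition(RECORDS, "sex", min_age))
```

In this toy sample, raising the minimum age removes every female record by the time the threshold reaches 60, which is the kind of shift a visualisation tool would make visible.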
To reiterate, your team's project output can be in any format.
We'd like to thank our code club members and organisers for suggesting these example datasets and project ideas.