
Data Resources & Project Ideas

Rini-Veeravalli edited this page Jun 24, 2021 · 1 revision

Resources

You can use whatever software, languages, or technology you like (Python, R, D3, wearables, Raspberry Pi, pen and paper).
Each team is responsible for finding its own data, or you can use our suggested databases and project ideas.

Data Sources

Any data used must be publicly available.
Examples of public and open data sources that may be used during the Hackathon:

Health datasets

General datasets

Example Project Ideas

  • Prediction model for ICU using MIMIC-III data
  • Data mining to collect disease data (phenotypes, signs and symptoms) from the text of rarediseases.org or rarediseases.info.nih.gov, producing an output that updates automatically as the site information changes over time
  • Compare the most-searched Google keywords across the world during COVID (or a particular time period within it), perhaps comparing a specific country's or city's trends with current affairs
  • Correlation between COVID symptoms and comorbidities, or clustering, using nCov19 data
  • Analysis of effective government policies using YouGov polling data
  • COVID deaths by demographic/economic impact using ONS data
  • Bias in health data: making a tool to help visualise bias in datasets, for example how representative your cohort remains when you change a value
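
As a starting point for the last idea, here is a minimal sketch of how a cohort's make-up can shift when one selection value changes. Everything in it is an assumption made for illustration: the toy `COHORT` records are invented, and the function names (`sex_shares`, `filter_by_min_age`, `share_shift`) are hypothetical, not part of any suggested dataset or library.

```python
from collections import Counter

# Hypothetical toy cohort of (age, sex) pairs, invented purely for illustration.
COHORT = [
    (34, "F"), (71, "M"), (58, "F"), (45, "M"),
    (80, "M"), (29, "F"), (66, "M"), (52, "F"),
]


def sex_shares(records):
    """Return the share of each sex in `records` as a dict of fractions."""
    counts = Counter(sex for _, sex in records)
    total = sum(counts.values())
    return {sex: n / total for sex, n in counts.items()} if total else {}


def filter_by_min_age(records, min_age):
    """Keep only the records whose age is at least `min_age`."""
    return [r for r in records if r[0] >= min_age]


def share_shift(records, min_age):
    """Change in each group's share after applying the minimum-age filter."""
    before = sex_shares(records)
    after = sex_shares(filter_by_min_age(records, min_age))
    # A group that disappears entirely after filtering has a share of 0.
    return {sex: after.get(sex, 0.0) - before[sex] for sex in before}


if __name__ == "__main__":
    # Print how the balance of the cohort moves as the age threshold rises.
    for threshold in (0, 50, 65):
        print(threshold, share_shift(COHORT, threshold))
```

A real tool would read a public dataset and plot these shifts; the same before-and-after comparison applies to any attribute and any filter.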

To reiterate, your team's project output can be in any format.

Acknowledgements

We'd like to thank our code club members and organisers for suggesting these example datasets and project ideas.
