- towards interpretable, reusable, reproducible phenotype computation in clinical data science.
- a project funded by Health Data Research UK: Graph-Based Data Federation for Healthcare Data Science
Extracting patient phenotypes from routinely collected health data (such as Electronic Health Records) requires translating clinically sound phenotype definitions into queries or computations that clinical researchers can execute on the underlying data sources. This demands significant knowledge and skill in dealing with heterogeneous and often imperfect data. Such translations are time-consuming, error-prone and, most importantly, hard to share and reproduce across different settings. This project implements a knowledge-driven phenotyping framework that
- decouples the specification of phenotype semantics from underlying data sources;
- can automatically populate and conduct phenotype computations on heterogeneous data spaces.
This framework has been deployed on five Scottish health datasets.
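
To make the decoupling idea concrete, here is a minimal Python sketch. It is not the project's implementation: the class names (`Phenotype`, `DataSource`), the function `compute_phenotype`, the concept label and all codes and patient ids are hypothetical, and a plain dictionary stands in for whatever knowledge representation the framework actually uses. It only illustrates how one phenotype definition, written against shared concepts, could be evaluated on two sources that use different local codes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phenotype:
    """Source-independent definition: a name plus the shared concepts that define it."""
    name: str
    concepts: frozenset


@dataclass
class DataSource:
    """A dataset with its own coding scheme (all values here are made up)."""
    name: str
    concept_to_codes: dict  # shared concept -> set of local codes
    records: list           # (patient_id, local_code) pairs


def compute_phenotype(phenotype, source):
    """Return the ids of patients in `source` with a record matching the phenotype."""
    # Translate the shared concepts into this source's local codes.
    local_codes = set()
    for concept in phenotype.concepts:
        local_codes |= source.concept_to_codes.get(concept, set())
    # Select patients with at least one matching record.
    return {pid for pid, code in source.records if code in local_codes}


# One definition, written once, with no reference to any data source.
diabetes = Phenotype("type_2_diabetes", frozenset({"CONCEPT_T2DM"}))

# Two hypothetical sources that encode the same concept differently.
source_a = DataSource(
    "source_a",
    {"CONCEPT_T2DM": {"A-001"}},
    [("p1", "A-001"), ("p2", "A-999")],
)
source_b = DataSource(
    "source_b",
    {"CONCEPT_T2DM": {"B-17", "B-18"}},
    [("p3", "B-18"), ("p4", "B-42")],
)

results = {s.name: compute_phenotype(diabetes, s) for s in (source_a, source_b)}
print(results)
```

In this sketch, supporting a new dataset means supplying only its concept-to-code mapping; the phenotype definition itself stays unchanged, which is what makes it shareable across settings.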