Undergrad research @ GT

I worked on two scripts, gp_ICD10_vectors.py and Tree_to_hierarchy.py, which automate some preprocessing steps for UK Biobank data.

The Tree_to_hierarchy.py script takes a data-coding schema used in the UK Biobank, determines all the parent codes of a given ICD10 code, and writes this information to a file. The script was written around the assumptions of datacoding-19 but generalizes to other UK Biobank data-coding schemas.
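
As a rough illustration of the parent-code lookup (not the actual contents of Tree_to_hierarchy.py), here is a minimal sketch. It assumes the coding file is tab-separated with `coding`, `node_id`, and `parent_id` columns and that top-level nodes point to a `parent_id` that has no row of its own. The function names and the sample rows are hypothetical.

```python
import csv
import io


def build_parent_map(tsv_text):
    """Map each coding to its parent's coding (None for top-level nodes)."""
    rows = list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))
    # Assumed layout: node_id identifies a row, parent_id points at another row.
    node_to_code = {row["node_id"]: row["coding"] for row in rows}
    return {row["coding"]: node_to_code.get(row["parent_id"]) for row in rows}


def ancestors(code, parent_map):
    """Return the parent codes of `code`, nearest parent first."""
    chain = []
    parent = parent_map.get(code)
    while parent is not None:
        chain.append(parent)
        parent = parent_map.get(parent)
    return chain


# Hypothetical three-row sample in the assumed column layout.
SAMPLE = (
    "coding\tmeaning\tnode_id\tparent_id\tselectable\n"
    "Chapter IX\tDiseases of the circulatory system\t1\t0\tN\n"
    "Block I10-I15\tHypertensive diseases\t2\t1\tN\n"
    "I10\tEssential (primary) hypertension\t3\t2\tY\n"
)

parent_map = build_parent_map(SAMPLE)
print(ancestors("I10", parent_map))
```

Because the lookup only relies on the `node_id` to `parent_id` links, the same approach should carry over to any other hierarchical coding that follows this layout.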

The gp_ICD10_vectors.py script goes through the record-level general practitioner (GP) data, determines each patient's age at first diagnosis, and stores these ages in a matrix (NumPy arrays).
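
The first-diagnosis step could look something like the sketch below. This is an assumption about the layout rather than the real gp_ICD10_vectors.py: it treats each record as a hypothetical `(patient_id, icd10_code, age_at_event)` tuple and builds a patients-by-codes matrix, with NaN marking codes a patient was never diagnosed with.

```python
import numpy as np


def first_diagnosis_matrix(records, patient_ids, codes):
    """Build a (patients x codes) matrix of age at first diagnosis."""
    row = {p: i for i, p in enumerate(patient_ids)}
    col = {c: j for j, c in enumerate(codes)}
    # NaN means "no diagnosis recorded" (an assumed convention).
    ages = np.full((len(patient_ids), len(codes)), np.nan)
    for pid, code, age in records:
        if pid not in row or code not in col:
            continue  # skip patients or codes outside the requested axes
        i, j = row[pid], col[code]
        # Keep only the earliest age seen for this patient/code pair.
        if np.isnan(ages[i, j]) or age < ages[i, j]:
            ages[i, j] = age
    return ages


# Hypothetical records: patient 1 has two I10 entries, the earlier one wins.
records = [
    (1, "I10", 52.3),
    (1, "I10", 47.8),
    (2, "E11", 60.1),
]
matrix = first_diagnosis_matrix(records, patient_ids=[1, 2], codes=["I10", "E11"])
print(matrix)
```

Fixing the row and column order up front keeps the matrix aligned with whatever patient and code lists are used elsewhere in the pipeline.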

I also worked on creating the custom dataloaders for our PyTorch model.