Early Warning Tool for prioritizing individuals for CKD screenings (based on risk of CKD)

Formulation

For every patient

who has had a visit in the past x years (x = 2 for now, where being "seen" means having had a clinical visit)
has not been diagnosed with CKD yet (based on diagnoses for now, to be extended to medications and abnormal eGFRs)
and has not had an eGFR in the past y months

Predict the top k individuals (with k set by intervention capacity) who are at risk of having an abnormal eGFR in the next z months
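
The formulation above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Patient record layout, the function names, and the 30-day approximation of a month are hypothetical and do not reflect the project's actual schema or its Triage cohort query.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class Patient:
    # Hypothetical record layout, not the project's schema.
    patient_id: str
    last_visit: Optional[date]     # most recent clinical visit, if any
    ckd_diagnosis: Optional[date]  # date of first CKD diagnosis, if any
    last_egfr: Optional[date]      # date of most recent eGFR test, if any


def in_cohort(p: Patient, as_of: date, y_months: int, x_years: int = 2) -> bool:
    """Apply the three cohort conditions as of a given date."""
    visit_cutoff = as_of - timedelta(days=365 * x_years)
    egfr_cutoff = as_of - timedelta(days=30 * y_months)  # month approximated as 30 days
    seen_recently = p.last_visit is not None and visit_cutoff <= p.last_visit <= as_of
    # A diagnosis dated after as_of is not yet known at as_of, so the patient stays eligible.
    no_ckd_yet = p.ckd_diagnosis is None or p.ckd_diagnosis > as_of
    no_recent_egfr = p.last_egfr is None or p.last_egfr < egfr_cutoff
    return seen_recently and no_ckd_yet and no_recent_egfr


def top_k(scores: dict, k: int) -> list:
    """Return the k patient ids with the highest risk scores."""
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

Here k is whatever the screening program can act on, so the ranking matters only for the first k positions.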

Analysis to be done

Predict risk of CKD stage 3 or above in the next 12 months (we can vary this)
Baselines:
1. current practice
2. clinical guidelines
3. CDC-adopted screening tool
Metric: Precision (PPV) at top k (warning: k still needs to be determined based on capacity)
Fairness metric: TPR disparity by race, gender, SES, access, etc.
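
As a rough sketch of the two metrics above: precision at top k is the share of the k highest-scored patients who turn out positive, and TPR disparity compares recall across groups when those same k patients are flagged. The function names and the choice of a ratio against a reference group are assumptions for illustration. Triage computes its own versions of these metrics, and this is not its implementation.

```python
def precision_at_k(scores, labels, k):
    """Fraction of the k highest-scored patients whose label is positive.

    Assumes 0 < k <= len(scores) and labels coded as 0/1.
    """
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return sum(labels[i] for i in ranked) / k


def tpr_by_group(scores, labels, groups, k):
    """True positive rate within each group when the top k patients are flagged."""
    flagged = set(sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k])
    tpr = {}
    for g in set(groups):
        positives = [i for i in range(len(labels)) if groups[i] == g and labels[i] == 1]
        if positives:  # TPR is undefined for a group with no positive labels
            tpr[g] = sum(i in flagged for i in positives) / len(positives)
    return tpr


def tpr_disparity(tpr, reference_group):
    """Ratio of each group's TPR to the reference group's TPR.

    Raises ZeroDivisionError if the reference group's TPR is 0.
    """
    ref = tpr[reference_group]
    return {g: value / ref for g, value in tpr.items()}
```

A disparity near 1 means the group's true positives are flagged at about the same rate as the reference group's; values well below 1 mean that group's cases are being missed more often.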

Methodology

Define Cohort based on formulation
Define Outcome/Label based on formulation (will get diagnosed with X in the next z months)
Define Training and Validation sets over time
Define and generate predictors
Train Models on each training set and score all patients in the corresponding validation set
Evaluate all models for each validation time according to metric (PPV at top k)
Select "Best" model based on results over time
Explore the model to understand who it flags, how they compare to the cohort, important predictors
Check and/or correct for bias issues
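
The "training and validation sets over time" step can be illustrated with a simplified sketch. It assumes first-of-month dates, a 12-month label window, and hypothetical function names. Triage derives its own splits from the temporal settings in the config file, so this shows the idea rather than Triage's logic.

```python
from datetime import date


def add_months(d: date, months: int) -> date:
    """Shift a date by whole months, returning the first day of the resulting month."""
    total = d.year * 12 + (d.month - 1) + months
    return date(total // 12, total % 12 + 1, 1)


def temporal_splits(data_start, data_end, label_months=12, step_months=12):
    """Return (train_as_of, validate_as_of) pairs that move forward in time.

    Labels for a training date look label_months ahead, so the validation date
    is placed label_months later and never overlaps the training label window.
    A split is kept only if the validation labels are fully observed by data_end.
    Assumes step_months > 0.
    """
    splits = []
    train_as_of = data_start
    while True:
        validate_as_of = add_months(train_as_of, label_months)
        if add_months(validate_as_of, label_months) > data_end:
            break
        splits.append((train_as_of, validate_as_of))
        train_as_of = add_months(train_as_of, step_months)
    return splits
```

Each model is trained on one pair's training date and scored on that pair's validation date, which is what makes it possible to compare models on results over time.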

Triage background

We are using Triage to build and select models. Some background and tutorials on Triage:

Tutorial on Google Colab - Completely new to Triage? Run through a quick tutorial hosted on Google Colab (no setup necessary) to see what Triage can do.
Dirty Duck Tutorial - Want a more in-depth walkthrough of Triage's functionality and concepts? Go through the Dirty Duck tutorial with sample data.
QuickStart Guide - Try Triage out with your own project and data
Suggested workflow
Understanding the configuration file

Running models and triage

Assuming Triage is installed and the data is in a PostgreSQL database, run the following:

activate the virtual environment: source env/bin/activate
python run.py -c configfilename

Choices to Make

replace flag (set to false until we want to nuke everything)
save predictions (don't for the beginning)
number of processors to use
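
One way to keep these three choices explicit is to gather them in a single settings object with cautious defaults. The sketch below is hypothetical: the class, its field names, and the idea of leaving a core free are illustrative assumptions and do not describe run.py's or Triage's actual options.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class RunChoices:
    # Hypothetical container; field names are illustrative only.
    replace: bool = False           # keep False until a full rebuild is wanted
    save_predictions: bool = False  # off for early exploratory runs
    n_processes: int = 1            # number of processors to use


def default_choices(reserve_cores: int = 1) -> RunChoices:
    """Use the cautious defaults and leave reserve_cores processors unused."""
    available = os.cpu_count() or 1
    return RunChoices(n_processes=max(1, available - reserve_cores))
```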

Config files, Model Selection, and Bias Analysis

cohort:All

cohort: all patients who've had a visit in the past 2 years and do not have CKD yet
label: will get diagnosed with CKD in the next 12 months
config file, notebook with model selection

cohort:No previous abnormal eGFRs

cohort: all patients who've had a visit in the past 2 years, do not have CKD yet, and have no previous abnormal eGFRs
label: will get diagnosed with CKD in the next 12 months
config file, notebook with model selection

