Inpatient Claim Simulation and Fraud Detection

This is the repository for the "Simulation and Detection of Healthcare Fraud in German Inpatient Claims Data" paper submitted to ICCS 2024 in the Health Thematic Track.

Description

This project contains two parts, Claims Simulation and Fraud Detection.

The Simulator generates German inpatient claims according to the regulations valid in 2021. Fraud is then injected by altering the generated claims.

The fraud types included are:

Increases in ventilation hours
Changing vaginal births to cesarean sections
Decreasing the weight of newborns
Adding the need for personal care to a newborn's treatment
Discharging patients from the hospital too early ("bloody release")
Changing the order of ICD codes
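
As an illustration of the first fraud type in the list above, the sketch below inflates the ventilation hours of a randomly chosen share of claims. It is a minimal sketch and not the repository's implementation: the `Claim` fields, the function `inject_ventilation_fraud`, and its parameters are hypothetical names chosen for this example.

```python
import random
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Claim:
    """Hypothetical, heavily simplified claim record."""
    claim_id: int
    ventilation_hours: int
    is_fraud: bool = False


def inject_ventilation_fraud(
    claims: List[Claim], fraction: float, extra_hours: int, seed: int = 0
) -> List[Claim]:
    """Return a copy of `claims` in which a random share has inflated ventilation hours."""
    rng = random.Random(seed)
    n_fraud = round(len(claims) * fraction)
    fraud_ids = {c.claim_id for c in rng.sample(claims, n_fraud)}
    return [
        # Altered claims get more hours and a fraud label; all others are kept as-is.
        replace(c, ventilation_hours=c.ventilation_hours + extra_hours, is_fraud=True)
        if c.claim_id in fraud_ids
        else c
        for c in claims
    ]


clean = [Claim(claim_id=i, ventilation_hours=10 * i) for i in range(10)]
altered = inject_ventilation_fraud(clean, fraction=0.3, extra_hours=24)
```

Keeping the original list untouched and returning a labelled copy makes it easy to compare clean and fraudulent versions of the same claim later on.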

Factors not simulated:

inpatient wards
the outcome of a treatment (cured, death, etc.)
vacations during long hospital stays
the reason for admission

The Detection part uses the generated data to train models. All tested algorithms are taken from scikit-learn.

The models with the best results are Gradient Boosting and Random Forest.

Visuals

Claims Simulation

1. Start Simulation: Patients and Hospitals

2. Initialize Treatment: Get ICD- and OPS-Codes, ventilation, duration

3. Adjust Treatments: to coding guidelines

4. Inject Fraud: following the fraud patterns

5. Finishing up: adjusting the fraudulent claims to coding guidelines and calculating claims

More visualizations and UML diagrams can be found in the directory doc.
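
The five steps listed above can be read as a sequential pipeline in which each stage takes the claims produced by the previous one. The following is a rough sketch of that ordering only: all function names, dictionary keys, and values are placeholders invented for this example and do not mirror the repository's code.

```python
from typing import Callable, Dict, List

Claim = Dict[str, object]
Stage = Callable[[List[Claim]], List[Claim]]


def start_simulation(claims: List[Claim]) -> List[Claim]:
    # Step 1: create patients and assign hospitals (placeholder IDs).
    return [{"patient_id": i, "hospital_id": i % 2} for i in range(4)]


def initialize_treatment(claims: List[Claim]) -> List[Claim]:
    # Step 2: attach codes, ventilation and duration (placeholders, not real codes).
    return [
        {**c, "icd_codes": ["ICD_PRIMARY", "ICD_SECONDARY"], "ops_codes": [],
         "ventilation_hours": 0, "duration_days": 3}
        for c in claims
    ]


def adjust_to_guidelines(claims: List[Claim]) -> List[Claim]:
    # Step 3: a real implementation would enforce the coding guidelines here.
    return [{**c, "guideline_checked": True} for c in claims]


def inject_fraud(claims: List[Claim]) -> List[Claim]:
    # Step 4: mark every second claim as fraudulent (arbitrary rule for the demo).
    return [
        {**c, "is_fraud": i % 2 == 0, "guideline_checked": False}
        for i, c in enumerate(claims)
    ]


def finish_up(claims: List[Claim]) -> List[Claim]:
    # Step 5: re-check the guidelines; claim calculation is left out of this sketch.
    return adjust_to_guidelines(claims)


STAGES: List[Stage] = [
    start_simulation, initialize_treatment, adjust_to_guidelines, inject_fraud, finish_up,
]


def run_pipeline(stages: List[Stage]) -> List[Claim]:
    claims: List[Claim] = []
    for stage in stages:
        claims = stage(claims)
    return claims


claims = run_pipeline(STAGES)
```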

Installation

Download this repository
Install requirements with pip:

pip install -r requirements.txt

Install a DRG-Grouper. This project uses IMC Navigator from IMC Clinicon (https://www.imc-clinicon.de/tools/imc-navigator/index_ger.html).
Adjust config_template.py to your requirements and save it as config.py

IMPORTANT: This project is built and tested with Python 3.9!
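
Since the project is only tested with Python 3.9, a small start-up guard can warn early when a different interpreter is used. This is an optional sketch; the helper `is_tested_python` is a hypothetical name and not part of the repository.

```python
import sys
import warnings

TESTED_VERSION = (3, 9)


def is_tested_python(version=sys.version_info[:2]) -> bool:
    """Return True if `version` has the major and minor number the project was tested with."""
    return tuple(version[:2]) == TESTED_VERSION


if not is_tested_python():
    # Only warn: other versions may work, they are just untested.
    warnings.warn(
        f"This project is tested with Python 3.9, "
        f"but is running on {sys.version_info.major}.{sys.version_info.minor}."
    )
```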

Usage

Generation

Install the code and adjust config_template.py as described in Installation.

In case you want to use another DRG-Grouper, you need to modify grouper_wrapping.py accordingly.

If everything is set up, execute from the project's root directory:

python simulation/simulate.py

Make sure your config.py is configured correctly.

If everything works, several CSV files are generated in the directory data/generated data:

claims.csv: initial inpatient treatments, without fraud, DRGs, or claims
claims_with_fraud.csv: claims.csv with injected fraud
claims_with_drg.csv: claims_with_fraud.csv after grouping the treatments
claims_final.csv: final inpatient treatments
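
To see which treatments were altered, claims.csv and claims_with_fraud.csv can be compared row by row. The sketch below assumes that both files share an identifier column and the same set of columns. The column names `id` and `ventilation_hours` are hypothetical, and the demo uses in-memory stand-ins rather than the real files.

```python
import csv
import io
from typing import Dict, List, TextIO


def changed_fields(original: TextIO, altered: TextIO, key: str = "id") -> Dict[str, List[str]]:
    """Map each row key to the names of the columns whose values differ."""
    before = {row[key]: row for row in csv.DictReader(original)}
    diffs: Dict[str, List[str]] = {}
    for row in csv.DictReader(altered):
        base = before.get(row[key])
        if base is None:
            continue  # row has no counterpart in the original file
        cols = [name for name, value in row.items() if base.get(name) != value]
        if cols:
            diffs[row[key]] = cols
    return diffs


# In-memory stand-ins; with real data, pass open file handles instead.
original = io.StringIO("id,ventilation_hours\n1,0\n2,30\n")
altered = io.StringIO("id,ventilation_hours\n1,0\n2,96\n")
print(changed_fields(original, altered))  # {'2': ['ventilation_hours']}
```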

Detection

First, preprocess your data according to preprocessing.py. Then select your classifier by commenting out everything else (to train all classifiers in one run, do not change anything). To train the models, execute:

python detection/classifying.py

The models trained are saved in the directory models.
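
The following is a minimal sketch of a training step with the two best-performing model families, Gradient Boosting and Random Forest. It does not reproduce classifying.py: the synthetic data from `make_classification`, the hyperparameters, and the choice of accuracy as the metric are assumptions made purely for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for preprocessed claims: roughly 10% of samples are labelled fraud.
X, y = make_classification(
    n_samples=1000, n_features=10, n_informative=5, weights=[0.9, 0.1], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

models = {
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))
    print(f"{name}: accuracy={scores[name]:.3f}")
```

With imbalanced fraud labels, accuracy alone can look good even for a model that rarely flags fraud, so precision and recall on the fraud class are usually worth reporting as well.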

Data

The simulated data used for training the machine learning algorithms can be accessed at zenodo.org.

Support

If you have questions, contact me or create an issue.

Roadmap

This code is not maintained anymore. Further necessary developments:

Improve the OPS-Code generation
Model the treatment outcome
Simulate inpatient ward (via simulating outpatient treatment)
etc.

Authors and acknowledgment

Special thanks to my supervisors René Raab, Kai Klede and Prof. Dr. Bjoern Eskofier.

Furthermore, thanks to AOK Bayern and Dominik Schirmer for providing the necessary validation data.

Thanks to IMC Clinicon and Gunter Damian for giving me access to IMC Navigator, a certified DRG Grouper.

Project status

Until further notice, development of this project stopped after 29.11.2023. Feel free to contact me (see Support) if you have ideas and use cases for collaboration.
