Medical Event Data Standard

This organization hosts the GitHub repositories for the Medical Event Data Standard (MEDS), a simple dataset schema for machine learning over electronic health record (EHR) data. Unlike existing tools, pipelines, or common data models, MEDS is a minimal standard designed for maximum interoperability across datasets, existing tools, and model architectures. By providing a simple standardization layer between datasets and model-specific code, MEDS can help make machine learning research on EHR data dramatically more reproducible, robust, computationally performant, and collaborative. Alongside the draft proposal, we also release several existing integrations with models, datasets, and tools, and we will work actively with the community on further adoption and use. See our draft proposal for more details, and please leave comments or questions via GitHub issues to help us improve this effort!

Software Ecosystem

Project	Type	Documentation URL	Repository URL	Paper URL	Description
Core MEDS	Core	GitHub	GitHub	OpenReview	A data standard and community for building and sharing EHR machine learning tools
MEDS-Reader	Package	Docs	GitHub	arXiv	An optimized Python package for efficient EHR data processing achieving 10-100x improvements in memory, speed, and disk usage
MEDS-Transforms	Package		GitHub		A set of functions and scripts for extracting data into the MEDS format and for transforming and pre-processing MEDS-formatted data.
MEDS-Tab	Package	Docs	GitHub		A library for automated tabularization and data preparation with aggregations and time windowing.
ACES	Package	Docs	GitHub	arXiv	A package and configuration language for reproducible extraction of task cohorts for machine learning over event-stream datasets
MEDS-Torch	Package	Docs	GitHub		Advancing healthcare machine learning through flexible, robust, and scalable sequence modeling tools.
MEDS-Evaluation	Package		GitHub		Evaluation pipeline for MEDS.
MEDS-ETL	Package		GitHub		Efficient ETL that supports OMOP, MIMIC, eICU, and PyHealth.
FEMR	Package		GitHub		A Python package for manipulating longitudinal EHR data for machine learning, with a focus on supporting the creation of foundation models and verifying their presumed benefits in healthcare.
MEDS-DEV	Benchmark		GitHub		A benchmark for evaluating the performance of machine learning models on MEDS-formatted data.

Pretrained Models

CLMBR-T-base: https://huggingface.co/StanfordShahLab/clmbr-t-base

Datasets / Benchmarks

EHRSHOT: https://ehrshot.stanford.edu

Coming Soon...

Tools that are planned to be compatible with MEDS:
