
Overview of OMOP to MEDS Transformations

Michael Wornow edited this page Oct 17, 2024 · 1 revision

This page was accurate as of 2024.10.17 (MEDS 0.3.3, MEDS_ETL 0.3.8).

This page gives a brief overview of all the data transformations applied when converting OMOP data to MEDS.

Source Files

Transformations (applied in order)

1. Drop all except for 11 OMOP tables

  • Link to Code
  • Effect: We drop all OMOP tables except for these 11 tables: person, drug_exposure, visit_occurrence, condition_occurrence, death, procedure_occurrence, device_exposure, measurement, observation, note, visit_detail

2. Set time property of events

  • Link to Code
  • Effect: If the event is from the person table, use birth_datetime (see here). Otherwise, try columns with these suffixes in order: ["_start_datetime", "_start_date", "_datetime", "_date"], and force values that have a date but no time to 23:59:59 (i.e. move them to the end of the day) (see here).
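
The time rule above can be sketched in plain Python. This is an illustration only, not the MEDS_ETL code: the function name `pick_event_time`, the dict-based row, and the `prefix` argument (the text placed before each suffix, e.g. `drug_exposure`) are assumptions made for this example.

```python
from datetime import date, datetime, time

# Suffixes tried in order of preference, as listed in the step above.
SUFFIXES = ["_start_datetime", "_start_date", "_datetime", "_date"]


def pick_event_time(row, table, prefix=None):
    """Return the event time for `row` (a dict of column -> value), or None.

    `prefix` is a hypothetical argument: the string prepended to each
    suffix to form a column name. It defaults to the table name.
    """
    if table == "person":
        return row.get("birth_datetime")
    prefix = table if prefix is None else prefix
    for suffix in SUFFIXES:
        value = row.get(prefix + suffix)
        if value is None:
            continue
        if isinstance(value, datetime):
            return value  # already carries a time component
        # A bare date: move it to the end of that day (23:59:59).
        return datetime.combine(value, time(23, 59, 59))
    return None


row = {
    "drug_exposure_start_datetime": None,
    "drug_exposure_start_date": date(2020, 1, 5),
}
print(pick_event_time(row, "drug_exposure"))  # 2020-01-05 23:59:59
```

The `datetime` check comes first because `datetime` is a subclass of `date` in Python; testing for `date` first would wrongly overwrite existing times.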

3. Set code property of events

  • Link to Code
  • Effect: For the person and death tables, we force the code to be MEDS_BIRTH and MEDS_DEATH, respectively, per the schema definition here. Otherwise, we try assigning the code to be the source concept ID (column with suffix _source_concept_id). If that's not available, then we use the concept ID (column with suffix _concept_id). If that's not available either, then we use a fallback concept ID (defined as 8 for events from the visit_occurrence table, 46235038 for events from the note table, and 4203722 for events from the visit_detail table). We use the following base column names for each table (to which we append the suffixes _source_concept_id, _concept_id) per here:
    • drug_exposure => drug_concept_id
    • visit => visit_concept_id
    • condition_occurrence => condition_concept_id
    • procedure_occurrence => procedure_concept_id
    • device_exposure => device_concept_id
    • measurement => measurement_concept_id
    • observation => observation_concept_id
    • note => note_class_concept_id
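
The fallback order for the code can be sketched as follows. This is a minimal illustration, not the MEDS_ETL implementation: `pick_code` and its arguments are hypothetical, deriving the source column by replacing `_concept_id` with `_source_concept_id` is an assumption, and only `None` is treated as "not available" here.

```python
# Fixed codes and fallback concept IDs as stated in the step above.
FIXED_CODES = {"person": "MEDS_BIRTH", "death": "MEDS_DEATH"}
FALLBACK_CONCEPT_IDS = {
    "visit_occurrence": 8,
    "note": 46235038,
    "visit_detail": 4203722,
}


def pick_code(table, row, concept_id_column=None):
    """Return the code (or concept ID) for one event, or None.

    `row` is a dict of column -> value. `concept_id_column` is the
    table's concept ID column, e.g. "drug_concept_id".
    """
    if table in FIXED_CODES:
        return FIXED_CODES[table]

    candidates = []
    if concept_id_column is not None:
        # Assumption: the source column has the same name with
        # "_source_concept_id" in place of "_concept_id".
        source_column = concept_id_column.replace(
            "_concept_id", "_source_concept_id"
        )
        candidates.append(row.get(source_column))
        candidates.append(row.get(concept_id_column))
    candidates.append(FALLBACK_CONCEPT_IDS.get(table))

    # First non-null candidate wins.
    for candidate in candidates:
        if candidate is not None:
            return candidate
    return None


row = {"drug_source_concept_id": None, "drug_concept_id": 222}
print(pick_code("drug_exposure", row, "drug_concept_id"))  # 222
```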

4. Replace all concept IDs with concept codes

  • Link to Code
  • Effect: Replaces concept IDs with concept codes per the concept table.
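
A minimal sketch of this lookup, assuming the concept table is available as a list of dicts with `concept_id`, `vocabulary_id`, and `concept_code` keys. The function names, the `vocabulary_id/concept_code` string format, and the choice to map unknown IDs to `None` are assumptions for illustration. The sample rows are made up.

```python
def build_concept_map(concept_rows):
    """Map each concept_id to a "vocabulary_id/concept_code" string."""
    return {
        row["concept_id"]: f'{row["vocabulary_id"]}/{row["concept_code"]}'
        for row in concept_rows
    }


def replace_concept_ids(events, concept_map):
    """Return copies of `events` with integer codes replaced by strings.

    Codes that are already strings (e.g. MEDS_BIRTH) are left untouched.
    Concept IDs missing from the map become None in this sketch.
    """
    replaced = []
    for event in events:
        code = event["code"]
        if not isinstance(code, str):
            code = concept_map.get(code)
        replaced.append({**event, "code": code})
    return replaced


# Made-up concept rows, for illustration only.
concept_rows = [
    {"concept_id": 1, "vocabulary_id": "VOCAB_A", "concept_code": "X1"},
    {"concept_id": 2, "vocabulary_id": "VOCAB_B", "concept_code": "Y2"},
]
events = [{"code": 1}, {"code": "MEDS_BIRTH"}, {"code": 999}]
concept_map = build_concept_map(concept_rows)
print(replace_concept_ids(events, concept_map))
```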

5. Set value property of events

  • Link to Code
  • Effect: All events default to value = None. For the following tables, we try the listed columns in order, stopping at the first non-null value:
    • measurement => value_as_number as a number, value_source_value as a string, then value_as_concept_id as a concept ID (see here)
    • observation => value_as_number as a number, value_as_string as a string, then value_as_concept_id as a concept ID (see here)
    • note => only note_text as a string (see here)
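
The per-table column order for the value can be sketched like this. The names `VALUE_COLUMNS` and `pick_value` are hypothetical, and returning the winning column name next to the value is a choice made for this example, not something the source describes.

```python
# Columns tried per table, in order, as described in the step above.
VALUE_COLUMNS = {
    "measurement": ["value_as_number", "value_source_value", "value_as_concept_id"],
    "observation": ["value_as_number", "value_as_string", "value_as_concept_id"],
    "note": ["note_text"],
}


def pick_value(table, row):
    """Return (column, value) for the first non-null value column.

    Returns (None, None) for tables without value columns or when
    every candidate column is null.
    """
    for column in VALUE_COLUMNS.get(table, []):
        value = row.get(column)
        if value is not None:
            return column, value
    return None, None


row = {"value_as_number": None, "value_source_value": "positive"}
print(pick_value("measurement", row))  # ('value_source_value', 'positive')
```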

6. Set metadata property of events

  • Link to Code
  • Effect: Adds the following columns:
    • table => name of OMOP source table
    • visit_id (Optional) => value of visit_occurrence_id column (if exists)
    • unit (Optional) => value of unit_source_value column if exists, otherwise value of unit_concept_id column if exists
    • clarity_table (Optional) => value of load_table_id column (if exists)
    • note_id (Optional) => value of note_id column (if exists)
    • end (Optional) => value of {table_name}_end_datetime column (if exists)
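
As a rough sketch of how these metadata columns could be assembled for a single row (the function name `build_metadata` and the dict-based row are assumptions, and "exists" is interpreted as "the column is present in the row"):

```python
def build_metadata(table, row):
    """Collect the metadata columns listed above for one event."""
    metadata = {"table": table}

    # Optional columns copied over under a new name when present.
    optional = {
        "visit_id": "visit_occurrence_id",
        "clarity_table": "load_table_id",
        "note_id": "note_id",
        "end": f"{table}_end_datetime",
    }
    for key, column in optional.items():
        if column in row:
            metadata[key] = row[column]

    # Prefer unit_source_value, then fall back to unit_concept_id.
    if "unit_source_value" in row:
        metadata["unit"] = row["unit_source_value"]
    elif "unit_concept_id" in row:
        metadata["unit"] = row["unit_concept_id"]

    return metadata


row = {"visit_occurrence_id": 42, "unit_concept_id": 7}
print(build_metadata("measurement", row))
# {'table': 'measurement', 'visit_id': 42, 'unit': 7}
```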

7. Drop all events without a code

  • Link to Code
  • Effect: Drops all events that have code = None

Note: This is all triggered by running meds_etl_omop [OMOP_SOURCE_DIR] [MEDS_OUTPUT_DIR]