ETL Instructions for Mapping ICDO to SNOMED
Eduard Korchmar edited this page Jan 8, 2020
·
6 revisions
Cancer diagnoses are usually represented by a combination of ICD-O-3 histology and topography codes. To map this combination to SNOMED, follow these steps:
- Transform diagnosis SOURCE VALUE
- Histology code. In the source, it is normally formatted like this: 8140/3, where 8140 is the histology type and 3 is the tumor behavior. If histology type and behavior are stored separately, concatenate them with a slash to get a single histology code.
- Topography code. In the source, it is normally formatted like this: C15.3. Be aware of the dot before the fourth character. If the source doesn't have the dot, insert it after the third character: C153 -> C15.3. If the source code contains only 3 characters, add '.9' to get the code for the "unspecified part of" subcategory: C50 -> C50.9.
- Source value. Concatenate the histology code and the topography code using a hyphen: 8140/3-C15.3. This value will be stored in the CONDITION_OCCURRENCE.CONDITION_SOURCE_VALUE field.
- Extract value of diagnosis SOURCE_CONCEPT_ID. The concept ID for the combined histology/topography code is stored in the CONCEPT table. The following SQL shows how to extract its value for the above example:
SELECT CONCEPT_ID FROM CONCEPT WHERE CONCEPT_CODE = '8140/3-C15.3' AND VOCABULARY_ID = 'ICDO3' --Adenocarcinoma, NOS, of upper third of esophagus
The resulting value 44501519 will be stored in the CONDITION_OCCURRENCE.CONDITION_SOURCE_CONCEPT_ID field and will be used in mapping to a standard SNOMED concept or to itself (next step).
- Extract value of STANDARD CONCEPT ID
The source concept ID of the combined histology/topography code is mapped to itself or to a standard concept ID in the CONCEPT_RELATIONSHIP table. The following SQL shows how to extract its value for the above example:
SELECT CONCEPT_ID_2 FROM CONCEPT_RELATIONSHIP WHERE CONCEPT_ID_1 = 44501519 AND RELATIONSHIP_ID = 'Maps to'
The resulting value 36715848 will be stored in the CONDITION_OCCURRENCE.CONDITION_CONCEPT_ID field and/or the EPISODE.EPISODE_OBJECT_CONCEPT_ID field.
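The three steps above can be sketched end to end in Python. This is only an illustration under stated assumptions: the function names are hypothetical, the in-memory SQLite tables are toy stand-ins for the real CONCEPT and CONCEPT_RELATIONSHIP tables (which have more columns), and the only rows loaded are the example values quoted on this page.

```python
import re
import sqlite3
from typing import Optional, Tuple


def normalize_histology(histology: str, behavior: Optional[str] = None) -> str:
    """Return histology as 'NNNN/B'; join type and behavior if stored separately."""
    histology = histology.strip()
    if behavior is not None:
        histology = f"{histology}/{str(behavior).strip()}"
    if not re.fullmatch(r"\d{4}/\d", histology):
        raise ValueError(f"unexpected histology code: {histology!r}")
    return histology


def normalize_topography(topography: str) -> str:
    """Return topography as 'Cnn.n', inserting the dot or '.9' when needed."""
    code = topography.strip().upper().replace(".", "")
    if re.fullmatch(r"C\d{2}", code):  # only 3 characters: C50 -> C50.9
        code += "9"
    if not re.fullmatch(r"C\d{3}", code):
        raise ValueError(f"unexpected topography code: {topography!r}")
    return f"{code[:3]}.{code[3:]}"  # dot after the third character


def build_source_value(histology: str, topography: str,
                       behavior: Optional[str] = None) -> str:
    """Concatenate histology and topography with a hyphen."""
    return f"{normalize_histology(histology, behavior)}-{normalize_topography(topography)}"


def lookup_concepts(conn: sqlite3.Connection,
                    source_value: str) -> Optional[Tuple[int, int]]:
    """Return (source_concept_id, standard_concept_id), or None if not found."""
    return conn.execute(
        """
        SELECT c.concept_id, cr.concept_id_2
        FROM concept c
        JOIN concept_relationship cr
          ON cr.concept_id_1 = c.concept_id
         AND cr.relationship_id = 'Maps to'
        WHERE c.concept_code = ?
          AND c.vocabulary_id = 'ICDO3'
        """,
        (source_value,),
    ).fetchone()


# Toy vocabulary tables (assumption: simplified columns, example rows only).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE concept (concept_id INTEGER, concept_code TEXT, vocabulary_id TEXT);
    CREATE TABLE concept_relationship (
        concept_id_1 INTEGER, concept_id_2 INTEGER, relationship_id TEXT);
    INSERT INTO concept VALUES (44501519, '8140/3-C15.3', 'ICDO3');
    INSERT INTO concept_relationship VALUES (44501519, 36715848, 'Maps to');
    """
)

source_value = build_source_value("8140", "C153", behavior="3")
concept_ids = lookup_concepts(conn, source_value)
```

Here `source_value` goes to CONDITION_SOURCE_VALUE, and the two IDs in `concept_ids` go to CONDITION_SOURCE_CONCEPT_ID and CONDITION_CONCEPT_ID respectively. A single join replaces the two separate queries; running them one after the other, as in the text, gives the same result.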
When the source data are incomplete, apply the following approaches.
- Tumor behavior is not known. Use 1 (uncertain behavior) to complete the code: 8070 -> 8070/1
- Topography is unknown. If you have an ICD-10 code for this diagnosis, use the mappings from this file https://seer.cancer.gov/tools/conversion/ICD03toICD9CM-ICD10-ICD10CM.xls (last 3 tabs of the file) to obtain the topography. Note that a long ICD-10-CM code must be truncated to 5 characters (including the dot): C50.211 -> C50.2. When a patient has several cancer diagnoses, use the ICD-10 code from the date closest to the ICD-O histology date.
- Either Topography or Histology is unretrievable. Use the string value 'NULL' (spelled out) instead of the code. For instance, neoplasm of endometrium without specified morphology will have the code 'NULL-C54.1', and neuronevus without specified site can be coded as '8725/0-NULL'.
REFERENCES
Information about ICDO3 vocabulary is here: http://www.iacr.com.fr/index.php?option=com_content&view=category&layout=blog&id=100&Itemid=577
Information about our approach to mapping is here: http://www.ohdsi.org/web/wiki/lib/exe/fetch.php?media=documentation:oncology:poster2018-improvement_of_cancer_diagnosis_representation_in_omop_cdm3_1_.pdf
Detailed information about ICDO3 implementation in OMOP CDM is here: https://www.ohdsi.org/web/wiki/doku.php?id=documentation:vocabulary:icdo3
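As a supplement to the incomplete-data rules described earlier on this page, the following Python sketch shows one way they might be applied. All function names are hypothetical, and the lookup from ICD-10 to ICD-O topography through the SEER conversion file is deliberately left out, since it depends on how that file is loaded.

```python
from datetime import date
from typing import List, Optional, Tuple


def complete_histology(histology: Optional[str]) -> str:
    """Apply the fallback rules to a possibly incomplete histology code."""
    if not histology or not histology.strip():
        return "NULL"  # unretrievable: spelled-out string, not SQL NULL
    histology = histology.strip()
    # Behavior unknown: append 1 (uncertain behavior), e.g. 8070 -> 8070/1
    return histology if "/" in histology else f"{histology}/1"


def truncate_icd10(code: str) -> str:
    """Cut a long ICD-10-CM code to 5 characters including the dot."""
    return code.strip()[:5]


def closest_icd10(diagnoses: List[Tuple[date, str]], histology_date: date) -> str:
    """Pick the ICD-10 code dated closest to the ICD-O histology date."""
    _, code = min(diagnoses, key=lambda d: abs((d[0] - histology_date).days))
    return truncate_icd10(code)


def source_value_with_gaps(histology: Optional[str],
                           topography: Optional[str]) -> str:
    """Build the source value, substituting 'NULL' for a missing part."""
    topo = topography.strip() if topography and topography.strip() else "NULL"
    return f"{complete_histology(histology)}-{topo}"


# Example: two cancer diagnoses, histology recorded on 2019-02-01.
icd10 = closest_icd10(
    [(date(2019, 1, 1), "C50.211"), (date(2020, 6, 1), "C34.10")],
    date(2019, 2, 1),
)
```

The truncated ICD-10 code in `icd10` would then be looked up in the SEER conversion tabs to obtain the ICD-O topography code before the source value is built.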