Commit
Merge branch 'main' into pre-commit-ci-update-config
eroell authored Oct 9, 2024
2 parents 1776398 + bdba66f commit 27c3b64
Showing 14 changed files with 587 additions and 87 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -13,6 +13,7 @@ __pycache__/

# Tests and coverage
/data/
ehrapy_data/
/node_modules/

# docs
13 changes: 11 additions & 2 deletions docs/api.md
@@ -1,5 +1,16 @@
# API

## EHRData

```{eval-rst}
.. module:: ehrdata
.. autosummary::
:toctree: generated
EHRData
```

## Input-Output

```{eval-rst}
@@ -10,11 +21,9 @@
:toctree: generated
io.omop.load
io.omop.extract_tables
io.omop.extract_person
io.omop.extract_observation_period
io.omop.extract_measurement
io.omop.time_interval_table
io.omop.extract_observation
io.omop.extract_procedure_occurrence
io.omop.extract_specimen
13 changes: 7 additions & 6 deletions docs/conf.py
@@ -93,9 +93,9 @@

intersphinx_mapping = {
"python": ("https://docs.python.org/3", None),
"anndata": ("https://anndata.readthedocs.io/en/stable/", None),
"scanpy": ("https://scanpy.readthedocs.io/en/stable/", None),
"numpy": ("https://numpy.org/doc/stable/", None),
"anndata": ("https://anndata.readthedocs.io/en/stable", None),
"scanpy": ("https://scanpy.readthedocs.io/en/stable", None),
"numpy": ("https://numpy.org/doc/stable", None),
}

# List of patterns, relative to source directory, that match files and
@@ -124,8 +124,9 @@

pygments_style = "default"

# If building the documentation fails because of a missing link that is outside your control,
# you can add an exception to this list:
nitpick_ignore = [
# If building the documentation fails because of a missing link that is outside your control,
# you can add an exception to this list.
# ("py:class", "igraph.Graph"),
# https://github.com/duckdb/duckdb-web/issues/3806
("py:class", "duckdb.duckdb.DuckDBPyConnection"),
]
36 changes: 18 additions & 18 deletions docs/notebooks/omop_tables_tutorial.ipynb
@@ -42,12 +42,12 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Introduction to Vocabularies\n",
"## Introduction to Vocabularies\n",
"\n",
"\n",
"We explain these along the first three tables of the OMOP CDM.\n",
"\n",
"#### 0.1 Concept\n",
"### 0.1 Concept\n",
"\n",
"Purpose: Clinical events in OMOP are expressed as concepts, the fundamental building block of data records. For this, OMOP gathers concepts from many existing vocabularies, such as WHO's [ICD10](https://www.icd-code.de/) and [SNOMED](https://www.snomed.org/). There are many concepts in the OMOP CDM; the concepts that are actually used for a specific dataset are listed in this table of the database.\n",
"\n",
@@ -185,7 +185,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 0.2 Concept Relationship\n",
"### 0.2 Concept Relationship\n",
"Any two concepts can have a relationship with each other. The two most common relationships are \"Maps to\" and \"Mapped from\", where a non-standard concept from the source database is mapped to a standard concept in the CDM."
]
},
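The "Maps to" relationship described in the cell above can be sketched in plain Python. This is only an illustration under stated assumptions: the column names `concept_id_1`, `concept_id_2`, and `relationship_id` follow the OMOP CONCEPT_RELATIONSHIP table, while the concept IDs and the helper `map_to_standard` are invented here and are not part of ehrdata.

```python
# Hypothetical rows in the shape of OMOP's CONCEPT_RELATIONSHIP table.
# The concept IDs are invented for illustration only.
concept_relationship = [
    {"concept_id_1": 1001, "concept_id_2": 2001, "relationship_id": "Maps to"},
    # OMOP names the reverse direction "Mapped from".
    {"concept_id_1": 2001, "concept_id_2": 1001, "relationship_id": "Mapped from"},
    {"concept_id_1": 1002, "concept_id_2": 2001, "relationship_id": "Maps to"},
]


def map_to_standard(source_concept_id, relationships):
    """Return the standard concept IDs that a source concept maps to."""
    return [
        row["concept_id_2"]
        for row in relationships
        if row["concept_id_1"] == source_concept_id
        and row["relationship_id"] == "Maps to"
    ]


print(map_to_standard(1001, concept_relationship))
```

Two different source concepts (1001 and 1002 in this made-up data) resolve to the same standard concept, which is the usual reason for following "Maps to" before analysis.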
@@ -278,15 +278,15 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 0.3 Concept Ancestry\n",
"### 0.3 Concept Ancestry\n",
"(This table is built automatically from the concept relationship table if any relationships exist. Not sure whether it should be included here.)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Internal Reference Tables\n",
"### Internal Reference Tables\n",
"There are tables DOMAIN, VOCABULARY, CONCEPT_CLASS, RELATIONSHIP; these tables duplicate the fields already in CONCEPT and CONCEPT_RELATIONSHIP, and can provide more information with an additional *_NAME field.\n",
"\n",
"We omit them here, as they can be recreated at any stage from CONCEPT and CONCEPT_RELATIONSHIP."
@@ -296,7 +296,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 1. Person\n",
"### 1. Person\n",
"\n",
"- Purpose: Contains demographic information about each patient.\n",
"- Key Fields: person_id, gender_concept_id, year_of_birth, race_concept_id, ethnicity_concept_id\n",
@@ -405,7 +405,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 2. Observation Period\n",
"### 2. Observation Period\n",
"Purpose: Defines periods of time during which the patient’s data is considered reliable and available.\n",
"\n",
"OMOP CDM: \"This table contains records which define spans of time during which two conditions are expected to hold: (i) Clinical Events that happened to the Person are recorded in the Event tables, and (ii) absence of records indicate such Events did not occur during this span of time.\"\n",
@@ -483,7 +483,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 3. Visit Occurrence\n",
"### 3. Visit Occurrence\n",
"\n",
"\"This table contains Events where Persons engage with the healthcare system for a duration of time. They are often also called “Encounters”. Visits are defined by a configuration of circumstances under which they occur, such as (i) whether the patient comes to a healthcare institution, the other way around, or the interaction is remote, (ii) whether and what kind of trained medical staff is delivering the service during the Visit, and (iii) whether the Visit is transient or for a longer period involving a stay in bed.\"\n",
"\n",
@@ -623,7 +623,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 4. Visit Detail (OPTIONAL)\n",
"### 4. Visit Detail (OPTIONAL)\n",
"- Purpose: More details on visit, such as movement between units in an inpatient stay. There can be 0 or more entries in visit_detail per entry in visit_occurrence.\n",
"- Key Fields: visit_detail_id, person_id, visit_detail_concept_id, visit_detail_start_date, visit_detail_end_date\n"
]
@@ -905,7 +905,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 5. Drug Exposure\n",
"### 5. Drug Exposure\n",
"\n",
"\"The purpose of records in this table is to indicate an exposure to a certain drug as best as possible. In this context a drug is defined as an active ingredient. Drug Exposures are defined by Concepts from the Drug domain, which form a complex hierarchy. As a result, one DRUG_SOURCE_CONCEPT_ID may map to multiple standard concept ids if it is a combination product.\"\n",
"\n",
@@ -1059,7 +1059,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 6. Procedure Occurrence\n",
"### 6. Procedure Occurrence\n",
"\n",
"\"This table contains records of activities or processes ordered by, or carried out by, a healthcare provider on the patient with a diagnostic or therapeutic purpose.\"\n",
"\n",
@@ -1184,7 +1184,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 7. Device Exposure\n",
"### 7. Device Exposure\n",
"\"The Device domain captures information about a person’s exposure to a foreign physical object or instrument which is used for diagnostic or therapeutic purposes through a mechanism beyond chemical action. Devices include implantable objects (e.g. pacemakers, stents, artificial joints), medical equipment and supplies (e.g. bandages, crutches, syringes), other instruments used in medical procedures (e.g. sutures, defibrillators) and material used in clinical care (e.g. adhesives, body material, dental material, surgical material).\"\n",
"\n",
"- Key Fields: device_exposure_id, person_id, device_concept_id, device_exposure_start_date, device_concept_type_id"
@@ -1313,7 +1313,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 7. Measurement\n",
"### 7. Measurement\n",
"\n",
"- Purpose: Captures clinical measurements or laboratory test results.\n",
"- Key Fields: measurement_id, person_id, measurement_concept_id, measurement_date, value_as_number"
@@ -1465,7 +1465,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 8. Observation\n",
"### 8. Observation\n",
"\n",
"\"Observations differ from Measurements in that they do not require a standardized test or some other activity to generate clinical fact. Typical observations are medical history, family history, the stated need for certain treatment, social circumstances, lifestyle choices, healthcare utilization patterns, etc. If the generation clinical facts requires a standardized testing such as lab testing or imaging and leads to a standardized result, the data item is recorded in the MEASUREMENT table.\"\n",
"\n",
@@ -1605,7 +1605,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 9. Death\n",
"### 9. Death\n",
"- Purpose: Captures information related to patient death.\n",
"- Key Fields: person_id, death_date, death_type_concept_id, cause_concept_id"
]
@@ -1697,7 +1697,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 10. Note\n",
"### 10. Note\n",
"- Purpose: Contains unstructured clinical notes.\n",
"- Key Fields: note_id, person_id, note_date, note_text"
]
@@ -1768,7 +1768,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 13. Note_NLP\n",
"### 13. Note_NLP\n",
"- Purpose: Encodes all output of NLP on clinical notes. Each row represents a single extracted term from a note.\n",
"- Key Fields: note_nlp_id, note_id, lexical_variant, note_nlp_concept_id"
]
@@ -1839,7 +1839,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 14. Specimen\n",
"### 14. Specimen\n",
"The specimen domain contains the records identifying biological samples from a person.\n",
"\n",
"- Purpose:\n",
6 changes: 3 additions & 3 deletions docs/notebooks/tutorial_ehrdata_omop.ipynb
@@ -626,7 +626,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Interlude - Irregularly sampled time series data\n",
"#### Interlude - Irregularly sampled time series data\n",
"Electronic health records can be regarded as irregularly sampled time series; that is, they model a person through observations taken at irregular times.\n",
"\n",
"Following notation and explanation from [Horn et al.](https://proceedings.mlr.press/v119/horn20a.html), a time series of a patient can be described as a set of tuples (t, z, m), where t denotes the time, z the observed value, and m a modality description of the measurement.\n",
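The (t, z, m) notation in the cell above can be sketched in plain Python. This is a minimal illustration, not how ehrdata stores data: the modality names, the values, and the helper `group_by_modality` are all invented for this example, and time is assumed to be hours since admission.

```python
from collections import defaultdict

# One patient's record as a set of (t, z, m) tuples:
# t = time (assumed: hours since admission), z = observed value, m = modality.
# All values and modality names are invented for illustration.
patient_series = {
    (0.5, 82.0, "heart_rate"),
    (0.5, 36.9, "temperature"),
    (3.0, 95.0, "heart_rate"),
    (7.25, 88.0, "heart_rate"),
}


def group_by_modality(series):
    """Group observations by modality, ordered by time within each modality."""
    grouped = defaultdict(list)
    for t, z, m in sorted(series):
        grouped[m].append((t, z))
    return dict(grouped)


grouped = group_by_modality(patient_series)
print(grouped["heart_rate"])
```

In this made-up record the heart-rate samples are 2.5 and 4.25 hours apart, and temperature is observed only once, which is what makes the series irregularly sampled.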
@@ -1114,7 +1114,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "ehrapy_venv_july",
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
@@ -1128,7 +1128,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.19"
"version": "3.11.7"
}
},
"nbformat": 4,
6 changes: 4 additions & 2 deletions pyproject.toml
@@ -7,7 +7,7 @@ name = "ehrdata"
version = "0.0.1"
description = "A Python package for EHR data"
readme = "README.md"
license = { file = "LICENSE" }
license = "Apache-2.0"
maintainers = [
{ name = "Eljas Roellin", email = "[email protected]" },
]
@@ -36,6 +36,7 @@ optional-dependencies.dev = [
]
optional-dependencies.doc = [
"docutils>=0.8,!=0.18.*,!=0.19.*",
"ehrdata[lamin]",
"ipykernel",
"ipython",
"myst-nb>=1.1",
@@ -57,6 +58,7 @@ optional-dependencies.lamin = [
"bionty",
"lamindb",
"omop",
"rich",
]
optional-dependencies.test = [
"coverage",
@@ -75,7 +77,7 @@ installer = "uv"
features = [ "dev" ]

[tool.hatch.envs.docs]
extra-features = [ "doc" ]
features = [ "doc" ]
scripts.build = "sphinx-build -M html docs docs/_build {args}"
scripts.open = "python -m webbrowser -t docs/_build/html/index.html"
scripts.clean = "git clean -fdX -- {args:docs}"
3 changes: 2 additions & 1 deletion src/ehrdata/__init__.py
@@ -1,7 +1,8 @@
from importlib.metadata import version

from . import dt, io, pl, pp, tl
from .core import EHRData

__all__ = ["dt", "io", "pl", "pp", "tl"]
__all__ = ["EHRData", "dt", "io", "pl", "pp", "tl"]

__version__ = version("ehrdata")
1 change: 1 addition & 0 deletions src/ehrdata/core/__init__.py
@@ -0,0 +1 @@
from .ehrdata import EHRData
1 change: 1 addition & 0 deletions src/ehrdata/core/constants.py
@@ -0,0 +1 @@
R_LAYER_KEY = "r_layer"