Documentation updates #35

EthanSteinberg · 2024-07-30T13:29:56Z

No description provided.

EthanSteinberg · 2024-07-30T13:31:09Z

README.md

+  2. A _measurement_ or _patient measurement_ or _observation_ in a MEDS dataset refers to a single measurable
+     quantity observed about the patient during their care. These observations can take on many forms, such as
+     observing a diagnostic code being applied to the patient, observing a patient's admission or transfer
+     from one unit to another, observing a laboratory test result, but always correspond to a single


I don't think the measurement abstraction exists in MEDS anymore since we don't have nesting.

These are just "events" now. "event" is now the term for a single measurable quantity observed about a patient during their care.

This section should be rewritten to refer to "event".

I don't have an objection to using "event" instead of "measurement", but I want there to be some term to refer to all observations that occur at a single unique timestamp. This is useful in several contexts, even when we don't nest, such as:

1. Providing a per-patient integral index of the ordered unique timestamps in their sequence. This index can be used during tabularization (meds-tab uses this), for matching to label cohorts (meds-tab and meds-torch use this), or for efficiently subselecting tensorized data over a certain time range (the nested ragged tensorization code uses this).
2. Differentiating pre-processing transformations that affect the observations within a timepoint, but not the timepoints themselves, from those that do affect the timepoints of the data.
3. Differentiating models that operate over sequences of observations/measurements from those that operate over sequences of unique timestamps.

To me, the natural division is to use the term "event" to refer to all things that occur at a unique timestamp and "measurement" or "observation" to refer to a single tuple of patient_id, timestamp, code, and *_value, but I'm open to using different verbiage, as long as we can express both concepts. Do you have an alternate nomenclature you'd prefer?
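
For concreteness, here is a minimal sketch of the per-patient timestamp index mentioned above. It assumes rows are flat (patient_id, timestamp, code, numeric_value) tuples; the function name `add_timepoint_index`, the `numeric_value` column, and the sample codes are hypothetical illustrations, not part of the MEDS schema or of any library named in this thread.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical flat rows: (patient_id, timestamp, code, numeric_value).
rows = [
    (1, datetime(2024, 1, 1, 8, 0), "ADMISSION", None),
    (1, datetime(2024, 1, 1, 8, 0), "LAB//HR", 88.0),
    (1, datetime(2024, 1, 2, 9, 30), "LAB//HR", 92.0),
    (2, datetime(2024, 3, 5, 12, 0), "DIAGNOSIS//I10", None),
]


def add_timepoint_index(rows):
    """Append a per-patient, zero-based index of ordered unique timestamps."""
    # Collect the set of distinct timestamps seen for each patient.
    unique_times = defaultdict(set)
    for patient_id, timestamp, _code, _value in rows:
        unique_times[patient_id].add(timestamp)

    # Rank each patient's timestamps in chronological order.
    rank = {
        pid: {ts: i for i, ts in enumerate(sorted(times))}
        for pid, times in unique_times.items()
    }

    # Rows sharing a (patient_id, timestamp) pair receive the same index.
    return [(pid, ts, code, value, rank[pid][ts]) for pid, ts, code, value in rows]


indexed = add_timepoint_index(rows)
```

In this sketch the two rows for patient 1 at 08:00 share one index, which is the grouping that the term "event" would name under the proposal above, while each individual row is a single "measurement" or "observation".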

I don't think MEDS needs to define that term, or use it, since the schema/datamodel doesn't require referring to unique timestamps.

I'm not sure what I would call this abstraction since I have never used it (and don't really see a case where I would).

EthanSteinberg · 2024-07-30T13:32:11Z

README.md

+  4. An _event_ or _patient event_ in a MEDS dataset corresponds to all observations about a patient that
+     occur at a unique timestamp (within the level of temporal granularity in the MEDS dataset).


This abstraction no longer exists and this sentence should be deleted.

EthanSteinberg · 2024-07-30T13:35:03Z

I think this mostly looks good. The main issue is that it uses the double-nested terminology of measurements vs. events, which is gone now. There are only "events" (single measurements) now.

mmcdermott · 2024-07-30T17:21:13Z

In that case are you comfortable just defining the measurement or observation term, not defining an event abstraction, but also not defining the term "event" in any explicit manner? That way libraries that want a further abstraction can use the term as desired without conflict?


EthanSteinberg · 2024-07-30T17:23:27Z

I think it would be very weird if the Medical Event Data Standard did not use the word "event".

So I would prefer to keep using the name "event".

mmcdermott · 2024-07-30T17:26:15Z

Keep in mind that in most other usage, "event stream" data has the connotation that events have unique time points, which factors into my consideration here. Of course, that is merely my impression, so I could be wrong, but that is how I have seen the term used previously.


EthanSteinberg · 2024-07-30T17:33:40Z

> Keep in mind that in most other usage, "event stream" data has the connotation that events have unique time points

I've never seen that usage before, can you provide some examples?

mmcdermott · 2024-07-30T18:10:52Z

> I've never seen that usage before, can you provide some examples?

Hmm. Looking through some references, I find many settings where the mental model is implicitly one of unique timestamps: many sites describe event streams as capturing real-time data, with events occurring in real time and being received in the stream as they happen, and most diagrams of event streams show unique timestamps. However, I don't see anything explicitly claiming that events have unique timestamps. When I asked ChatGPT (as a proxy for the "consensus" of the field), it stated that there is no such mandate for uniqueness, though it attributed this to limitations in the granularity of the recorded time unit, such as in high-frequency settings, rather than to our setting, where many observations are implicitly binned before being recorded.

So, I guess that argument falls flat. I would still state a marked preference for using, insofar as anything is defined here, the term "measurement" or "observation", both to reserve a natural term for the concept that is already in use in several settings, as I outlined above, and because it agrees more strongly with my conception of these things. Then, if you want to use "event" in your code and terminology, you are free to do so, as we won't standardize that term at all and nothing says you have to use "measurement" or "observation". Do you have a strong objection to this?

mmcdermott · 2024-07-30T18:11:54Z

Alternatively, I'm also happy banishing both terms from any official terminology and just referring to the main schema as "meds_schema" or something like this.

EthanSteinberg · 2024-07-30T18:35:41Z

This looks good. Feel free to merge it in.

mmcdermott added 2 commits July 29, 2024 23:55

Started updating README

3ed25ab

Updating to mandatory file formats.

ed9cb91

mmcdermott added 3 commits July 30, 2024 14:14

Removed controversial or unneeded terms

5985629

Updated schemas and documentation with consensus terms and deduplicated file path instructions.

1da2ec0

Removed unneeded python object format.

e10c22d

mmcdermott mentioned this pull request Jul 30, 2024

MEDS 0.3 Release Candidate #32

Merged

mmcdermott marked this pull request as ready for review July 30, 2024 18:39

mmcdermott merged commit e0cd043 into ethan-meds-3-rc Jul 30, 2024
3 checks passed

mmcdermott deleted the mmd-meds-3-rc-documentation branch July 30, 2024 18:40
