-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Documentation updates #35
Conversation
README.md
Outdated
2. A _measurement_ or _patient measurement_ or _observation_ in a MEDS dataset refers to a single measurable | ||
quantity observed about the patient during their care. These observations can take on many forms, such as | ||
observing a diagnostic code being applied to the patient, observing a patient's admission or transfer | ||
from one unit to another, observing a laboratory test result, but always correspond to a single |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the measurement abstraction exists in MEDS anymore since we don't have nesting.
These are just "events" now. "event" is now the term for a single measurable quantity observed about a patient during their care.
This section should be rewritten to refer to "event"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have an objection to using "event" instead of "measurement", but I want there to be some term to refer to all observations that occur at a single unique timestamp. This is useful in several contexts, even when we don't nest, such as:
- Providing a per-patient integral index of the ordered unique timestamps in their sequence that can be used during tabularization (meds tab uses this), for matching to label cohorts (meds-tab and meds-torch use this), or efficiently subselecting tensorized data over a certain time range (the nested ragged tensorization code uses this).
- Differentiating pre-processing transformations that affect the observations within a timepoint but do not affect the timepoints of the data from those that do affect the timepoints of the data.
- Differentiating models that operate over sequences of observations/measurements vs. those that operate over sequences of unique timestamps.
To me, the natural division is to use the term "event" to occur to all things at a unique timestamp and "measurement" or "observation" to refer to a single tuple of patient_id
, timestamp
, code
, and *_value
, but I'm open to using different verbiage, as long as we can express both concepts. Do you have an alternate nomenclature you'd prefer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think MEDS needs to define that term, or use it, since the schema/datamodel doesn't require referring to unique timestamps.
I'm not sure what I would call this abstraction since I have never used it (and don't really see a case where I would).
README.md
Outdated
4. An _event_ or _patient event_ in a MEDS dataset corresponds to all observations about a patient that | ||
occur at a unique timestamp (within the level of temporal granularity in the MEDS dataset). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This abstraction no longer exists and this sentence should be deleted.
I think this mostly looks good, the main issue is that it uses the double nested terminology of measurements vs events, when that is gone now. There are only "events" (single measurments) now. |
In that case are you comfortable just defining the measurement or
observation term, not defining an event abstraction, but also not defining
the term "event" in any explicit manner? That way libraries that want a
further abstraction can use the term as desired without conflict?
…On Tue, Jul 30, 2024, 10:57 AM Ethan Steinberg ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In README.md
<#35 (comment)>
:
> + 2. A _measurement_ or _patient measurement_ or _observation_ in a MEDS dataset refers to a single measurable
+ quantity observed about the patient during their care. These observations can take on many forms, such as
+ observing a diagnostic code being applied to the patient, observing a patient's admission or transfer
+ from one unit to another, observing a laboratory test result, but always correspond to a single
I don't think MEDS needs to define that term, or use it, since the
schema/datamodel doesn't require referring to unique timestamps.
I'm not sure what I would call this abstraction since I have never used it
(and don't really see a case where I would).
—
Reply to this email directly, view it on GitHub
<#35 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AADS5XYF7FCXDVRL3BD5IEDZO6SVTAVCNFSM6AAAAABLWMNFJ6VHI2DSMVQWIX3LMV43YUDVNRWFEZLROVSXG5CSMV3GSZLXHMZDEMBXHE2DMOJUGU>
.
You are receiving this because you commented.Message ID:
***@***.***>
|
I think it would be very weird if the Medical Event Data Standard did not use the word "event". So I would prefer to keep using the name "event". |
Keep in mind that in most other usage, "event stream" data has the
connotation that events have unique time points, which factors into my
consideration here. Of course, that is merely my impression, so I could be
wrong, but that is how I have seen the term used previously
…On Tue, Jul 30, 2024, 1:23 PM Ethan Steinberg ***@***.***> wrote:
I think it would be very weird if the Medical Event Data Standard did not
use the word "event".
So I would prefer to keep using the name "event".
—
Reply to this email directly, view it on GitHub
<#35 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AADS5XYLJROP5ELU6UF43HDZO7D2JAVCNFSM6AAAAABLWMNFJ6VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDENJYHA2DOMRYGI>
.
You are receiving this because you commented.Message ID:
***@***.***>
|
I've never seen that usage before, can you provide some examples? |
hmm. In looking through some references, while I find a lot of settings where it is implicitly clear that the mental model is one of unique timestamps (as many sites reference event streams as capturing real-time data, with events occurring in real time and being received in the stream as they happen, or as most diagrams of event streams show unique timestamps, etc.), I don't see anything explicitly claiming that they have unique timestamps, and when I asked ChatGPT (as a proxy for the "consensus" of the field), it states that there is not such a mandate for uniqueness, though this is attributed to limitations in the granularity of the time-unit being recorded, such as in high-frequency settings, not due to our setting where many observations are implicitly binned before being recorded. So, I guess that argument falls flat. I would still put in a marked preference for us using, in so far as anything is defined here, the term "measurement" or "observation", both to reserve a natural term to refer to the concept that is in use in several settings already as I outlined above and because it more strongly agrees with my conception of these things. Then, if you want to use "event" in your code and terminology, you are free to do so, as we won't standardize that term at all and nothing says you have to use "measurement" or "observation". Do you have a strong objection to this? |
Alternatively, I'm also happy banishing both terms from any official terminology and just referring to the main schema as "meds_schema" or something like this. |
This looks good. Feel free to merge it in |
No description provided.