BEP044: Within-stimuli conditions #153
I would have made it
stim_onset/stim_offset - I guess they could be added, but they would hold redundant information which could be computed (and validated to not go beyond the stimulus duration) from "Movie starts" for that stimulus and the corresponding onset and duration. And we all know what happens when there is redundancy ;) As for the hierarchical description of events -- isn't there https://bids-specification.readthedocs.io/en/latest/99-appendices/03-hed.html ? (never used it myself though) |
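To illustrate the point about redundancy, here is a minimal sketch of how stim_onset/stim_offset could be derived and validated rather than stored. It assumes each event is a dict with `onset` and `duration` in seconds of run time, and that the stimulus start time and stimulus length are known. The function name `add_stim_columns` and its parameters are hypothetical and not part of any BIDS tool.

```python
def add_stim_columns(events, stim_start, stim_duration):
    """Derive stimulus-relative onsets/offsets for each event.

    events        -- list of dicts with 'onset' and 'duration' (seconds, run time)
    stim_start    -- onset of the stimulus ("Movie starts") in run time
    stim_duration -- length of the stimulus file in seconds
    """
    annotated = []
    for ev in events:
        # Position of the event measured from the start of the stimulus file.
        stim_onset = ev["onset"] - stim_start
        stim_offset = stim_onset + ev["duration"]
        # Validate that the event does not go beyond the stimulus itself.
        if stim_onset < 0 or stim_offset > stim_duration:
            raise ValueError(f"event at {ev['onset']} s falls outside the stimulus")
        annotated.append({**ev, "stim_onset": stim_onset, "stim_offset": stim_offset})
    return annotated


rows = add_stim_columns(
    [{"onset": 12.0, "duration": 30.0, "trial_type": "condition_a"}],
    stim_start=2.0,
    stim_duration=300.0,
)
```

Because the values are recomputed from the onset and duration, fixing an onset cannot leave a stale stimulus-relative value behind.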
I'm not crazy about either of the solutions proposed above because, while both compliant with the current spec, neither one eliminates the fundamental ambiguity here, which is that you don't know which part of the clip is being presented. It also is kind of problematic from a BIDS-StatsModel standpoint, because it will cause almost all users to have to drop a The benefit of having optional The more I think about this, the more I lean towards maybe keeping the current approach and not codifying this at all in the Should we just say this is in the 20% (really more like 1%) and not worry about it? |
BTW, ... Would they actually need to filter them out? Why don't you want them to model that entire "super" condition as well? If there are different movie cuts, you might want them explicitly in the model, even if only to absorb the transition (if it is visible) between different stimuli. If there is only one big one for the entire run - well, it will largely be your constant. If there is a design imbalance and the stimuli files have subtle unique features to them (differently trimmed, color scheme, audio volume level), having them modeled might save us from one other possible retraction. The only problem I see is if all the trials follow each other in such a way that the model becomes degenerate if the whole stim file condition is present too. The only con is that maybe those stimuli onsets and durations are actually of interest to other tools, not just the linear model, so they would need to recompute them as well. But it shouldn't be too hard. As for extra unused metadata - I would say the more the merrier. My main concern is the fear of it being redundant and thus requiring "manual" recomputation if I find that e.g. I need to fix an onset. Then I will forget and the stimuli onset value will no longer be valid |
I would agree this probably falls into the 1% as the majority of experiments don't have sub-conditions within a stimulus. And so in 90% of cases, the mention of a stimulus indicates a complete presentation, so this is such a rare situation it's probably not worth putting in the spec itself. I still think it might be worth clarifying that including a stimulus in |
i'm not sure this is 1%. in many standard experiments, there are sub conditions. for example in experiments that involve showing faces/objects there are often sub categories: emotions, types of objects, types of faces (human faces/animal faces). in fact the modified hariri task is a perfect example of this, and gets used by emotion/mood researchers a lot. i don't think we should reinvent ontologies of stimuli (e.g., paradigms - https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3682219/, audio - https://research.google.com/audioset/ontology/index.html, images - https://bioportal.bioontology.org/ontologies/BIM). but provide a way where stimulus properties can be encoded appropriately.
i like the idea of a json going alongside a stimulus file, but this json should be able to reflect timed objects inside it. |
just to follow up:
|
@satra by "sub-conditions" here we're not talking about hierarchical organization, we're talking about a temporal subset of a single file. Codifying hierarchical structures is IMO not in scope, but in any case presents no particular challenge from an |
I think this is analogous to the movie example, but I still think it's an edge case. Situations where researchers dynamically crop images are likely to be pretty rare; in most cases, the cropping will have been done in advance, and what's in the I think a reasonable way to update the spec is to strongly encourage users to provide files in |
@tyarkoni - sorry i misunderstood the within stimuli conditions, so please ignore the ontological variations (although see last paragraph below). for movie, i'm thinking of things like commercial clips that are shown, and i'm sure that certain clips cannot be shared. for movies as an example are you saying i can extract faces, then specific emotions on those faces, and then encode both face and face+emotion in the events file, kind of a redundant stimulus list. all possible events in trial_type and then the analyst figures out which trials are of interest? for many of our tasks, that would work pretty well. |
I don't know that we can do anything about this, short of asking people to provide a description of where/how to obtain stimuli that can't be publicly shared. I don't think it's worth trying to codify this—there's too much variability in what that procurement process could look like.
Sure, you can create arbitrary columns in |
This is an old one. I wonder if HED tags can help with such an issue. @VisLab do you have some opinion on this? |
As it turns out the HED Working Group has been discussing this very issue and some of our members will weigh in shortly with a concrete proposal --- @neuromechanist @dorahermes @tpatpa @dungscout96 @monique2208 @makeig |
Yes, I agree that HED tags can come in useful here and probably tackle this issue. When working through an example it seems like this may be a relatively larger contribution with some added machine readable files in the @neuromechanist could share a preliminary google doc (not BEP yet, just the examples we were working through) if that would help give an idea? Tagging some people who previously contributed to this discussion for input: @adelavega @tyarkoni @yarikoptic @satra @Remi-Gau |
if we are talking about a BEP to help organize stimuli then there is overlap with : #751 |
Reading here and #751 resonates closely with the challenges we are exploring for including image and movie annotations into a couple of massive datasets we are working on. In both projects, we see the need for top-level annotation files that would be used in the downstream In this Google Doc, we are exploring the possibility of a file such as A sample
If the stimulus file has a time-varying context (such as a movie), a separate We believe this method will make the annotation of stimulus files more reusable; researchers can reuse the stimulus files and select the We appreciate your thoughts and comments on the Google Doc, as well as here. Our use cases are limited to a couple of visual and audiovisual stimuli. Many other stimulation types may require other arrangements. We appreciate that you also include examples of other stimulus types, if possible. |
@bids-standard/maintainers would be great to hear your thoughts on whether this is worthy of a small BEP, thank you! |
Maybe not a BEP but several small orthogonal pull requests? I can try to bring it up at the next maintainers meeting. |
Following hed-standard/hed-python#810, it seems that expanding the As described in the HED issue above and also in the GDoc we are drafting for this issue, there could be two variations of this issue:
A working example for the second case, which is the main focus of this issue, is the following scenario: The events for the Present movie are limited to the start and stop of the video:
However, it is clear that a movie contains far more events, and researchers would desire to provide their annotations based on their application. As a straightforward example, we identified the shot transition events and quantified the Log Luminance Ratio of this shot transition. The file included in the dataset as
To merge the
This implementation is far from perfect, but it could serve as a working example of the implications of this mechanism for large and very large datasets. The Healthy Brain Network Project spans over 7000 subjects with EEG and fMRI, and this mechanism will help dynamically use event annotations based on the research's use case. |
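As a rough sketch of the merge step described above: it assumes that annotation onsets are expressed relative to the start of the stimulus file, so each annotation is shifted by the onset of the run event that presents that stimulus. The function name `merge_annotations`, the `trial_type` column, and the file name `present.mp4` are hypothetical placeholders, not the names used in the dataset or in HED tools.

```python
def merge_annotations(run_events, annotations):
    """Merge stimulus-level annotations into run-level events.

    run_events  -- list of dicts with 'onset', 'duration', 'trial_type', 'stim_file'
    annotations -- dict mapping a stimulus file name to a list of dicts with
                   'onset' (relative to stimulus start), 'duration', 'trial_type'
    """
    merged = [dict(ev) for ev in run_events]  # copy so the inputs stay untouched
    for ev in run_events:
        for ann in annotations.get(ev.get("stim_file"), []):
            # Skip annotations that start after the presented part of the stimulus.
            if ann["onset"] > ev["duration"]:
                continue
            merged.append({
                "onset": ev["onset"] + ann["onset"],  # shift into run time
                "duration": ann["duration"],
                "trial_type": ann["trial_type"],
                "stim_file": ev["stim_file"],
            })
    return sorted(merged, key=lambda row: row["onset"])


run_events = [
    {"onset": 5.0, "duration": 100.0, "trial_type": "movie_start", "stim_file": "present.mp4"},
]
annotations = {
    "present.mp4": [
        {"onset": 1.5, "duration": 0.0, "trial_type": "shot_transition"},
        {"onset": 120.0, "duration": 0.0, "trial_type": "shot_transition"},
    ],
}
merged = merge_annotations(run_events, annotations)
```

Keeping the annotations in one place per stimulus and merging them on demand means the per-subject events files only need the start and stop of the movie.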
I haven't had time to look at the entire proposal in detail, but overall the concept of annotating stimuli separately from the |
Following 4/12's conversations with @Remi-Gau, @adelavega, @yarikoptic, @arnodelorme and @dungscout96, there is quite an enthusiasm to provide structure for the @yarikoptic and I jotted on the Google Doc to modify the suggestions to a (directory-less) BIDS naming structure, which also follows the ideas in #751. Based on the Google Doc example, here is a draft suggestion: stim-present_???.mp4|mkv|jpg|png
stim-present_???.json
[stim-present_annot-loglum_events.tsv]
[stim-present_annot-loglum_events.json]
…
stimuli.tsv
stimuli.json
TODO:
CC @VisLab, @dorahermes, and @monique2208 for comment. |
Looks good, but I'm concerned that mandating stimuli have a specific name would make this backwards incompatible w/ existing datasets (which name stimuli files whatever they want, and just refer to them in the It's a minor concern, but it just seems slightly out of scope to mandate a new way to name stimuli files. Would this be required overall even if you do not have annotations? |
Seems like there was discussion regarding the top level |
Not sure the proposal has to be backwards incompatible: Now: Potential proposal: the above stays the same... but... In the Suppose that the stimulus file is a movie with annotations then in The directory structure within the |
Current contenders for the stimuli modality suffix include:
Feel free to let me know if you have any other suggestions and which one you prefer, so I can update the list. |
|
In the spirit of the future BIDS 2.0 with e.g.
|
Ok, sounds great. It seems that proposing The suffix may need more consideration. Currently, Just a note that there is already |
Also, should we convert this issue to a BEP? Converting to BEP hopefully makes the enhancements more visible and maintainable (although, it will also require more work). Talking to @yarikoptic and @dorahermes, they both seem to support a BEP for this issue. |
Added PR #1814 to add stimulus and annotation entities and the The next steps would require inputs for:
|
It would be great to have this formalized! We have a large number of datasets where we present the same short movie as a localizer. Having one general annotation file which could apply to all of these datasets would really help with the analysis, it would remove a lot of redundancy in the event files and I think it would provide something interesting to share on its own. |
Our suffixes so far can correspond to a number of things, but most typically quite specific to "data modality", so here we might want to be more specific too, e.g. have |
+1 for |
@yarikoptic, @dorahermes and I will meet on Tuesday 8/13 at 10 am PT to discuss the progress and the next steps. Please reach out to me if you want to join the conversation and I'll share the meeting details. |
Probably we should include |
@yarikoptic, @dorahermes, @TheChymera, and I joined the meeting. We agreed that the broad scope of the changes (including adding a prefix, a couple of entities, and suffixes) and their usability in several fields (EEG, fMRI, ...) justifies requesting a BEP. @bids-maintenance, could you help raise this issue and elevate it to a BEP? A couple of other discussion points during the meeting were: 1) Adopting The main discussion is on this Google Doc. The next meeting will be on August 27th at 10 a.m. PT. |
FTR: requesting BEP044 for this effort: |
The second meeting with @yarikoptic, @dorahermes, @VisLab, and I was held on 8/27, with discussions on removing |
We are officially BEP044. Congratulations to everyone for their hard work and persistence on this issue and topic 🎉. @bids-maintenance, @adelavega, do you mind updating the issue name to reflect the BEP number? Thanks a lot |
We made significant progress with the biweekly meetings. Starting next week, September 17th, the meetings will be at 9 am ET/ 3 pm CET, biweekly on Tuesdays, to ensure a more suitable timing for everyone. Please let me know if you would like to join. |
We discussed provisioning |
It would be useful to have some discussion on the use of |
IMHO it would be ok to offer relaxation of the semantic here to make it not MRI specific and reuse |
There is the existing
This seems to fit the use-case, although the definition should be made less hyper-specific. As to |
We considered available terms with similar meanings, including, We should, however, allow for overlapping or any other configuration that researchers would need (such as select book chapters, positive/negative valence) under the same stimulus id. In the Forrest Gump movie example, there is a few-second overlap between each of the eight parts and the next/previous part, so the stimulus and the annotations are not split per se, but rather parts. I'd also like to ask if we should consider introducing a new entity instead of Today, @dorahermes, @VisLab, and I reviewed the examples bids-standard/bids-examples#433, with the most recent changes and also added HED annotation for a couple of |
@dorahermes, @VisLab and I discussed the remaining work for this BEP:
We do not plan to hold any more biweekly meetings. Thanks to all who contributed to and supported this effort 🙌🏼. |
stim_file columns in event files allow users to specify which stimuli files are associated with an event onset:
However, what this does not allow for is the specification of sub-conditions that occur during a long-running stimulus.
For example, in ds001545 a video file is presented which spans the entirety of the run. However, within each run/video there are 6 distinct conditions.
For example:
IMO, the above example is invalid as the stim_file only has a single onset.
The following is an event file which has all the necessary information (note I'm having to guess when the onset of the stim_file is, it could actually be 0).
However, this is ambiguous as the conditions are only implied to occur during stimulus presentation due to the duration of the first row.
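To make the ambiguity concrete, here is a small sketch of the inference a reader of such a file has to perform: a condition is only assumed to belong to the stimulus if its time window fits inside the window of the row that names the stim_file. The TSV contents below are hypothetical and are not the actual ds001545 values; `conditions_within_stimulus` is an illustrative name.

```python
import csv
import io

# Hypothetical events file: one long-running video plus condition rows.
EVENTS_TSV = (
    "onset\tduration\ttrial_type\tstim_file\n"
    "0\t300\tmovie\tvideo.mp4\n"
    "10\t20\tcondition_a\tn/a\n"
    "290\t20\tcondition_b\tn/a\n"
)


def conditions_within_stimulus(tsv_text):
    """Map each stim_file to the trial types whose window lies inside its presentation."""
    rows = list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))
    result = {}
    for stim in rows:
        if stim["stim_file"] == "n/a":
            continue
        start = float(stim["onset"])
        end = start + float(stim["duration"])
        result[stim["stim_file"]] = [
            row["trial_type"]
            for row in rows
            if row is not stim
            and start <= float(row["onset"])
            and float(row["onset"]) + float(row["duration"]) <= end
        ]
    return result


contained = conditions_within_stimulus(EVENTS_TSV)
```

In this made-up file, condition_b runs past the end of the video, so nothing in the file itself says whether it was part of the stimulus or not.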
@tyarkoni suggests adding optional but strongly encouraged stim_onset and stim_offset columns. These would denote onsets within a stimulus.