stimuli BEP #751
Comments
At one point I talked to @Gilles86 about how he was storing stimuli, but don't clearly recall how deep we went. He might have some thoughts here. Just to comment on one thing, I'm not sure Something like |
|
I also hate boilerplate, and indeed many use cases might not really need directories at all. BUT I can see stimuli collections where each stimulus could have a good number of files (audio, audio/video, images, etc.) associated with that stimulus category; directories would thus be beneficial for organization and also for navigation and reuse (a clear "module" for a stimulus at the directory level). So, again, similarly to neuroimaging datasets that have just a single T1w image per subject, it might be sensible to have per-label directories. (moreover there could be multiple samples of the same label -- so semantically similar to PS although even may be |
Hmm. Okay, fair enough. I guess the question is how much is this supposed to be BIDS-like or is it supposed to be BIDS? That matters for what entity names are chosen, since if it is BIDS, then we can't change the meaning of an entity too far. If it's just BIDS-like, then we can choose entities that are appropriate for stimuli with little regard for BIDS' existing definitions or ones that are likely to be claimed by future BEPs. I would probably prefer BIDS-like, since a subject or a recording session is integral to a lot of definitions. So here's a notion:
An alternative (or addition) to
Then we presumably need a |
Thank you @effigies !!! I feel like we are on the same page and leaping forward ;)
Which entities' meanings do you see needing much adjustment? Even for run I feel we would not need much adjustment, although some might already be a bit overdue: filed #760. So not sure if we really need to introduce
+1 on that. Additional thoughts: But then I see the point of having top level
|
It feels like shoehorning an experimental notion into a corpus description. I would rather step back and think about what would make a good corpus standard with minimal reference to BIDS. Maybe if you're thinking of the generation of the stimuli as a procedure that is repeated multiple times, run works. But perhaps I'm sampling from a larger corpus where the notion doesn't apply (e.g., going back through BBC archives for different pronunciations of words).
Yeah |
Quick thought to point out what @sappelhoff mentioned regarding subject-specific stimuli here: #750 (comment). I don't think that it will be such a rare case, and we should probably give that some thought. If it is just a matter of a raw stimulus being adapted to each participant, this could be treated as derivatives, but having a way to describe the subject the stimulus is for would be a good thing. Use the
Reuse the
|
Also does it make sense to have "prefix" or is it really shoehorning too much BIDS into this? |
RE: sub entities, is the stimulus truly related to the subject, or is it that each subject gets a different stimulus? For a stimulus dataset that needs to be able to be understood in isolation, I'm wary of infecting with a separate notion. For example, maybe I created the stimuli for the subjects in a particular study, but then I want to perform a second study with the same stimuli, and the I would suggest that this would be a good use case for
I don't really understand this question. Could you clarify? |
Yes but...
I will get specific to better explain. In the case I have in mind, the stimuli are literally made for each participant: participants are presented with sounds played from different locations, and the sounds are recorded with microphones placed next to their ears so that the sound can be replayed to them in the scanner as if they were listening to sound coming from that very specific location. Each person has their own "head related transfer function" that filters the sound in a given way, so each participant has their own set of sounds. This is very much related to a given dataset, so in most cases it won't work in isolation from the data. But even if you "ship" the stimuli with the BIDS dataset, I am wondering if it would make sense to worry about this to the level of having an entity that "pairs" a stimulus to a subject. Sort of thinking this is in the 20% of our Pareto principle. |
My two cents: this feels to me like way more trouble than it's worth. Just give each stimulus a unique I also think this (i.e., stimulus naming/encoding) is a big and important enough problem it could easily be spun off into its own non-BIDS spec, and just be wrapped later by BIDS. |
ReCorDS (Research Corpus Data Structure)? |
as many files in BIDS are of the form |
Makes sense: when I wrote my last reply, I started thinking of an optional "intended for" or something equivalent instead of an entity
Agreed. That is definitely something where I would like to hear the opinion of the psych-DS folks for example. |
Thanks everyone! I think this discussion, along with #750, resonates also with the recent discussion of BEP032 where
I think with such generalization, it would allow for establishing BEPs like this, as easily as adding a few (if any) missing entities, and "vetting" a "new" hierarchy layout. With ongoing effort by @tsalo in formalizing the schema, any BIDS tool using that schema, would be able to immediately support such a "novel" layout. The interesting and important questions would be on what metadata to include. Sorry if I derailed a bit ;-) |
oh (sorry for the dump) - I just realized, that it generalizes very nicely for what many (myself included) were missing: per entity level specific metadata, and in general it is
where
BUT, it generalizes nicely into
so people come up with an ad-hoc cross-product of the two, with a session or session_id column holding either just the id or `ses-` values:
(git)smaug:/mnt/btrfs/datasets/datalad/crawl/openneuro[master]
$> grep -A2 session ds*/participants.tsv
ds001541/participants.tsv:participant_id session run1 run2 run3 run4 Viral_infusion_date MRI_acquisition_date weight group day_post_infusion gender viral_vector
ds001541/participants.tsv-562 2 33 100 66 n/a 2014-01-09 2014-03-10 30.8 exp 60 male ChR2-eYFP
ds001541/participants.tsv-562 1 100 g100 g100 100 2014-01-09 2014-03-11 30.6 exp 61 male ChR2-eYFP
--
ds001653/participants.tsv:participant_id session gender weight acquisition_date breathing_rate condition
ds001653/participants.tsv-sub-jgrAesAWc11R1L ses-1 f 20.6 2017-08-11 150 awake
ds001653/participants.tsv-sub-jgrAesAWc12R ses-1 f 22.4 2017-08-11 240 awake
--
ds001890/participants.tsv:participant_id session sex genotype Weight SpO2 HR Temperature DOB Experiment_Date Age
ds001890/participants.tsv-c1NT 1 M 3xTG 32.3 98 272 35.8 2016-11-22 2017-03-23 3
ds001890/participants.tsv-c1NT 2 M 3xTG 36.2 94 311 35.8 2016-11-22 2017-05-31 6
--
ds002134/participants.tsv:participant_id session genotype virus age sex Weight Temperature DOB Surgery_date Experiment_Date run-1 run-2 run-3 run-4
ds002134/participants.tsv-jgroptoAD100 1 C57BL/6 mCherry 3 M 30 36.3 2018-12-11 2019-04-01 2019-04-20 n/a 10 20 5
ds002134/participants.tsv-jgroptoAD101 1 C57BL/6 mCherry 3 M 29.6 36.6 2018-12-11 2019-04-01 2019-04-20 n/a 5 10 20
--
ds002154/participants.tsv:participant_id session gender condition weight Experiment_Date
ds002154/participants.tsv-1 1 m veh 29.3 2015-10-14
ds002154/participants.tsv-1 2 m psi05 29.3 2015-10-14
--
ds002307/participants.tsv:participant_id DOB rs-fMRI 1 rs-fMRI 2 rs-fMRI 3 rs-fMRI 4 rs-fMRI 5 rs-fMRI 6 rs-fMRI 7 excluded_rs-fMRI_sessions dMRI
ds002307/participants.tsv-Ey112 20160126 20160415 20160417 20160418 20160419 20160421 20160424 20160425 x 20160426
ds002307/participants.tsv-Ey113 20160126 20160415 20160417 20160418 20160419 20160421 20160424 20160425 x 20160426
--
ds002547/participants.tsv:participant_id sex age validation_session
ds002547/participants.tsv-sub-01 F 24.0 1.0
ds002547/participants.tsv-sub-02 M 21.0 1.0
--
ds002995/participants.tsv:participant_id weight age gender num_sessions
ds002995/participants.tsv-sub-007 68 24 F 1
ds002995/participants.tsv-sub-008 70 22 F 2
--
ds003416/participants.tsv:participant_id session_id sex age handedness
ds003416/participants.tsv-cIs1 s1Ax1 male 25 left
ds003416/participants.tsv-cIs1 s1Ax2 male 25 left
--
ds003464/participants.tsv:participant_id session genotype virus Experiment_Date sex weight delta_preference
ds003464/participants.tsv-jgroptoINS501 2 C57BL/6 ChR2-mCherry 2018-09-11 M 32 n/a
ds003464/participants.tsv-jgroptoINS503 2 C57BL/6 ChR2-mCherry 2018-07-25 M n/a 0.11
--
ds003470/participants.tsv:participant_id session_id age sex size weight
ds003470/participants.tsv-sub-01 ses-1 26 F 1.63 55
ds003470/participants.tsv-sub-02 ses-1 18 M 1.82 67
So overall generalization could be
|
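The pattern visible in the `grep` listing above (a session-like column in `participants.tsv`, often with the same `participant_id` repeated across rows) can also be checked programmatically. This is a minimal sketch, not part of any BIDS tool: the function names and the set of column names treated as "session-like" are assumptions taken from the headers shown in the listing.

```python
import csv
import io

# Column names treated as "session-like". This set is an assumption drawn
# from the headers in the listing above, not from any specification.
SESSION_COLUMNS = {"session", "session_id"}


def session_columns(tsv_text):
    """Return the session-like column names found in a participants.tsv header."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    header = next(reader, [])
    return [name for name in header if name.strip().lower() in SESSION_COLUMNS]


def has_repeated_participants(tsv_text):
    """Return True if any participant_id value occurs on more than one row."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    seen = set()
    for row in reader:
        pid = row.get("participant_id")
        if pid in seen:
            return True
        seen.add(pid)
    return False
```

A file for which `session_columns` is non-empty and `has_repeated_participants` is true is likely encoding per-session metadata in a per-participant table.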
I should have looked into @psych-ds earlier. Initiated some dialog on Psych-DS spec google doc. Indeed might align nicely if we could allow for different layout (not |
Psych-DS 'maintainer' here (we have a tech spec, no released validator software yet). Psych-DS is very firmly in the "BIDS-like" rather than "BIDS" category, and one of the main differences, at least in v1, is that we are not enforcing ordering of the key-value pairs in directory or filename structure. A possible use case would be the ability to take a BIDS dataset and "compile out" the behavioral task data, e.g. for an existing pipeline designed for out-of-scanner analysis of task data, or conversely, "compiling in" task data that's collected in a non-BIDS-like form but that is associated with BIDS data. Psych-DS is scoped primarily for behavioral data rather than stimuli, but I think there's no particular reason there couldn't be other clear parallels |
One point to note is that Psych-DS uses/will use JSON-LD metadata, i.e. Schema.org/Dataset. A stimulus set version of Psych-DS would probably want to use some other kind of combo of CreativeWork, ImageObject, etc. |
FWIW, added a stub for possible BIDS 2.0 development: bids-standard/bids-2-devel#54 |
As @Remi-Gau hinted by #695, we still lack total clarity on original stimuli storage and annotation.
We do have
- `stimuli/` folder which, like `sourcedata/`, is in no way "prescribed" a specific structure.
- `stim_file` column in `_events.tsv` to point to an (unregulated) location under `stimuli/`, and then populating that `stim_file` description/HED tags within `_events.json` (bless the inheritance principle).
- `_events.json` as possibly coming from some DB
- `_stim.tsv.gz` files for "signals related to the stimulus" (but not necessarily stimulus)

In respect to the first 3 items, and in conjunction with
I wondered if there is either an ongoing effort to standardize "stimuli datasets" so
- `stimuli/<name>`
- and avoiding the necessity to describe stimuli in `_events.json`, since information could be picked from their standardized layout

With that in mind I am even thinking such datasets could follow the BIDS mantra and just get "participant/subject" and `sub-` renamed to "stimulus"/`stim-`, and preserve `README.md`, `dataset_description.json`, `stimuli.tsv`, etc.

Worth a BEP/effort or maybe it is already a "solved problem"? ;) WDYT?
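
If `sub-` were simply renamed to `stim-`, filenames in such a dataset would stay parseable by the usual key-value logic. Below is a minimal sketch of that idea. The `stim` entity and the `image` suffix used in the usage note are hypothetical (neither is defined by BIDS), and `parse_entities` is an illustrative helper, not an existing API.

```python
def parse_entities(filename):
    """Split a BIDS-like filename into (entities, suffix, extension).

    Entities are the `key-value` chunks separated by underscores; the one
    chunk without a hyphen is taken as the suffix. Order of the key-value
    pairs does not matter to this parser.
    """
    # Everything after the first dot is the extension (handles ".tsv.gz").
    stem, dot, extension = filename.partition(".")
    entities = {}
    suffix = None
    for part in stem.split("_"):
        key, sep, value = part.partition("-")
        if sep:
            entities[key] = value
        else:
            suffix = part
    return entities, suffix, dot + extension
```

For example, `parse_entities("stim-face01_run-2_image.png")` yields the entities `stim=face01` and `run=2`, the suffix `image`, and the extension `.png`.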
Related:
- `_stim.{mp3,mp4,...}` alongside the neural data. Sidecar fields in that file could come in handy when instrumenting the specification of stimuli as presented from the shared-across-subjects `sourcedata/` (e.g. `StartTime`, possibly `TimeDrift` between scanner and stimuli delivery for a lengthy (an hour) presentation, etc.)