Defining derivatives in a chain #8
|
{
"PreprocessingChain": [
"downsampling": {
"description": "lorem ipsum",
"Sources": "bids::<source_entities>_<suffix>.<ext>",
"Anti-aliasing-filter": "<link to key-value pair in SoftwareFilters>",
"Method": "downsampling method (e.g., decimation --> taking every nth sample; or something else)",
"SamplingRate": 300
},
"filtering": {
...
}
]
} |
no "custom fields" {
"PreprocessingChain": [
{
"Description": "downsampling 250 hz", [MANDATORY]
"Sources": "bids::<source_entities>_<suffix>.<ext>", [MANDATORY]
"<some fixed, defined name (tbd) indicating more info here>": { [OPTIONAL]
"foo": "bar",
"SamplingRate": 300
}
},
{
"Description": "HP filtering 1hz",
...
}
]
} |
Note - if the processing in the chain updates fields in the original _ieeg.json, these fields should be updated or removed. |
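A minimal sketch of how a validator might check the array-style proposal above, assuming the key is called `PreprocessingChain` and that `Description` and `Sources` are the mandatory keys, as suggested in this comment. None of this is part of the BIDS specification; the function name `check_chain` and the second step's `Sources` value are hypothetical.

```python
import json

# Mandatory keys as proposed in the comment above (an assumption, not BIDS).
MANDATORY_KEYS = ("Description", "Sources")


def check_chain(sidecar: dict) -> list:
    """Return a list of human-readable problems; an empty list means OK."""
    chain = sidecar.get("PreprocessingChain")
    if not isinstance(chain, list):
        return ["PreprocessingChain must be a JSON array"]
    problems = []
    for index, step in enumerate(chain):
        if not isinstance(step, dict):
            problems.append(f"step {index}: must be a JSON object")
            continue
        for key in MANDATORY_KEYS:
            if key not in step:
                problems.append(f"step {index}: missing mandatory key {key!r}")
    return problems


good = json.loads("""
{
  "PreprocessingChain": [
    {"Description": "downsampling 250 hz",
     "Sources": "bids::<source_entities>_<suffix>.<ext>"},
    {"Description": "HP filtering 1hz",
     "Sources": "bids::<source_entities>_desc-preproc1_<suffix>.<ext>"}
  ]
}
""")
bad = {"PreprocessingChain": [{"Description": "HP filtering 1hz"}]}

print(check_chain(good))
print(check_chain(bad))
```

Because the chain is an array, the index reported in each message is also the position of the step in the processing order.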
"GeneratedBy": [
{
"Name": "downsampling", [MANDATORY]
"Description": "downsampling 250 hz", [OPTIONAL]
"Sources": "bids::<source_entities>_<suffix>.<ext>", [OPTIONAL]
"<some name indicating more info here>": { [OPTIONAL]
"foo": "bar",
"SamplingRate": 300
}
},
{
"Name": "filtering", [MANDATORY]
"Description": "HP filtering 1hz", [OPTIONAL]
...
}
] |
"<ProcessingChain>": [
{
"Name": "downsampling", [OPTIONAL]
"Description": "Downsampling data at 250hz", [OPTIONAL]
"Version": "0.1", [OPTIONAL]
"Container": { [OPTIONAL]
"foo": "bar",
"SamplingRate": 300
},
"Sources": ["bids:<raw>:sub-001/eeg/xxx_eeg.edf"]
},
{
"Name": "Filtering", [MANDATORY]
"Description": "HP filtering 1hz", [OPTIONAL]
"Sources": ["bids::sub-001/eeg/xxx_desc-downsample_eeg.edf"]
}
] |
This file should be named xxx_desc-filtered+downsampled+ICA+epoch_eeg.prov.jsonld and sit next to xxx_desc-filtered+downsampled+ICA+epoch_eeg.set.
Our recommendation: use JSON, and have the validator ensure that the file is compatible (i.e., that it can be converted to a graph).
{
"@context": "https://raw.githubusercontent.com/bids-standard/BEP028_BIDSprov/master/context.json",
"BIDSProvVersion": "dev",
"records": {
"prov:Agent": [
{
"label": "EEGLAB",
"version": "v2023"
}
],
"prov:Activity": [
{
"@id": "xxxx1",
"Label": "filtering the data at 0.1 Hz",
"Used": "bids:<raw>:sub-001/eeg/xxx_eeg.edf"
},
{
"@id": "xxxx2",
"Label": "downsampling the data at 250 Hz",
"Used": "bids::sub-001/eeg/xxx_desc-filtered_eeg.set"
},
{
"@id": "xxxx3",
"Label": "running ICA using Picard",
"Used": "bids::sub-001/eeg/xxx_desc-filtered+downsampled_eeg.set"
},
{
"@id": "xxxx4",
"Label": "extracting epochs from -500 ms to 1000 ms",
"Used": "bids::sub-001/eeg/xxx_desc-filtered+downsampled+ICA_eeg.set"
}
],
"prov:Entity": [
{
"AtLocation": "bids::sub-001/eeg/xxx_desc-filtered_eeg.set",
"GeneratedBy": "xxxx1"
},
{
"AtLocation": "bids::sub-001/eeg/xxx_desc-filtered+downsampled_eeg.set",
"GeneratedBy": "xxxx2"
},
{
"AtLocation": "bids::sub-001/eeg/xxx_desc-filtered+downsampled+ICA_eeg.set",
"GeneratedBy": "xxxx3"
},
{
"AtLocation": "bids::sub-001/eeg/xxx_desc-filtered+downsampled+ICA+epoch_eeg.set",
"GeneratedBy": "xxxx4"
}
]
}
} |
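To illustrate the "can be converted to a graph" point, here is a rough sketch that recovers the order of the steps from the `records` block. It assumes a strictly linear chain (each activity uses one file and generates one file) and the key names used in this example (`Used`, `AtLocation`, `GeneratedBy`), which may not match the final BIDS-Prov vocabulary. The function name `chain_order` is hypothetical.

```python
def chain_order(records: dict) -> list:
    """Return activity labels in execution order for a linear chain."""
    activities = {a["@id"]: a for a in records["prov:Activity"]}
    # Map each generated file to the id of the activity that produced it.
    generated_by = {e["AtLocation"]: e["GeneratedBy"] for e in records["prov:Entity"]}
    # Inverse map: activity id -> file it produced.
    output_of = {act_id: location for location, act_id in generated_by.items()}

    # The first activity is the one whose input was not generated in this file.
    starts = [a for a in activities.values() if a["Used"] not in generated_by]
    if len(starts) != 1:
        raise ValueError(f"expected exactly one starting activity, found {len(starts)}")

    ordered, seen = [], set()
    current = starts[0]
    while current is not None and current["@id"] not in seen:
        seen.add(current["@id"])
        ordered.append(current["Label"])
        output = output_of.get(current["@id"])
        followers = [a for a in activities.values()
                     if output is not None and a["Used"] == output]
        current = followers[0] if followers else None
    return ordered


# Deliberately listed out of order; the order is recovered from the links.
records = {
    "prov:Activity": [
        {"@id": "xxxx2", "Label": "downsampling the data at 250 Hz",
         "Used": "bids::sub-001/eeg/xxx_desc-filtered_eeg.set"},
        {"@id": "xxxx1", "Label": "filtering the data at 0.1 Hz",
         "Used": "bids:<raw>:sub-001/eeg/xxx_eeg.edf"},
    ],
    "prov:Entity": [
        {"AtLocation": "bids::sub-001/eeg/xxx_desc-filtered_eeg.set",
         "GeneratedBy": "xxxx1"},
        {"AtLocation": "bids::sub-001/eeg/xxx_desc-filtered+downsampled_eeg.set",
         "GeneratedBy": "xxxx2"},
    ],
}
print(chain_order(records))
```

Note that the links only resolve if each `Used` value is spelled exactly like the `AtLocation` of the file it refers to.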
After a discussion with Camille and Dora, this is what it could look like. Mandatory fields (command, etc.) are not included, so this is not even compliant with the beta version of BIDS provenance.
{
"@context": "https://purl.org/nidash/bidsprov/context.json",
"BIDSProvVersion": "1.0.0",
"@id": "bids:<raw>:sub-001/eeg/xxx_desc-filtered+downsampled+rereferenced_eeg.edf",
"wasGeneratedBy": {
"Label": "Average referencing the data"
},
"wasAssociatedWith": {
"Label": "EEGLAB",
"Version": 1
},
"used": {
"@id": "bids::sub-001/eeg/xxx_desc-filtered+downsampled_eeg.edf",
"wasGeneratedBy": {
"Label": "Downsampling the data at 250 Hz"
},
"used": {
"@id": "bids::sub-001/eeg/xxx_desc-filtered_eeg.edf",
"wasGeneratedBy": {
"Label": "High pass filtering the data at 1 Hz"
},
"used": {
"@id": "bids:<raw>:sub-001/eeg/xxx_eeg.edf"
}
}
}
} |
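For readers who prefer a flat list, the nested `used` structure above can be unrolled with a few lines of Python. This is only a sketch: it assumes each level has at most one `used` entry and uses the key names from this example. The function name `flatten_used` is hypothetical.

```python
from typing import List, Optional, Tuple


def flatten_used(node: Optional[dict]) -> List[Tuple[str, Optional[str]]]:
    """Unroll nested "used" entries into (file, label) pairs, raw file first."""
    steps = []
    while node is not None:
        # The raw file has no "wasGeneratedBy", so its label is None.
        label = node.get("wasGeneratedBy", {}).get("Label")
        steps.append((node["@id"], label))
        node = node.get("used")
    steps.reverse()
    return steps


doc = {
    "@id": "bids::sub-001/eeg/xxx_desc-filtered+downsampled_eeg.edf",
    "wasGeneratedBy": {"Label": "Downsampling the data at 250 Hz"},
    "used": {
        "@id": "bids::sub-001/eeg/xxx_desc-filtered_eeg.edf",
        "wasGeneratedBy": {"Label": "High pass filtering the data at 1 Hz"},
        "used": {"@id": "bids:<raw>:sub-001/eeg/xxx_eeg.edf"},
    },
}

for location, label in flatten_used(doc):
    print(location, "<-", label)
```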
Trying to visualize this via: https://github.com/bids-standard/BEP028_BIDSprov/tree/31c53505a7ebd16ede936720a8f114cd117d24e3/bids_prov#notes
|
|
I'm working on the guidelines, which mention that JSON-LD is not mandatory. Still, if it is not used, we need to document the chain in simple terms. The discussion was along the lines of
|
I think it is still in flux mostly because BIDS-provenance is not finalized. The consensus was that we would work with BIDS-provenance people to make the format simpler to use.
Arno
|
so no more preproc.json? |
Anything more than that can use prov.
|
Yes, no more, but Robert can comment
|
There would indeed not be a Machine readable details about the processing go in the |
ah yes thx! |
I have a FieldTrip example that shows what it would look like, although I did not add the |
I wish I was with you guys :-( |
The example is available from https://surfdrive.surf.nl/files/index.php/s/M9KiX2r9DcW7ujI The It does not yet include the |
In https://bids.neuroimaging.io/bep023 (PET) I used the same approach, but I also capture the chain, again in a way that is not full provenance, thanks to free text
|
@robertoostenveld since one uses desc- I'm guessing 1st column should be desc-id and not description_id (as in your file) |
As it is For easy access to all, my
The actual order of the steps (preproc, avg, planar, combined) cannot yet be derived from the filenames or |
Note: as a follow-up to the BIDS-Prov examples we worked on together in Copenhagen, an updated version is now available in the BIDS-Prov repo: https://github.com/bids-standard/BEP028_BIDSprov/blob/master/examples/simple_example/simple_example.prov.jsonld |
NOTE: This stems from the BIDS derivatives workshop, Copenhagen 2023
How to define derivatives in a "chain"?
Usually this is done via the Sources metadata. However, it might be nice to document the processing chain more explicitly, for example in the entities:
<source_entities>_desc-downsampled_<suffix>.<ext>
<source_entities>_desc-downsampled+filtered_<suffix>.<ext>
--> problem: this would result in very long filenames fairly quickly.
Alternatively, one could have much shorter labels for the preprocessing steps:
<source_entities>_desc-ds+filt_<suffix>.<ext>
--> problem: short labels like "ds" may be too general (i.e., take up a lot of "namespace")
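As a rough illustration of the `+`-joined labels discussed above, the steps could be read back from a filename like this. The `+` separator is only the convention floated in this thread, and the helper name `desc_steps` and the example filename are hypothetical.

```python
def desc_steps(filename: str) -> list:
    """Split the desc- entity of a filename into its '+'-joined steps."""
    stem = filename.split(".", 1)[0]  # drop the extension(s)
    for entity in stem.split("_"):
        if entity.startswith("desc-"):
            return entity[len("desc-"):].split("+")
    return []  # no desc- entity present


print(desc_steps("sub-001_task-rest_desc-downsampled+filtered_eeg.edf"))
print(desc_steps("sub-001_task-rest_eeg.edf"))
```

This recovers the order of the steps from the filename alone, which is the attraction of the approach. The cost is the filename length and namespace problems noted above.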
Another alternative would be to have "generic" desc labels with an "inbuilt" index (label1, label2, etc.):
<source_entities>_desc-preproc1_<suffix>.<ext>
<source_entities>_desc-preproc2_<suffix>.<ext>
<source_entities>_desc-preproc3_<suffix>.<ext>
paired with a:
<source_entities>_<suffix>.json
that is organized as:
--> Important note: order in a JSON object should have a meaning ... if it doesn't (to be clarified), we might need to use a JSON array.
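On the ordering question: the JSON specification (RFC 8259) describes an object as an unordered collection of name/value pairs and an array as an ordered sequence, so an array is the safer container for a chain. The sketch below builds such an array for the indexed `desc-preprocN` scheme. The `Output` key, the function name `indexed_chain`, and the example entities are hypothetical and only serve the illustration.

```python
import json


def indexed_chain(source_entities: str, suffix: str, ext: str, descriptions: list) -> list:
    """Build an ordered list of steps, each pointing at the previous output."""
    entries = []
    previous = f"bids::{source_entities}_{suffix}.{ext}"
    for index, description in enumerate(descriptions, start=1):
        filename = f"{source_entities}_desc-preproc{index}_{suffix}.{ext}"
        entries.append({
            "Description": description,
            "Sources": previous,
            "Output": filename,  # hypothetical key, for illustration only
        })
        previous = f"bids::{filename}"
    return entries


entries = indexed_chain("sub-01", "eeg", "edf",
                        ["downsampling 250 hz", "HP filtering 1hz"])
print(json.dumps(entries, indent=2))
```

Each entry's position in the array matches the index in its `desc-preprocN` label, so the order survives any JSON round trip.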
More complete example: