From 6d7eb0fa8e40e481440976a353bb02d451935db5 Mon Sep 17 00:00:00 2001
From: Taylor Salo
Date: Tue, 27 Jun 2023 04:55:56 -0400
Subject: [PATCH] [SCHEMA] Add full object definitions for valid values in schema (#919)

* Add some descriptions. Mostly just TODOs for now.
* Start rendering value restrictions in descriptions.
* Fix.
* Revert rendering. Will move to separate PR.
* Run black.
* Add more enum descriptions.
* Add nonstandard and non-template coordinate systems.
  I'm not 100% sure about this one. I grabbed values from the coordinate systems appendix.
* Fix linting issues.
* Run prettier.
* Make the expanded enums into objects.
* Fix up rendering, hopefully.
* Update metadata.yaml
* Update render.py
* whoops.
* Update render.py
* Whoops!
* Finish resolving conflicts.
* Move enums to a separate file.
* Fix on and off.
* Remove duplicate keys.
* Start adding channel type values.
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
* Add EEG columns.
* Remove empty entries.
* Add MEG channels.
* Add NIRS channels.
* Try fixing the formatting.
* Fix.
* OTHER sex isn't the same as OTHER channel type.
* Consolidate channels enums across datatypes.
* Replace name with value.
* Replicate #1478.
* Add .value to value refs.
* Add tags to channel values.
* ENH: Allow for dereferencing individual values in schema
* Rename values to enum.
* Rename enum to enum_values.
* Fix test.
* Rename and move enums around.
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
* Update columns.yaml
* Document new sub-namespace.
* Fix table.
* Remove unused enum groups.
* Apply suggestions from code review
  Co-authored-by: Chris Markiewicz
* Update the other direction enum defs.

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Chris Markiewicz
Co-authored-by: Stefan Appelhoff
---
 src/schema/README.md | 14 +-
 src/schema/objects/columns.yaml | 199 +--
 src/schema/objects/entities.yaml | 16 +-
 src/schema/objects/enums.yaml | 1089 +++++++++++++++++
 src/schema/objects/metadata.yaml | 236 ++--
 src/schema/rules/files/raw/meg.yaml | 2 +
 src/schema/rules/tabular_data/eeg.yaml | 4 +-
 src/schema/rules/tabular_data/ieeg.yaml | 4 +-
 src/schema/rules/tabular_data/meg.yaml | 4 +-
 src/schema/rules/tabular_data/motion.yaml | 4 +-
 src/schema/rules/tabular_data/nirs.yaml | 4 +-
 .../schemacode/bidsschematools/render/text.py | 44 +-
 .../bidsschematools/render/utils.py | 8 +-
 tools/schemacode/bidsschematools/rules.py | 9 +-
 tools/schemacode/bidsschematools/schema.py | 38 +-
 .../bidsschematools/tests/test_schema.py | 33 +
 16 files changed, 1356 insertions(+), 352 deletions(-)
 create mode 100644 src/schema/objects/enums.yaml

diff --git a/src/schema/README.md b/src/schema/README.md
index 94cd868d7e..b61740bbcc 100644
--- a/src/schema/README.md
+++ b/src/schema/README.md
@@ -336,7 +336,7 @@ or whether objects are required in a given dataset or file.
 
 ### Overview
 
-There are currently 11 sub-namespaces, which fall into five rough categories.
+There are currently 12 sub-namespaces, which fall into five rough categories.
 
 The namespaces are:
 
@@ -352,6 +352,7 @@ The namespaces are:
 | `objects.extensions` | Filename component that describe the format of the file | Value terms |
 | `objects.formats` | Terms that define the forms values (for example, in metadata) might take | Formats |
 | `objects.files` | Files and directories that may appear at the root of a dataset | Files |
+| `objects.enums` | Full descriptions of enumerated values used in other sub-namespaces | Value terms |
 
 Because these objects vary, the contents of each namespace can vary.
 
@@ -554,6 +555,13 @@ The convention can be summed up in the following rules: | `description` | Term definition | | `file_type` | Indicator that the file is a regular file (`"regular"`) or directory (`"directory"`) | +- `objects.enums` + | Field | Description | + | -------------- | ---------------------- | + | `display_name` | Human-friendly name | + | `description` | Term definition | + | `value` | String value of `enum` | + ## Rule files The `rules.*` namespace contains most of the validatable content of the schema, @@ -910,11 +918,11 @@ EEGChannels: - extension == ".tsv" initial_columns: - name__channels - - type__eeg_channels + - type__channels - units columns: name__channels: required - type__eeg_channels: required + type__channels: required units: required description: optional sampling_frequency: optional diff --git a/src/schema/objects/columns.yaml b/src/schema/objects/columns.yaml index b4b32c41a5..bd1ab1478e 100644 --- a/src/schema/objects/columns.yaml +++ b/src/schema/objects/columns.yaml @@ -183,8 +183,8 @@ hemisphere: The hemisphere in which the electrode is placed. type: string enum: - - L - - R + - $ref: objects.enums.left_hemisphere.value + - $ref: objects.enums.right_hemisphere.value high_cutoff: name: high_cutoff display_name: High cutoff @@ -603,142 +603,7 @@ trigger: continuous measurement of the scanner trigger signal type: number # type column in channels.tsv files -type__eeg_channels: - name: type - display_name: Channel type - description: | - Type of channel; MUST use the channel types listed below. - Note that the type MUST be in upper-case. - type: string - enum: - - AUDIO - - EEG - - EOG - - ECG - - EMG - - EYEGAZE - - GSR - - HEOG - - MISC - - PPG - - PUPIL - - REF - - RESP - - SYSCLOCK - - TEMP - - TRIG - - VEOG -type__meg_channels: - name: type - display_name: Channel type - description: | - Type of channel; MUST use the channel types listed below. - Note that the type MUST be in upper-case. 
- type: string - enum: - - MEGMAG - - MEGGRADAXIAL - - MEGGRADPLANAR - - MEGREFMAG - - MEGREFGRADAXIAL - - MEGREFGRADPLANAR - - MEGOTHER - - EEG - - ECOG - - SEEG - - DBS - - VEOG - - HEOG - - EOG - - ECG - - EMG - - TRIG - - AUDIO - - PD - - EYEGAZE - - PUPIL - - MISC - - SYSCLOCK - - ADC - - DAC - - HLU - - FITERR - - OTHER -type__ieeg_channels: - name: type - display_name: Channel type - description: | - Type of channel; MUST use the channel types listed below. - Note that the type MUST be in upper-case. - type: string - enum: - - EEG - - ECOG - - SEEG - - DBS - - VEOG - - HEOG - - EOG - - ECG - - EMG - - TRIG - - AUDIO - - PD - - EYEGAZE - - PUPIL - - MISC - - SYSCLOCK - - ADC - - DAC - - REF - - OTHER -type__nirs_channels: - name: type - display_name: Channel type - description: | - Type of channel; MUST use the channel types listed below. - Note that the type MUST be in upper-case. - type: string - enum: - - NIRSCWAMPLITUDE - - NIRSCWFLUORESCENSEAMPLITUDE - - NIRSCWOPTICALDENSITY - - NIRSCWHBO - - NIRSCWHBR - - NIRSCWMUA - - MEGMAG - - MEGGRADAXIAL - - MEGGRADPLANAR - - MEGREFMAG - - MEGREFGRADAXIAL - - MEGREFGRADPLANAR - - MEGOTHER - - EEG - - ECOG - - SEEG - - DBS - - VEOG - - HEOG - - EOG - - ECG - - EMG - - TRIG - - AUDIO - - PD - - EYEGAZE - - PUPIL - - MISC - - SYSCLOCK - - ADC - - DAC - - HLU - - FITERR - - ACCEL - - GYRO - - MAGN - - MISC - - OTHER -type__motion_channels: +type__channels: name: type display_name: Channel type description: | @@ -746,16 +611,54 @@ type__motion_channels: Note that the type MUST be in upper-case. 
type: string enum: - - ACCEL - - ANGACCEL - - GYRO - - JNTANG - - LATENCY - - MAGN - - MISC - - ORNT - - POS - - VEL + - $ref: objects.enums.ACCEL.value + - $ref: objects.enums.ADC.value + - $ref: objects.enums.ANGACCEL.value + - $ref: objects.enums.AUDIO.value + - $ref: objects.enums.DAC.value + - $ref: objects.enums.DBS.value + - $ref: objects.enums.ECG.value + - $ref: objects.enums.ECOG.value + - $ref: objects.enums.EEG.value + - $ref: objects.enums.EMG.value + - $ref: objects.enums.EOG.value + - $ref: objects.enums.EYEGAZE.value + - $ref: objects.enums.FITERR.value + - $ref: objects.enums.GSR.value + - $ref: objects.enums.GYRO.value + - $ref: objects.enums.HEOG.value + - $ref: objects.enums.HLU.value + - $ref: objects.enums.JNTANG.value + - $ref: objects.enums.LATENCY.value + - $ref: objects.enums.MAGN.value + - $ref: objects.enums.MEGGRADAXIAL.value + - $ref: objects.enums.MEGGRADPLANAR.value + - $ref: objects.enums.MEGMAG.value + - $ref: objects.enums.MEGOTHER.value + - $ref: objects.enums.MEGREFGRADAXIAL.value + - $ref: objects.enums.MEGREFGRADPLANAR.value + - $ref: objects.enums.MEGREFMAG.value + - $ref: objects.enums.MISC.value + - $ref: objects.enums.NIRSCWAMPLITUDE.value + - $ref: objects.enums.NIRSCWFLUORESCENSEAMPLITUDE.value + - $ref: objects.enums.NIRSCWHBO.value + - $ref: objects.enums.NIRSCWHBR.value + - $ref: objects.enums.NIRSCWMUA.value + - $ref: objects.enums.NIRSCWOPTICALDENSITY.value + - $ref: objects.enums.ORNT.value + - $ref: objects.enums.OTHER.value + - $ref: objects.enums.PD.value + - $ref: objects.enums.POS.value + - $ref: objects.enums.PPG.value + - $ref: objects.enums.PUPIL.value + - $ref: objects.enums.REF.value + - $ref: objects.enums.RESP.value + - $ref: objects.enums.SEEG.value + - $ref: objects.enums.SYSCLOCK.value + - $ref: objects.enums.TEMP.value + - $ref: objects.enums.TRIG.value + - $ref: objects.enums.VEL.value + - $ref: objects.enums.VEOG.value # type column for electrodes.tsv files type__electrodes: name: type diff --git 
a/src/schema/objects/entities.yaml b/src/schema/objects/entities.yaml
index 83e1fa0826..3260b4a3d4 100644
--- a/src/schema/objects/entities.yaml
+++ b/src/schema/objects/entities.yaml
@@ -132,8 +132,8 @@ hemisphere:
   type: string
   format: label
   enum:
-    - 'L'
-    - 'R'
+    - $ref: objects.enums.left_hemisphere.value
+    - $ref: objects.enums.right_hemisphere.value
 inversion:
   name: inv
   display_name: Inversion Time
@@ -184,8 +184,8 @@ mtransfer:
   type: string
   format: label
   enum:
-    - 'on'
-    - 'off'
+    - $ref: objects.enums.on__mtransfer.value
+    - $ref: objects.enums.off__mtransfer.value
 part:
   name: part
   display_name: Part
@@ -207,10 +207,10 @@ part:
   type: string
   format: label
   enum:
-    - mag
-    - phase
-    - real
-    - imag
+    - $ref: objects.enums.magnitude.value
+    - $ref: objects.enums.phase.value
+    - $ref: objects.enums.real.value
+    - $ref: objects.enums.imaginary.value
 processing:
   name: proc
   display_name: Processed (on device)
diff --git a/src/schema/objects/enums.yaml b/src/schema/objects/enums.yaml
new file mode 100644
index 0000000000..16b2e6c618
--- /dev/null
+++ b/src/schema/objects/enums.yaml
@@ -0,0 +1,1089 @@
+---
+# This file defines valid values in BIDS key-value pairs.
+_EEGCoordSys: + type: string + enum: + - $ref: objects.enums.CapTrak.value + - $ref: objects.enums.EEGLAB.value + - $ref: objects.enums.EEGLAB-HJ.value + - $ref: objects.enums.Other.value +_GeneticLevelEnum: + type: string + enum: + - $ref: objects.enums.Genetic.value + - $ref: objects.enums.Genomic.value + - $ref: objects.enums.Epigenomic.value + - $ref: objects.enums.Transcriptomic.value + - $ref: objects.enums.Metabolomic.value + - $ref: objects.enums.Proteomic.value +_MEGCoordSys: + type: string + enum: + - $ref: objects.enums.CTF.value + - $ref: objects.enums.ElektaNeuromag.value + - $ref: objects.enums.4DBti.value + - $ref: objects.enums.KitYokogawa.value + - $ref: objects.enums.ChietiItab.value + - $ref: objects.enums.Other.value +_StandardTemplateCoordSys: + type: string + enum: + - $ref: objects.enums.ICBM452AirSpace.value + - $ref: objects.enums.ICBM452Warp5Space.value + - $ref: objects.enums.IXI549Space.value + - $ref: objects.enums.fsaverage.value + - $ref: objects.enums.fsaverageSym.value + - $ref: objects.enums.fsLR.value + - $ref: objects.enums.MNIColin27.value + - $ref: objects.enums.MNI152Lin.value + - $ref: objects.enums.MNI152NLin2009aSym.value + - $ref: objects.enums.MNI152NLin2009bSym.value + - $ref: objects.enums.MNI152NLin2009cSym.value + - $ref: objects.enums.MNI152NLin2009aAsym.value + - $ref: objects.enums.MNI152NLin2009bAsym.value + - $ref: objects.enums.MNI152NLin2009cAsym.value + - $ref: objects.enums.MNI152NLin6Sym.value + - $ref: objects.enums.MNI152NLin6ASym.value + - $ref: objects.enums.MNI305.value + - $ref: objects.enums.NIHPD.value + - $ref: objects.enums.OASIS30AntsOASISAnts.value + - $ref: objects.enums.OASIS30Atropos.value + - $ref: objects.enums.Talairach.value + - $ref: objects.enums.UNCInfant.value +_StandardTemplateDeprecatedCoordSys: + type: string + enum: + - $ref: objects.enums.fsaverage3.value + - $ref: objects.enums.fsaverage4.value + - $ref: objects.enums.fsaverage5.value + - $ref: objects.enums.fsaverage6.value + - 
$ref: objects.enums.fsaveragesym.value + - $ref: objects.enums.UNCInfant0V21.value + - $ref: objects.enums.UNCInfant1V21.value + - $ref: objects.enums.UNCInfant2V21.value + - $ref: objects.enums.UNCInfant0V22.value + - $ref: objects.enums.UNCInfant1V22.value + - $ref: objects.enums.UNCInfant2V22.value + - $ref: objects.enums.UNCInfant0V23.value + - $ref: objects.enums.UNCInfant1V23.value + - $ref: objects.enums.UNCInfant2V23.value +_iEEGCoordSys: + type: string + enum: + - $ref: objects.enums.Pixels.value + - $ref: objects.enums.ACPC.value + - $ref: objects.enums.ScanRAS.value + - $ref: objects.enums.Other.value +left_hemisphere: + value: 'L' + display_name: Left Hemisphere + description: | + A left hemibrain image. +right_hemisphere: + value: 'R' + display_name: Right Hemisphere + description: | + A right hemibrain image. +CASL: + value: CASL + display_name: Continuous arterial spin labeling + description: | + Continuous arterial spin labeling was employed. +PCASL: + value: PCASL + display_name: Pseudo-continuous arterial spin labeling + description: | + Pseudo-continuous arterial spin labeling was employed. +PASL: + value: PASL + display_name: Pulsed arterial spin labeling + description: | + Pulsed arterial spin labeling was employed. +Separate: + value: Separate + display_name: Separate + description: | + A separate `m0scan` file is present. +Included: + value: Included + display_name: Included + description: | + An m0scan volume is contained within the associated `asl` file. +Estimate: + value: Estimate + display_name: Estimate + description: | + A single whole-brain M0 value is provided in the metadata. +Absent: + value: Absent + display_name: Absent + description: | + No specific M0 information is present. +TwoD: + value: 2D + display_name: Two-dimensional + description: | + Two-dimensional MR acquisition. +ThreeD: + value: 3D + display_name: Three-dimensional + description: | + Three-dimensional MR acquisition. 
+HARD: + value: HARD + display_name: Hard pulse + description: | + A very brief, strong, rectangular pulse. +GAUSSIAN: + value: GAUSSIAN + display_name: Gaussian pulse + description: | + A Gaussian pulse. +GAUSSHANN: + value: GAUSSHANN + display_name: Gaussian-Hanning pulse. + description: | + A Gaussian pulse with a Hanning window. +SINC: + value: SINC + display_name: Sinc pulse + description: | + A sinc-shaped pulse. +SINCHANN: + value: SINCHANN + display_name: Sinc-Hanning pulse + description: | + A sinc-shaped pulse with a Hanning window. +SINCGAUSS: + value: SINCGAUSS + display_name: Sinc-Gauss pulse + description: | + A sinc-shaped pulse with a Gaussian window. +FERMI: + value: FERMI + display_name: Fermi pulse + description: | + A Fermi-shaped pulse. +i: + value: i + display_name: i + description: | + The encoding direction is along the first axis of the data in the NIFTI file, + and the encoding value increases from the zero index to the maximum index. +j: + value: j + display_name: j + description: | + The encoding direction is along the second axis of the data in the NIFTI file, + and the encoding value increases from the zero index to the maximum index. +k: + value: k + display_name: k + description: | + The encoding direction is along the third axis of the data in the NIFTI file, + and the encoding value increases from the zero index to the maximum index. +iMinus: + value: i- + display_name: i- + description: | + The encoding direction is along the first axis of the data in the NIFTI file, + and the encoding value decreases from the zero index to the maximum index. +jMinus: + value: j- + display_name: j- + description: | + The encoding direction is along the second axis of the data in the NIFTI file, + and the encoding value decreases from the zero index to the maximum index. 
+kMinus: + value: k- + display_name: k- + description: | + The encoding direction is along the third axis of the data in the NIFTI file, + and the encoding value decreases from the zero index to the maximum index. +continuous: + value: continuous + display_name: Continuous recording + description: | + Continuous recording. +epoched: + value: epoched + display_name: Epoched recording + description: | + Recording is limited to time windows around events of interest + (for example, stimulus presentations or subject responses). +discontinuous: + value: discontinuous + display_name: Discontinuous recording + description: | + Discontinuous recording. +orig: + value: orig + display_name: orig + description: | + A (potentially unique) per-image space. + Useful for describing the source of transforms from an input image to a target space. +Brain: + value: Brain + display_name: Brain mask + description: | + A brain mask. +Lesion: + value: Lesion + display_name: Lesion mask + description: | + A lesion mask. +Face: + value: Face + display_name: Face mask + description: | + A face mask. +ROI: + value: ROI + display_name: ROI mask + description: | + A region of interest mask. +CapTrak: + value: CapTrak + display_name: CapTrak + description: | + RAS orientation and the origin approximately between LPA and RPA +EEGLAB: + value: EEGLAB + display_name: EEGLAB + description: | + ALS orientation and the origin exactly between LPA and RPA. + For more information, see the + [EEGLAB wiki page](https://eeglab.org/tutorials/ConceptsGuide/coordinateSystem.html#eeglab-coordinate-system). +EEGLAB-HJ: + value: EEGLAB-HJ + display_name: EEGLAB-HJ + description: | + ALS orientation and the origin exactly between LHJ and RHJ. + For more information, see the + [EEGLAB wiki page](https://eeglab.org/tutorials/ConceptsGuide/coordinateSystem.html#\ + eeglab-hj-coordinate-system). +Other: + value: Other + display_name: Other + description: | + Other coordinate system. 
+Genetic: + value: Genetic + display_name: Genetic + description: | + Data report on a single genetic location (typically directly in the `participants.tsv` file). +Genomic: + value: Genomic + display_name: Genomic + description: | + Data link to participants' genome (multiple genetic locations). +Epigenomic: + value: Epigenomic + display_name: Epigenomic + description: | + Data link to participants' characterization of reversible modifications of DNA. +Transcriptomic: + value: Transcriptomic + display_name: Transcriptomic + description: | + Data link to participants RNA levels. +Metabolomic: + value: Metabolomic + display_name: Metabolomic + description: | + Data link to participants' products of cellular metabolic functions. +Proteomic: + value: Proteomic + display_name: Proteomic + description: | + Data link to participants peptides and proteins quantification. +CTF: + value: CTF + display_name: CTF + description: | + ALS orientation and the origin between the ears. +ElektaNeuromag: + value: ElektaNeuromag + display_name: Elekta Neuromag + description: | + RAS orientation and the origin between the ears. +4DBti: + value: 4DBti + display_name: 4D BTI + description: | + ALS orientation and the origin between the ears. +KitYokogawa: + value: KitYokogawa + display_name: KIT/Yokogawa + description: | + ALS orientation and the origin between the ears. +ChietiItab: + value: ChietiItab + display_name: Chieti ITAB + description: | + RAS orientation and the origin between the ears. +individual: + value: individual + display_name: individual + description: | + Participant specific anatomical space (for example derived from T1w and/or T2w images). + This coordinate system requires specifying an additional, participant-specific file to be fully defined. + In context of surfaces this space has been referred to as `fsnative`. + + In order for this space to be interpretable, `SpatialReference` metadata MUST be provided. 
+study: + value: study + display_name: study + description: | + Custom space defined using a group/study-specific template. + This coordinate system requires specifying an additional file to be fully defined. + + In order for this space to be interpretable, `SpatialReference` metadata MUST be provided. +scanner: + value: scanner + display_name: scanner + description: | + The intrinsic coordinate system of the original image (the first entry of `RawSources`) + after reconstruction and conversion to NIfTI or equivalent for the case of surfaces and dual volume/surface + files. + + The `scanner` coordinate system is implicit and assumed by default if the derivative filename does not + define **any** `space-