bids-standard · Remi-Gau · May 6, 2024 · May 7, 2024
@@ -234,15 +234,13 @@ In some cases, this principle is enforced in the BIDS validator.
 ## Source vs. raw vs. derived data
 
 BIDS was originally designed to describe and apply consistent naming conventions
-to raw (unprocessed or minimally processed due to file format conversion) data.
+to [raw datasets](./glossary.md#raw-common_principles) (unprocessed or minimally processed due to file format conversion).
 During analysis such data will be transformed and partial as well as final results
 will be saved.
-Derivatives of the raw data (other than products of DICOM to NIfTI conversion)
-MUST be kept separate from the raw data. This way one can protect the raw data
-from accidental changes by file permissions. In addition it is easy to
-distinguish partial results from the raw data and share the latter.
-See [Storage of derived datasets](#storage-of-derived-datasets) for more on
-organizing derivatives.
+[Derivatives](./glossary.md#derivative-common_principles) of the raw data MUST be kept separate from the raw data.
+This way one can protect the raw data from accidental changes by file permissions.
+In addition it is easy to distinguish partial results from the raw data and share the latter.
+See [Storage of derived datasets](#storage-of-derived-datasets) for more on organizing derivatives.
 
 Similar rules apply to source data, which is defined as data
 before harmonization, reconstruction, and/or file format conversion
@@ -340,12 +338,10 @@ field in `dataset_description.json` of each subdirectory of `derivatives` to:
 Derivatives can be stored/distributed in two ways:
 
 1.  Under a `derivatives/` subdirectory in the root of the source BIDS dataset
-    directory to make a clear distinction between raw data and results of data
-    processing.
+    directory to make a clear distinction between raw data and results of data processing.
     A data processing pipeline will typically have a dedicated directory
     under which it stores all of its outputs.
-    Different components of a pipeline can, however, also be stored under different
-    subdirectories.
+    Different components of a pipeline can, however, also be stored under different subdirectories.
     There are few restrictions on the directory names;
     it is RECOMMENDED to use the format `<pipeline>-<variant>` in cases where
     it is anticipated that the same pipeline will output more than one variant
@@ -377,11 +373,10 @@ Derivatives can be stored/distributed in two ways:
     <dataset>/derivatives/spm-preproc/derivatives/spm-stats/sub-0001
     ```
 
-1.  As a standalone dataset independent of the source (raw or derived) BIDS
-    dataset.
-    This way of specifying derivatives is particularly useful when the source
-    dataset is provided with read-only access, for publishing derivatives as
-    independent bodies of work, or for describing derivatives that were created
+1.  As a standalone dataset independent of the source (raw or derived) BIDS dataset.
+    This way of specifying derivatives is particularly useful when the source dataset
+    is provided with read-only access, for publishing derivatives as independent bodies of work,
+    or for describing derivatives that were created
     from more than one source dataset.
     The `sourcedata/` subdirectory MAY be used to include the source dataset(s)
     that were used to generate the derivatives.
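
The nested layout shown above can be illustrated with a small path-building sketch. It is a minimal illustration, not part of the BIDS specification or of any BIDS tool: the helper name `derivative_dir` and its parameters are hypothetical, and it only assembles the RECOMMENDED `<pipeline>-<variant>` directory name under `derivatives/`.

```python
from pathlib import PurePosixPath
from typing import Optional


def derivative_dir(dataset_root, pipeline: str, variant: Optional[str] = None) -> PurePosixPath:
    """Return the derivatives directory for a pipeline (hypothetical helper).

    The directory name follows the RECOMMENDED `<pipeline>-<variant>` format
    when a variant is given, and is just `<pipeline>` otherwise.
    """
    name = f"{pipeline}-{variant}" if variant else pipeline
    return PurePosixPath(dataset_root) / "derivatives" / name


# A derivative dataset can itself hold derivatives, as in the nested example.
preproc = derivative_dir("<dataset>", "spm", "preproc")
stats = derivative_dir(preproc, "spm", "stats")
print(stats / "sub-0001")
# prints: <dataset>/derivatives/spm-preproc/derivatives/spm-stats/sub-0001
```

Because the helper accepts any root, the same call also covers the standalone case: pass the path of the independent derivative dataset instead of a directory inside the source dataset.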

diff --git a/src/derivatives/introduction.md b/src/derivatives/introduction.md
@@ -1,30 +1,32 @@
 # BIDS Derivatives
 
-Derivatives are outputs of common processing pipelines, capturing data and
-meta-data sufficient for a researcher to understand and (critically) reuse those
-outputs in subsequent processing.
+[Derivative datasets](../glossary.md#derivative-common_principles) are outputs of common processing pipelines,
+capturing data and meta-data sufficient for a researcher
+to understand and (critically) reuse those outputs in subsequent processing.
 Standardizing derivatives is motivated by use cases where formalized
 machine-readable access to processed data enables higher-level processing.
 
-The following sections cover additions to and divergences from "raw" BIDS.
-Raw data are data that have been curated into BIDS from a non-BIDS source.
-If a dataset is derived from at least one other valid BIDS dataset, then it is a derivative dataset.
+The following sections cover additions to and divergences from [raw BIDS datasets](../glossary.md#raw-common_principles).
 
-Examples:
+[Raw BIDS datasets](../glossary.md#raw-common_principles) are data that have been curated into BIDS from one or more non-BIDS sources.
+If a dataset is derived from at least one other valid BIDS dataset,
+then it is a [derivative dataset](../glossary.md#derivative-common_principles).
 
-A defaced T1w image would typically be made during the curation process and is thus under raw
+!!! example
 
-```Text
-sourcedata/private/sub-01/anat/sub-01_T1w.nii.gz
-sub-01/anat/sub-01_T1w.nii.gz
-```
+    A defaced T1w image would typically be made during the curation process and is thus under raw:
 
-A defaced T1w image could also, in theory, be derived from a BIDS dataset and would thus be under derivatives
+    ```Text
+    sourcedata/private/sub-01/anat/sub-01_T1w.nii.gz
+    sub-01/anat/sub-01_T1w.nii.gz
+    ```
 
-```Text
-sub-01/anat/sub-01_T1w.nii.gz
-derivatives/sub-01/anat/sub-01_desc-defaced_T1w.nii.gz
-```
+    A defaced T1w image could also, in theory, be derived from a BIDS dataset and would thus be under derivatives:
+
+    ```Text
+    sub-01/anat/sub-01_T1w.nii.gz
+    derivatives/sub-01/anat/sub-01_desc-defaced_T1w.nii.gz
+    ```
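
The distinction drawn in this example can be sketched as a lookup on the first path component. This is a simplified illustration that assumes paths are given relative to the dataset root. The function name `classify` is hypothetical, and real datasets may need more than this check (for example, derivatives stored as standalone datasets are not detected).

```python
from pathlib import PurePosixPath


def classify(relative_path: str) -> str:
    """Classify a dataset-relative path as 'source', 'derivative', or 'raw'.

    Assumption: the path is relative to the root of the BIDS dataset, so the
    first component tells us which part of the dataset the file lives in.
    """
    parts = PurePosixPath(relative_path).parts
    top = parts[0] if parts else ""
    if top == "sourcedata":
        return "source"
    if top == "derivatives":
        return "derivative"
    return "raw"


# The paths from the example above.
for path in [
    "sourcedata/private/sub-01/anat/sub-01_T1w.nii.gz",
    "sub-01/anat/sub-01_T1w.nii.gz",
    "derivatives/sub-01/anat/sub-01_desc-defaced_T1w.nii.gz",
]:
    print(f"{classify(path):<10} {path}")
```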
 
 ## Derivatives storage and directory structure
 

@@ -46,6 +46,9 @@ dataset:
   description: |
     A set of neuroimaging and behavioral data acquired for a purpose of a particular study.
     A dataset consists of data acquired from one or more subjects, possibly from multiple sessions.
+derivative:
+  display_name: derivative dataset
+  description: If a dataset is derived from at least one other valid BIDS dataset, then it is a derivative dataset.
-  description: If a dataset is derived from at least one other valid BIDS dataset, then it is a derivative dataset.
+  description: A BIDS dataset derived from at least one other BIDS dataset.
 deprecated:
   display_name: DEPRECATED
   description: |
@@ -97,6 +100,9 @@ modality:
     the technique is sufficiently uniform to define the modalities `eeg`, `meg` and `ieeg`.
     When applicable, the modality is indicated in the **suffix**.
     The modality may overlap with, but should not be confused with the **data type**.
+raw:
+  display_name: raw dataset
+  description: A raw BIDS dataset is data that have been curated into BIDS from a non-BIDS source.
-  description: A raw BIDS dataset is data that have been curated into BIDS from a non-BIDS source.
+  description: A BIDS dataset that has been curated into BIDS from one or more non-BIDS sources, for example data from acquisition hardware.
 run:
   display_name: Run
   description: |

@@ -16,3 +16,5 @@
 - suffix
 - extension
 - deprecated
+- raw
+- derivative