diff --git a/.yamllint.yml b/.yamllint.yml
index baed83012b..c18c17fe43 100644
--- a/.yamllint.yml
+++ b/.yamllint.yml
@@ -3,3 +3,6 @@ extends: default
 rules:
   line-length:
     max: 120
+  indentation:
+    # See https://github.com/yaml/pyyaml/issues/545 for why
+    indent-sequences: false
diff --git a/src/schema/README.md b/src/schema/README.md
index 10a177ae0d..948d687a52 100644
--- a/src/schema/README.md
+++ b/src/schema/README.md
@@ -9,57 +9,144 @@ Any changes to the specification should be mirrored in the schema.
 
 ## The format of the schema
 
-The schema reflects the files and objects in the specification, as well as
-associations between these objects. Here is a list of the files and subfolders
-of the schema, roughly in order of importance:
-
-- `datatypes/*.yaml`: Data types supported by the specification. Each datatype
-  may support many suffixes. These suffixes are divided into groups based on
-  what extensions and entities are allowed for each. Data types correspond to
-  subfolders (for example, `anat`, `func`) in the BIDS structure.
-
-- `entities.yaml`: A list of entities (key/value pairs in folder and
-  filenames) with associated descriptions and formatting rules. The order of
-  the entities in the file determines the order in which entities must appear
-  in filenames.
-
-- `top_level_files.yaml`: Modality-agnostic files stored at the top level of a
-  BIDS dataset. The schema specifies whether these files are required or
-  optional, as well as acceptable extensions for each.
-
-- `modalities.yaml`: Modalities supported by the specification, along with a
-  list of associated data types. Modalities are not reflected directly in the
-  BIDS structure, but data types are modality-specific.
-
-- `associated_data.yaml`: Folders that are commonly contained within the same
-  folder as a BIDS dataset, but which do not follow the BIDS structure
-  internally, such as `code` or `sourcedata`. The schema specifies which
-  folders are accepted and whether they are required or optional.
-
-- `suffixes/*.yaml`: Suffixes supported by the specification.
-  Each suffix schema file contains, at minimum, `name` and `description`
-  fields. Additionally, it may have a `unit` field defining possible
-  units for data associated with that suffix, as well as fields
-  defining the range or types of values which are allowed for the data,
-  such as `minValue` and `maxValue`.
-
-- `metadata/*.yaml`: Valid fields for sidecar metadata json files.
-  These files contain, at minimum, the following fields: `name`,
-  `description`, and a set of fields for describing the field's data type.
-
-  The data types include `type`, which MUST have a value of
-  `array`, `string`, `integer`, `number`, `object`, or `boolean`.
-  There are additional fields which may define rules that apply to a given
-  type.
-
-  - `array`: If `type` is `array`, then there MUST be an `items` field at
-    the same level as `type`. `items` describes the data type and rules that
-    apply to the individual items in the array. The same rules that apply to
-    describing data types for the field itself apply to `items`.
-    Additionally, there may be any of the following fields at the same level
-    as `type`: `minItems`, `maxItems`.
-    Here is an example of a field that MUST have three `integer` items:
-    ```yaml
+The schema is divided into two parts: the object definitions and the rules.
+
+The object definitions (files in `objects/`) describe attributes of individual
+objects or data types in the specification.
+Common information in these files includes full names, descriptions, and
+constraints on valid values.
+These files **do not** describe how objects of different types
+(for example file suffixes and file entities) interact with one another, or
+whether objects are required in a given dataset or file.
+
+The rules (files in `rules/`) describe how objects relate to one another,
+as well as their requirement levels.
+
+## Object files
+
+The types of objects currently supported in the schema are:
+
+- modalities,
+- datatypes,
+- entities,
+- suffixes,
+- metadata,
+- top-level files,
+- and non-BIDS associated folders.
+
+Each of these object types has a single file in the `objects/` folder.
+
+- `modalities.yaml`: The modalities, or types of technology, used to acquire data in a BIDS dataset.
+  These modalities are not reflected directly in the specification.
+  For example, while both fMRI and DWI data are acquired with an MRI,
+  in a BIDS dataset they are stored in different folders reflecting the two different `datatypes`.
+
+- `datatypes.yaml`: Data types supported by the specification.
+  The only information provided in the file is:
+
+  1. a full list of valid BIDS datatypes
+  1. each datatype's full name
+  1. a free text description of the datatype.
+
+- `entities.yaml`: Entities (key/value pairs in folder and filenames).
+
+- `metadata.yaml`: All valid metadata fields that are explicitly supported in BIDS sidecar JSON files.
+
+- `suffixes.yaml`: Valid file suffixes.
+
+- `top_level_files.yaml`: Valid top-level files which may appear in a BIDS dataset.
+
+- `associated_data.yaml`: Folders that may appear within a dataset folder without following BIDS rules.
+
+### `modalities.yaml`
+
+This file contains a dictionary in which each modality is defined.
+Keys are modality abbreviations (for example, `mri` for magnetic resonance imaging),
+and each associated value is a dictionary with two keys: `name` and `description`.
+
+The `name` field is the full name of the modality.
+The `description` field is a freeform description of the modality.
+
+### `datatypes.yaml`
+
+This file contains a dictionary in which each datatype is defined.
+Keys are the folder names associated with each datatype (for example, `anat` for anatomical MRI),
+and each associated value is a dictionary with two keys: `name` and `description`.
+
+The `name` field is the full name of the datatype.
+The `description` field is a freeform description of the datatype.
+
+### `entities.yaml`
+
+This file contains a dictionary in which each entity (key/value pair in filenames) is defined.
+Keys are long-form versions of the entities, which are distinct from both the entities as
+they appear in filenames _and_ their full names.
+For example, the key for the "Contrast Enhancing Agent" entity, which appears in filenames as `ce-