diff --git a/Release_Protocol.md b/Release_Protocol.md index f3163cf403..c93c039a48 100644 --- a/Release_Protocol.md +++ b/Release_Protocol.md @@ -278,5 +278,5 @@ Update the following files in the BIDS website repository (https://github.com/bi ### 12. Sharing news of the release -Please share news of the release on the [identified platforms](https://docs.google.com/spreadsheets/d/16SAGK3zG93WM2EWuoZDcRIC7ygPc5b7PDNGpFyC3obA/edit#gid=0). +Please share news of the release on the [identified platforms](https://github.com/bids-standard/bids-specification?tab=readme-ov-file#BIDS-communication-channels). Please use our previous release posts as a guide. diff --git a/pdf_build_src/remove_admonitions.py b/pdf_build_src/remove_admonitions.py index ddd2b636a1..1942f8b5b0 100644 --- a/pdf_build_src/remove_admonitions.py +++ b/pdf_build_src/remove_admonitions.py @@ -47,7 +47,7 @@ def remove_admonitions( counter += 1 continue - if not line.startswith(indent): + if line != "\n" and not line.startswith(indent): is_admonition = False if is_admonition: diff --git a/pdf_build_src/tests/data/expected/README.md b/pdf_build_src/tests/data/expected/README.md index e719b45f38..37fadc295d 100644 --- a/pdf_build_src/tests/data/expected/README.md +++ b/pdf_build_src/tests/data/expected/README.md @@ -19,3 +19,14 @@ Collapsible admonitions start with 3 questions marks (`???`). Collapsible admonitions that will be shown as expanded start with 3 questions marks and a plus sign (`???+`). + + + +Let's see + +- [`UK biobank`](https://github.com/bids-standard/bids-examples/tree/master/genetics_ukbb) +- foo bar [`UK biobank`](https://github.com/bids-standard/bids-examples/tree/master/genetics_ukbb) + +More of the admonition + +And here we resume normal thing. 
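Reviewer note on the `remove_admonitions.py` hunk above: the added `line != "\n"` check keeps a blank line inside an admonition body from ending the admonition. The sketch below illustrates that rule in isolation. It is not the repository's implementation: `dedent_admonition`, its parameters, and the sample lines are hypothetical, and it assumes a four-space indent and headers starting with `!!!`.

```python
def dedent_admonition(lines, indent="    "):
    """Hypothetical helper: drop `!!!` header lines and de-indent their bodies."""
    out = []
    is_admonition = False
    for line in lines:
        if line.startswith("!!!"):
            is_admonition = True
            continue  # the header line itself is not copied to the output

        # A blank line does not start with the indent, but it must not end
        # the admonition: paragraphs and lists in the body are separated
        # by blank lines.
        if line != "\n" and not line.startswith(indent):
            is_admonition = False

        if is_admonition and line.startswith(indent):
            line = line[len(indent):]
        out.append(line)
    return out


# Made-up sample mirroring the shape of the new test fixture.
source = [
    '!!! example "title"\n',
    "\n",
    "    Let's see\n",
    "\n",
    "    - item\n",
    "\n",
    "And here we resume.\n",
]
result = dedent_admonition(source)
```

Without the blank-line check, the first empty line after the header would reset `is_admonition`, and the indented list items would be left indented in the output.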
diff --git a/pdf_build_src/tests/data/input/README.md b/pdf_build_src/tests/data/input/README.md index 102f731fc6..cab4817282 100644 --- a/pdf_build_src/tests/data/input/README.md +++ b/pdf_build_src/tests/data/input/README.md @@ -25,3 +25,16 @@ come in different type. In aaddtion of the classical admonitions show above you Collapsible admonitions that will be shown as expanded start with 3 questions marks and a plus sign (`???+`). + + + +!!! example "non ordered list should be handle propeler" + + Let's see + + - [`UK biobank`](https://github.com/bids-standard/bids-examples/tree/master/genetics_ukbb) + - foo bar [`UK biobank`](https://github.com/bids-standard/bids-examples/tree/master/genetics_ukbb) + + More of the admonition + +And here we resume normal thing. diff --git a/src/common-principles.md b/src/common-principles.md index f56f420c08..461842ddfc 100644 --- a/src/common-principles.md +++ b/src/common-principles.md @@ -238,10 +238,14 @@ distinguish partial results from the raw data and share the latter. See [Storage of derived datasets](#storage-of-derived-datasets) for more on organizing derivatives. -Similar rules apply to source data, which is defined as data before -harmonization, reconstruction, and/or file format conversion (for example, E-Prime event logs or DICOM files). -Storing actual source files with the data is preferred over links to -external source repositories to maximize long term preservation, +Similar rules apply to source data, which is defined as data +before harmonization, reconstruction, and/or file format conversion +(for example, E-Prime event logs or DICOM files). +Retaining the source data is especially valuable +in cases where conversion fails to preserve crucial metadata +unique to a specific acquisition setup. +Storing actual source files with the data is preferred over links +to external source repositories to maximize long term preservation, which would suffer if an external repository would not be available anymore. 
This specification currently does not go into the details of recommending a particular naming scheme for including different types of @@ -426,36 +430,54 @@ NIfTI header. ### Tabular files -Tabular data MUST be saved as tab delimited values (`.tsv`) files, that is, CSV -files where commas are replaced by tabs. Tabs MUST be true tab characters and -MUST NOT be a series of space characters. Each TSV file MUST start with a header -line listing the names of all columns (with the exception of -[physiological and other continuous recordings](modality-specific-files/physiological-and-other-continuous-recordings.md) -as well as [motion recording data](modality-specific-files/motion.md)). +Tabular data MUST be saved as plain-text, tab-delimited values (TSV) files +(with [extension `.tsv`](glossary.md#tsv-extensions)), +that is, [CSV files](https://en.wikipedia.org/wiki/Comma-separated_values) where commas are replaced by tab characters. +Tabs MUST be true tab characters and MUST NOT be a series of space characters. +Tabular data such as continuous physiology recordings typically containing +large numbers of rows MAY be saved as +[compressed tabular files (with extension `.tsv.gz`)](#compressed-tabular-files), +which are introduced below. +Plain-text TSV and compressed TSV are not interchangeable, that is, each section +of the specification prescribes which one MUST be used for the data type at +hand. +Each TSV file MUST start with a header line listing the names of all columns +with two exceptions: + +1. [compressed tabular files](#compressed-tabular-files), + for which column names are defined in a sidecar metadata + [JSON object](https://www.json.org/json-en.html) described below; and +1. [motion recording data](modality-specific-files/motion.md), + which use plain-text TSV and columns are defined as described + in its corresponding section of the specifications. 
+ It is RECOMMENDED that the column names in the header of the TSV file are written in [`snake_case`](https://en.wikipedia.org/wiki/Snake_case) with the first letter in lower case (for example, `variable_name`, not `Variable_name`). -As for all other data in the TSV files, column names MUST be separated with tabs. +Column names defined in the header MUST be separated with tabs as for the data contents. Furthermore, column names MUST NOT be blank (that is, an empty string) and MUST NOT be duplicated within a single TSV file. -String values containing tabs MUST be escaped using double -quotes. Missing and non-applicable values MUST be coded as `n/a`. Numerical -values MUST employ the dot (`.`) as decimal separator and MAY be specified +String values containing tabs MUST be escaped using double quotes. +Missing and non-applicable values MUST be coded as `n/a`. +Numerical values MUST employ the dot (`.`) as decimal separator and MAY be specified in scientific notation, using `e` or `E` to separate the significand from the -exponent. TSV files MUST be in UTF-8 encoding. +exponent. +TSV files MUST be in UTF-8 encoding. Example: ```Text -onset duration response_time correct stop_trial go_trial -200 200 0 n/a n/a n/a +onset duration response_time trial_type trial_extra +200 20.0 15.8 word 中国人 +240 5.0 17.34e-1 visual n/a ``` -**Note**: The TSV examples in this document (like the one above this note) -are occasionally formatted using space characters instead of tabs to improve -human readability. -Directly copying and then pasting these examples from the specification -for use in new BIDS datasets can lead to errors and is discouraged. +!!! warning "Attention" + + The TSV examples in this document (like the one above this note) are occasionally + formatted using space characters instead of tabs to improve human readability. + Directly copying and then pasting these examples from the specification + for use in new BIDS datasets can lead to errors and is discouraged. 
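Reviewer note on the plain-text TSV rules in this hunk: the sketch below reads a TSV string using only the Python standard library and checks the stated constraints (tab delimiter, header line, no blank or duplicated column names, `n/a` for missing values). The function name `read_bids_tsv` is hypothetical and not part of any BIDS tooling. The sample text reuses the example rows from the hunk, with real tab characters.

```python
import csv
import io


def read_bids_tsv(text):
    """Hypothetical reader: return (header, rows) for a plain-text TSV string."""
    # Tab-delimited; double quotes escape values that contain tabs.
    reader = csv.reader(io.StringIO(text), delimiter="\t", quotechar='"')
    header = next(reader)
    if any(name == "" for name in header):
        raise ValueError("column names must not be blank")
    if len(set(header)) != len(header):
        raise ValueError("column names must not be duplicated")

    rows = []
    for values in reader:
        # Missing and non-applicable values are coded as `n/a`.
        rows.append(
            {name: (None if value == "n/a" else value)
             for name, value in zip(header, values)}
        )
    return header, rows


sample = (
    "onset\tduration\tresponse_time\ttrial_type\ttrial_extra\n"
    "200\t20.0\t15.8\tword\t中国人\n"
    "240\t5.0\t17.34e-1\tvisual\tn/a\n"
)
header, rows = read_bids_tsv(sample)

# Values stay strings; scientific notation with `e` or `E` parses with float().
response_time = float(rows[1]["response_time"])
```

Values are left as strings here because the column type depends on the data dictionary or on the relevant section of the specification; conversion is applied by the caller where a number is expected.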
Tabular files MAY be optionally accompanied by a simple data dictionary in the form of a JSON [object](https://www.json.org/json-en.html) @@ -532,12 +554,38 @@ like in the example below. "F": { "Description": "Female", "TermURL": "https://www.ncbi.nlm.nih.gov/mesh/68005260" - }, + } } } } ``` +### Compressed tabular files + +Large tabular information, such as physiological recordings, MUST be stored with +[compressed tab-delimited (TSV.GZ) files](glossary.md#tsvgz-extensions) when +so established by the specifications. +Rules for formatting plain-text tabular files apply to TSVGZ files with three exceptions: + +1. The contents of TSVGZ files MUST be compressed with + [gzip](https://datatracker.ietf.org/doc/html/rfc1952). +1. Compressed tabular files MUST NOT contain a header in the first row + indicating the column names. +1. TSVGZ files MUST have an associated JSON file that defines the columns in the tabular file. + +!!! warning "Attention" + + In contrast to plain-text TSV files, + compressed tabular files MUST NOT include a header line. + Column names MUST be provided in the JSON file with the + [`Columns`](glossary.md#columns-metadata) field. + Each column MAY additionally be described with a column description, + as described in [Tabular files](#tabular-files). + + TSVGZ files are header-less to improve compatibility with existing software + (for example, FSL or PNM), and to facilitate the support for other file formats + in the future. + ### Key-value files (dictionaries) JavaScript Object Notation (JSON) files MUST be used for storing key-value diff --git a/src/modality-specific-files/electroencephalography.md b/src/modality-specific-files/electroencephalography.md index 62d215e038..6ba02cc13a 100644 --- a/src/modality-specific-files/electroencephalography.md +++ b/src/modality-specific-files/electroencephalography.md @@ -42,14 +42,10 @@ It is RECOMMENDED to use the European data format, or the BrainVision data format. 
It is furthermore discouraged to use the other accepted formats over these RECOMMENDED formats, particularly because there are conversion scripts available in most commonly used programming languages to convert data into the -RECOMMENDED formats. The data in their original format, if different from the -supported formats, can be stored in the [`/sourcedata` directory](../common-principles.md#source-vs-raw-vs-derived-data). - -The original data format is especially valuable in case conversion elicits the -loss of crucial metadata specific to manufacturers and specific EEG systems. We -also encourage users to provide additional meta information extracted from the -manufacturer specific data files in the sidecar JSON file. Other relevant files -MAY be included alongside the original EEG data in `/sourcedata`. +RECOMMENDED formats. + +We encourage users to provide additional metadata extracted from the +manufacturer-specific data files in the sidecar JSON file. Note the `RecordingType`, which depends on whether the data stream on disk is interrupted or not. diff --git a/src/modality-specific-files/intracranial-electroencephalography.md b/src/modality-specific-files/intracranial-electroencephalography.md index 02fa4a189a..ae6a669a84 100644 --- a/src/modality-specific-files/intracranial-electroencephalography.md +++ b/src/modality-specific-files/intracranial-electroencephalography.md @@ -53,12 +53,8 @@ packages. Other formats that may be considered in the future should have a clear added advantage over the existing formats and should have wide adoption in the BIDS community. -The data format in which the data was originally stored is especially valuable -in case conversion elicits the loss of crucial metadata specific to -manufacturers and specific iEEG systems. We also encourage users to provide -additional meta information extracted from the manufacturer-specific data files -in the sidecar JSON file. 
Other relevant files MAY be included alongside the -original iEEG data in the [`/sourcedata` directory](../common-principles.md#source-vs-raw-vs-derived-data). +We encourage users to provide additional metadata extracted from the +manufacturer-specific data files in the sidecar JSON file. Note the RecordingType, which depends on whether the data stream on disk is interrupted or not. Continuous data is by definition 1 segment without interruption. diff --git a/src/modality-specific-files/microscopy.md b/src/modality-specific-files/microscopy.md index 2797781fa6..1b813f9eb2 100644 --- a/src/modality-specific-files/microscopy.md +++ b/src/modality-specific-files/microscopy.md @@ -59,9 +59,6 @@ Microscopy raw data MUST be stored in one of the following formats: - [OME-ZARR/NGFF](https://ngff.openmicroscopy.org/latest/) (`.ome.zarr` directories) -If different from PNG, TIFF, OME-TIFF, or OME-ZARR, the original unprocessed data in the native format MAY be -stored in the [`/sourcedata` directory](../common-principles.md#source-vs-raw-vs-derived-data). - ### Modality suffixes Microscopy data currently support the following imaging modalities: diff --git a/src/modality-specific-files/motion.md b/src/modality-specific-files/motion.md index 38ff4449cd..f0bbcb96ef 100644 --- a/src/modality-specific-files/motion.md +++ b/src/modality-specific-files/motion.md @@ -50,18 +50,14 @@ The number of columns in `_motion.tsv` files MUST equal the number of rows in the associated `_channels.tsv` file. All relevant metadata about a tracking systems is stored in accompanying sidecar `*_tracksys-