[SCHEMA] Add TSV column files (#827)
* Add template.

* Add first column.

* Add columns.

* More terms.

* Fill name field in all files.

* More work.

Note that there are three very different uses of "name" columns, and two of them are equally common, so I chose not to specify any of them as the "canonical" definition.

* Add remaining definitions.

* Add macro to render column tables.

* Fix YAML file.

* Consolidate suffixes file.

* Remove old individual files.

* Move columns file.

* Fix things up a bit.

* Add columns I missed for modality-agnostic TSV files.

* Support n/a for duration.

* Apply suggestions from code review

Co-authored-by: Chris Markiewicz <[email protected]>
Co-authored-by: Stefan Appelhoff <[email protected]>

* Code formatting in stim_file definition.

* Allow numbers and strings for value.

* Update src/schema/objects/columns.yaml

Co-authored-by: Stefan Appelhoff <[email protected]>

* Allow n/a for "z" column.

Addresses https://github.com/bids-standard/bids-specification/pull/827/files#r723280787.

* Describe meanings of x, y, and z columns.

Addresses https://github.com/bids-standard/bids-specification/pull/827/files#r723283314.

* Allow n/a for status column.

Addresses https://github.com/bids-standard/bids-specification/pull/827/files#r723269382.

* Add participant_id to participants.tsv table and append info for other IDs.

* Split type definitions into channels and electrodes versions.

* Update definitions for group based on file type.

* Split reference column definition.

* Clean up name_channels definition.

* Draft new columns from #816

* Add new columns to table.

* Remove list items.

* Update src/04-modality-specific-files/04-intracranial-electroencephalography.md

Co-authored-by: Stefan Appelhoff <[email protected]>

* Apply suggestions from code review

Co-authored-by: Chris Markiewicz <[email protected]>

* Use two underscores to delineate multiply-defined columns.

* Remove text that is now in table.

* Update src/schema/objects/columns.yaml

Co-authored-by: Chris Markiewicz <[email protected]>

* Add sections to README on columns file and on reused terms.

* Add EDF info to acq_time definition.

* Remove hardcoded tables.

* Remove unused links.

Co-authored-by: Chris Markiewicz <[email protected]>
Co-authored-by: Stefan Appelhoff <[email protected]>
3 people authored Nov 9, 2021
1 parent 18e2057 commit 5400d6f
Showing 13 changed files with 898 additions and 191 deletions.
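
The diffs below replace hand-written Markdown tables with calls to `MACROS___make_columns_table`, keyed by entries in `src/schema/objects/columns.yaml`. As a rough illustration of the idea only (not the repository's actual implementation), such a macro might look like the sketch below. The `COLUMN_DEFINITIONS` dictionary, its description strings, and the function body are assumptions made for this example; only the call shape (a key mapped to a requirement level, or to a `(level, extra text)` tuple) and the double-underscore key convention come from the commit itself.

```python
from typing import Dict, Tuple, Union

# Hypothetical stand-in for the contents of columns.yaml.
# Keys with a double underscore (e.g. "name__channels") denote columns whose
# definition depends on the file type; the "name" field holds the header that
# actually appears in the TSV file. Descriptions here are placeholders.
COLUMN_DEFINITIONS = {
    "name__channels": {"name": "name", "description": "Label of the channel."},
    "name__electrodes": {"name": "name", "description": "Name of the electrode."},
    "units": {"name": "units", "description": "Physical unit of the recorded value."},
}

# A requirement is either a level ("REQUIRED") or (level, extra description).
Requirement = Union[str, Tuple[str, str]]


def make_columns_table(column_info: Dict[str, Requirement]) -> str:
    """Render a Markdown table for the requested columns (illustrative only)."""
    rows = [
        "| **Column name** | **Requirement level** | **Description** |",
        "| --- | --- | --- |",
    ]
    for key, requirement in column_info.items():
        if isinstance(requirement, tuple):
            level, extra = requirement
        else:
            level, extra = requirement, ""
        definition = COLUMN_DEFINITIONS[key]
        # Append the page-specific text, if any, to the shared definition.
        description = " ".join(
            part for part in (definition["description"], extra) if part
        )
        rows.append(f"| {definition['name']} | {level} | {description} |")
    return "\n".join(rows)


if __name__ == "__main__":
    print(make_columns_table({"name__channels": "REQUIRED", "units": "REQUIRED"}))
```

Under this sketch, `name__channels` and `name__electrodes` both render as a column called `name`, each with its own description, which is the purpose of the two-underscore convention mentioned in the commit message.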
109 changes: 35 additions & 74 deletions src/03-modality-agnostic-files.md
@@ -179,40 +179,21 @@ For backwards compatibility, if `species` is absent, the participant is assumed
`homo sapiens`.

Commonly used *optional* columns in `participants.tsv` files are `age`, `sex`,
`handedness`, `strain` and `strain_rrid`. We RECOMMEND to make use
`handedness`, `strain`, and `strain_rrid`. We RECOMMEND to make use
of these columns, and in case that you do use them, we RECOMMEND to use the
following values for them:

- `age`: numeric value in years (float or integer value)

- `sex`: string value indicating phenotypical sex, one of "male", "female",
"other"

- for "male", use one of these values: `male`, `m`, `M`, `MALE`, `Male`

- for "female", use one of these values: `female`, `f`, `F`, `FEMALE`,
`Female`

- for "other", use one of these values: `other`, `o`, `O`, `OTHER`,
`Other`

- `handedness`: string value indicating one of "left", "right",
"ambidextrous"

- for "left", use one of these values: `left`, `l`, `L`, `LEFT`, `Left`

- for "right", use one of these values: `right`, `r`, `R`, `RIGHT`,
`Right`

- for "ambidextrous", use one of these values: `ambidextrous`, `a`, `A`,
`AMBIDEXTROUS`, `Ambidextrous`

- `strain`: for species different from `homo sapiens`, string value indicating
the strain of the species, for example: `C57BL/6J`.

- `strain_rrid`: for species different from `homo sapiens`, research resource identifier
([RRID](https://scicrunch.org/resources/Organisms/search)) of the strain of the species,
for example: `RRID:IMSR_JAX:000664`.
{{ MACROS___make_columns_table(
{
"participant_id": ("REQUIRED", "There MUST be exactly one row for each participant."),
"species": "RECOMMENDED",
"age": "RECOMMENDED",
"sex": "RECOMMENDED",
"handedness": "RECOMMENDED",
"strain": "RECOMMENDED",
"strain_rrid": "RECOMMENDED",
}
) }}

Throughout BIDS you can indicate missing values with `n/a` (for "not
available").
@@ -279,32 +260,17 @@ samples.json

The purpose of this file is to describe properties of samples, indicated by the `sample` entity.
This file is REQUIRED if `sample-<label>` is present in any file name within the dataset.
If this file exists, it MUST contain the three following columns:

- `sample_id`: MUST consist of `sample-<label>` values identifying one row
for each sample

- `participant_id`: MUST consist of `sub-<label>`

- `sample_type`: MUST consist of sample type values, either `cell line`, `in vitro differentiated cells`,
`primary cell`, `cell-free sample`, `cloning host`, `tissue`, `whole organisms`, `organoid` or
`technical sample` from [ENCODE Biosample Type](https://www.encodeproject.org/profiles/biosample_type)

Other optional columns MAY be used to describe the samples.
Each sample MUST be described by one and only one row.

Commonly used *optional* columns in `samples.tsv` files are `pathology` and
`derived_from`. We RECOMMEND to make use of these columns, and in case that
you do use them, we RECOMMEND to use the following values for them:

- `pathology`: string value describing the pathology of the sample or type of control.
When different from `healthy`, pathology SHOULD be specified in `samples.tsv`.
The pathology MAY instead be specified in [Sessions files](./03-modality-agnostic-files.md#sessions-file)
in case it changes over time.

- `derived_from`: `sample-<label>` key/value pair from which a sample is derived from,
for example a slice of tissue (`sample-02`) derived from a block of tissue (`sample-01`),
as illustrated in the example below.
{{ MACROS___make_columns_table(
{
"sample_id": ("REQUIRED", "The combination of `sample_id` and `participant_id` MUST be unique."),
"participant_id": ("REQUIRED", "The combination of `sample_id` and `participant_id` MUST be unique."),
"sample_type": "REQUIRED",
"pathology": "RECOMMENDED",
"derived_from": "RECOMMENDED",
}
) }}

`samples.tsv` example:

@@ -432,25 +398,12 @@ Some recordings consist of multiple parts, that span several files,
for example through `echo-`, `part-`, or `split-` entities.
Such recordings MUST be documented with one row per file.

Relative paths to files should be used under a compulsory `filename` header.

If acquisition time is included it should be listed under the `acq_time` header.
Acquisition time refers to when the first data point in each run was acquired.
Furthermore, if this header is provided, the acquisition times of all files that
belong to a recording MUST be identical.

Datetime should be expressed as described in [Units](./02-common-principles.md#units).

For anonymization purposes all dates within one subject should be shifted by a
randomly chosen (but consistent across all recordings) number of days.
This way relative timing would be preserved, but chances of identifying a
person based on the date and time of their scan would be decreased.
Dates that are shifted for anonymization purposes SHOULD be set to the year 1925
or earlier to clearly distinguish them from unmodified data.
Note that some data formats do not support arbitrary recording dates.
For example, the [EDF](https://www.edfplus.info/)
data format can only contain recording dates after 1985.
Shifting dates is RECOMMENDED, but not required.
{{ MACROS___make_columns_table(
{
"filename": ("REQUIRED", "There MUST be exactly one row for each file."),
"acq_time": "OPTIONAL",
}
) }}

Additional fields can include external behavioral measures relevant to the
scan.
@@ -488,6 +441,14 @@ These files MUST include a `session_id` column and describe each session by one
Column names in `sessions.tsv` files MUST be different from group level participant key column names in the
[`participants.tsv` file](./03-modality-agnostic-files.md#participants-file).

{{ MACROS___make_columns_table(
{
"session_id": ("REQUIRED", "There MUST be exactly one row for each session."),
"acq_time": "OPTIONAL",
"pathology": "RECOMMENDED",
}
) }}

`_sessions.tsv` example:

```Text
34 changes: 19 additions & 15 deletions src/04-modality-specific-files/02-magnetoencephalography.md
@@ -221,24 +221,28 @@ The columns of the Channels description table stored in `*_channels.tsv` are:

MUST be present **in this specific order**:

| **Column name** | **Requirement level** | **Description** |
| --------------- | --------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| name | REQUIRED | Channel name (for example, MRT012, MEG023). |
| type | REQUIRED | Type of channel; MUST use the channel types listed below. Note that the type MUST be in upper-case. |
| units | REQUIRED | Physical unit of the value represented in this channel, for example, `V` for Volt, or `fT/cm` for femto Tesla per centimeter (see [Units](../02-common-principles.md#units)). |
{{ MACROS___make_columns_table(
{
"name__channels": "REQUIRED",
"type__channels": "REQUIRED",
"units": "REQUIRED",
}
) }}

SHOULD be present:

| **Column name** | **Requirement level** | **Description** |
| ------------------ | --------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| description | OPTIONAL | Brief free-text description of the channel, or other information of interest. See examples below. |
| sampling_frequency | OPTIONAL | Sampling rate of the channel in Hz. |
| low_cutoff | OPTIONAL | Frequencies used for the high-pass filter applied to the channel in Hz. If no high-pass filter applied, use `n/a`. |
| high_cutoff | OPTIONAL | Frequencies used for the low-pass filter applied to the channel in Hz. If no low-pass filter applied, use `n/a`. Note that hardware anti-aliasing in A/D conversion of all MEG/EEG electronics applies a low-pass filter; specify its frequency here if applicable. |
| notch | OPTIONAL | Frequencies used for the notch filter applied to the channel, in Hz. If no notch filter applied, use `n/a`. |
| software_filters | OPTIONAL | List of temporal and/or spatial software filters applied (for example, "SSS", `"SpatialCompensation"`). Note that parameters should be defined in the general MEG sidecar .json file. Indicate `n/a` in the absence of software filters applied. |
| status | OPTIONAL | Data quality observed on the channel `(good/bad)`. A channel is considered `bad` if its data quality is compromised by excessive noise. Description of noise type SHOULD be provided in `[status_description]`. |
| status_description | OPTIONAL | Freeform text description of noise or artifact affecting data quality on the channel. It is meant to explain why the channel was declared bad in `[status]`. |
{{ MACROS___make_columns_table(
{
"description": "OPTIONAL",
"sampling_frequency": "OPTIONAL",
"low_cutoff": "OPTIONAL",
"high_cutoff": "OPTIONAL",
"notch": "OPTIONAL",
"software_filters": "OPTIONAL",
"status": "OPTIONAL",
"status_description": "OPTIONAL",
}
) }}

Example:

60 changes: 34 additions & 26 deletions src/04-modality-specific-files/03-electroencephalography.md
@@ -216,24 +216,28 @@ The columns of the Channels description table stored in `*_channels.tsv` are:

MUST be present **in this specific order**:

| **Column name** | **Requirement level** | **Description** |
| --------------- | --------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| name | REQUIRED | Channel name (for example, FC1, Cz) |
| type | REQUIRED | Type of channel; MUST use the channel types listed below. Note that the type MUST be in upper-case. |
| units | REQUIRED | Physical unit of the value represented in this channel, for example, `V` for Volt, or `fT/cm` for femto Tesla per centimeter (see [Units](../02-common-principles.md#units)). |
{{ MACROS___make_columns_table(
{
"name__channels": "REQUIRED",
"type__channels": "REQUIRED",
"units": "REQUIRED",
}
) }}

SHOULD be present:

| **Column name** | **Requirement level** | **Description** |
| ------------------ | --------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| description | OPTIONAL | Free-form text description of the channel, or other information of interest. See examples below. |
| sampling_frequency | OPTIONAL | Sampling rate of the channel in Hz. |
| reference | OPTIONAL | Name of the reference electrode(s) (not needed when it is common to all channels, in that case it can be specified in `*_eeg.json` as `EEGReference`). |
| low_cutoff | OPTIONAL | Frequencies used for the high-pass filter applied to the channel in Hz. If no high-pass filter applied, use `n/a`. |
| high_cutoff | OPTIONAL | Frequencies used for the low-pass filter applied to the channel in Hz. If no low-pass filter applied, use `n/a`. Note that hardware anti-aliasing in A/D conversion of all EEG electronics applies a low-pass filter; specify its frequency here if applicable. |
| notch | OPTIONAL | Frequencies used for the notch filter applied to the channel, in Hz. If no notch filter applied, use `n/a`. |
| status | OPTIONAL | Data quality observed on the channel (`good`, `bad`). A channel is considered `bad` if its data quality is compromised by excessive noise. Description of noise type SHOULD be provided in `status_description`. |
| status_description | OPTIONAL | Free-form text description of noise or artifact affecting data quality on the channel. It is meant to explain why the channel was declared bad in `status`. |
{{ MACROS___make_columns_table(
{
"description": "OPTIONAL",
"sampling_frequency": "OPTIONAL",
"reference__eeg": "OPTIONAL",
"low_cutoff": "OPTIONAL",
"high_cutoff": "OPTIONAL",
"notch": "OPTIONAL",
"status": "OPTIONAL",
"status_description": "OPTIONAL",
}
) }}

Restricted keyword list for field `type` in alphabetic order (shared with the
MEG and iEEG modality; however, only the types that are common in EEG data are listed here).
@@ -288,20 +292,24 @@ file MUST be specified as well**. The order of the required columns in the

MUST be present **in this specific order**:

| **Column name** | **Requirement level** | **Description** |
| --------------- | --------------------- | ----------------------------------- |
| name | REQUIRED | Name of the electrode. |
| x | REQUIRED | Recorded position along the x-axis. |
| y | REQUIRED | Recorded position along the y-axis. |
| z | REQUIRED | Recorded position along the z-axis. |
{{ MACROS___make_columns_table(
{
"name__electrodes": "REQUIRED",
"x": "REQUIRED",
"y": "REQUIRED",
"z": "REQUIRED",
}
) }}

SHOULD be present:

| **Column name** | **Requirement level** | **Description** |
| --------------- | --------------------- | ---------------------------------------------------------------------- |
| type | RECOMMENDED | Type of the electrode (for example, cup, ring, clip-on, wire, needle). |
| material | RECOMMENDED | Material of the electrode (for example, Tin, Ag/AgCl, Gold). |
| impedance | RECOMMENDED | Impedance of the electrode, units MUST be in `kOhm`. |
{{ MACROS___make_columns_table(
{
"type__electrodes": "RECOMMENDED",
"material": "RECOMMENDED",
"impedance": "RECOMMENDED",
}
) }}

### Example `electrodes.tsv`
