From 2fc9888d4bbf9b7766a544e1decb7189a3f2621e Mon Sep 17 00:00:00 2001 From: Yaroslav Halchenko Date: Thu, 3 Oct 2024 14:30:49 -0400 Subject: [PATCH 1/8] Move description of n/a above "String values" to avoid any association - It somewhat addresses concern/discussion in https://github.com/bids-standard/bids-specification/issues/1938 --- src/common-principles.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/common-principles.md b/src/common-principles.md index 0a42e4a885..e885cdb8c4 100644 --- a/src/common-principles.md +++ b/src/common-principles.md @@ -489,8 +489,8 @@ first letter in lower case (for example, `variable_name`, not `Variable_name`). Column names defined in the header MUST be separated with tabs as for the data contents. Furthermore, column names MUST NOT be blank (that is, an empty string) and MUST NOT be duplicated within a single TSV file. -String values containing tabs MUST be escaped using double quotes. Missing and non-applicable values MUST be coded as `n/a`. +String values containing tabs MUST be escaped using double quotes. Numerical values MUST employ the dot (`.`) as decimal separator and MAY be specified in scientific notation, using `e` or `E` to separate the significand from the exponent. From add68fce7f0a22ac4f6dec192f105a1c1f05fe55 Mon Sep 17 00:00:00 2001 From: Yaroslav Halchenko Date: Tue, 26 Nov 2024 14:55:42 -0500 Subject: [PATCH 2/8] Do explicitly "allow" for having dotfiles and explicitly ignore them MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Up to this point it was up for a validator to "code" that up. E.g. current new deno based validator in https://github.com/bids-standard/bids-validator/blob/main/src/files/ignore.ts has const defaultIgnores = [ '.git**', '.*', 'sourcedata/', ... thus ignores all dotfiles from validation. But this is inconsistent e.g. 
with the treatment in the schematools shipped in bids-specification, where files in a dotdirectory were reported as "invalid":

❯ cat /tmp/input.txt | python -m bidsschematools pre-receive-hook .datalad/config .gitattributes sub-01/anat/sub-01_unknown.nii.gz

As demonstrated above, there can be legitimate "system" dotfiles (VCS-related, etc.) in the folder containing a BIDS dataset, so I think we should **explicitly** describe the position of BIDS in relation to dotfiles. As that position is pretty much "ignore", that is what this PR is intended to state.

In this PR I also adjusted `pre_receive_hook` to ignore dotfiles. `find_files` was already ignoring them.
---
src/common-principles.md | 5 +++++
tools/schemacode/src/bidsschematools/__main__.py | 4 +++-
2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/src/common-principles.md b/src/common-principles.md
index 4e5509ac7b..454ecb2bad 100644
--- a/src/common-principles.md
+++ b/src/common-principles.md
@@ -202,6 +202,11 @@ as the labels would collide on a case-insensitive filesystem.
 Additionally, because the suffix `eeg` is defined, then the suffix `EEG` will not be added to future versions of the standard.
+### Dotfiles
+
+Files and directories starting with a dot (`.`) are reserved for system use and no valid recognized BIDS file or directory can start with a `.`.
+Any file or directory starting with a `.` present in a BIDS dataset is considered hidden and not subject to BIDS validation.
+ ## Uniqueness of data files Data files MUST be uniquely identified by BIDS path components diff --git a/tools/schemacode/src/bidsschematools/__main__.py b/tools/schemacode/src/bidsschematools/__main__.py index 777bea2d87..5f86cacad4 100644 --- a/tools/schemacode/src/bidsschematools/__main__.py +++ b/tools/schemacode/src/bidsschematools/__main__.py @@ -169,7 +169,9 @@ def pre_receive_hook(schema, input_, output): logger.debug("Validating files, first file: %s", filename) any_files = True filename = filename.strip() - if any(_bidsignore_check(pattern, filename, "") for pattern in ignore): + if filename.startswith(".") or any( + _bidsignore_check(pattern, filename, "") for pattern in ignore + ): continue if not any(re.match(regex, filename) for regex in regexes): print(filename, file=output) From c813a92a2630f80fbca7ce37282b1925bfe1e61f Mon Sep 17 00:00:00 2001 From: Chris Markiewicz Date: Thu, 18 Apr 2024 09:08:08 -0400 Subject: [PATCH 3/8] ENH: Add code to generate tables from TSV fence blocks --- mkdocs.yml | 7 ++++++- src/modality-agnostic-files.md | 4 ++-- .../src/bidsschematools/render/tsv.py | 19 +++++++++++++++++++ 3 files changed, 27 insertions(+), 3 deletions(-) create mode 100644 tools/schemacode/src/bidsschematools/render/tsv.py diff --git a/mkdocs.yml b/mkdocs.yml index d0738c7e29..ef36d6eb5c 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -104,7 +104,12 @@ extra_javascript: markdown_extensions: - toc: anchorlink: true - - pymdownx.superfences + - pymdownx.superfences: + preserve_tabs: true + custom_fences: + - name: tsv + class: tsv + format: !!python/name:bidsschematools.render.tsv.fence - admonition - pymdownx.details plugins: diff --git a/src/modality-agnostic-files.md b/src/modality-agnostic-files.md index b93ee17638..2d2e3941e2 100644 --- a/src/modality-agnostic-files.md +++ b/src/modality-agnostic-files.md @@ -486,7 +486,7 @@ All such included additional fields SHOULD be documented in an accompanying Example `_scans.tsv`: -```Text +```tsv filename 
acq_time func/sub-control01_task-nback_bold.nii.gz 1877-06-15T13:45:30 func/sub-control01_task-motor_bold.nii.gz 1877-06-15T13:55:33 @@ -522,7 +522,7 @@ and a guide for using macros can be found at `_sessions.tsv` example: -```Text +```tsv session_id acq_time systolic_blood_pressure ses-predrug 2009-06-15T13:45:30 120 ses-postdrug 2009-06-16T13:45:30 100 diff --git a/tools/schemacode/src/bidsschematools/render/tsv.py b/tools/schemacode/src/bidsschematools/render/tsv.py new file mode 100644 index 0000000000..28907a9f6a --- /dev/null +++ b/tools/schemacode/src/bidsschematools/render/tsv.py @@ -0,0 +1,19 @@ +import io + +import pandas as pd +from markdown_it import MarkdownIt +from tabulate import tabulate + + +def fence(source: str, language: str, css_class: str, options: dict, md, **kwargs) -> str: + try: + df = pd.read_csv(io.StringIO(source), sep="\t", dtype=str, keep_default_na=False) + md_table = tabulate(df, headers="keys", tablefmt="github", showindex=False) # type: ignore + html = MarkdownIt("commonmark").enable("table").render(md_table) + # Remove newlines from HTML to prevent copy-paste from inserting spaces + return html.replace("\n", "") + except Exception: + import traceback + + exc = traceback.format_exc() + return f"
<pre>{exc}</pre>
" From cb42cf14a4a2a9a4c90fbdcf9993ec5686c456b2 Mon Sep 17 00:00:00 2001 From: Chris Markiewicz Date: Thu, 18 Apr 2024 10:30:35 -0400 Subject: [PATCH 4/8] MNT: Only check yaml syntax --- .pre-commit-config.yaml | 1 + 1 file changed, 1 insertion(+) diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml index 80671a10d3..053b76431c 100644 --- a/.pre-commit-config.yaml +++ b/.pre-commit-config.yaml @@ -10,6 +10,7 @@ repos: - id: trailing-whitespace - id: end-of-file-fixer - id: check-yaml + args: [--unsafe] - id: check-json - id: check-toml - id: check-ast From 61d10bba06555d9d4465bc73e090e886d9b74d0d Mon Sep 17 00:00:00 2001 From: Chris Markiewicz Date: Thu, 18 Apr 2024 10:31:19 -0400 Subject: [PATCH 5/8] Update all TSV blocks --- src/appendices/arterial-spin-labeling.md | 6 +-- src/appendices/hed.md | 8 ++-- src/derivatives/common-data-types.md | 11 +++--- src/longitudinal-and-multi-site-studies.md | 2 +- .../behavioral-experiments.md | 2 +- .../electroencephalography.md | 30 +++++++-------- .../genetic-descriptor.md | 2 +- .../intracranial-electroencephalography.md | 38 +++++++++---------- .../magnetoencephalography.md | 12 +++--- src/modality-specific-files/motion.md | 28 +++++++------- .../near-infrared-spectroscopy.md | 30 +++++++-------- .../physiological-recordings.md | 8 ++-- src/modality-specific-files/task-events.md | 2 +- 13 files changed, 90 insertions(+), 89 deletions(-) diff --git a/src/appendices/arterial-spin-labeling.md b/src/appendices/arterial-spin-labeling.md index 842bde8d80..ce182ca0d9 100644 --- a/src/appendices/arterial-spin-labeling.md +++ b/src/appendices/arterial-spin-labeling.md @@ -26,7 +26,7 @@ and the exact volume_type series should be specified in the `*_aslcontext.tsv`. 
Example of `*_aslcontext.tsv`: -```Text +```tsv volume_type control label @@ -44,7 +44,7 @@ In this case, the `deltam` should be included in the `*_asl.nii[.gz]` and specif Example of `*_aslcontext.tsv`: -```Text +```tsv volume_type deltam m0scan @@ -58,7 +58,7 @@ the `cbf` should be included in the `*_asl.nii[.gz]` and specified in the `*_asl Example of `*_aslcontext.tsv`: -```Text +```tsv volume_type cbf m0scan diff --git a/src/appendices/hed.md b/src/appendices/hed.md index 4582356ee5..830c132120 100644 --- a/src/appendices/hed.md +++ b/src/appendices/hed.md @@ -44,10 +44,10 @@ meanings in associated JSON sidecar files (`events.json`). (`trial_type`, `response_time`, and `stim_file`) in addition to the required `onset` and `duration` columns. -```Text -onset duration trial_type response_time stim_file -1.2 0.6 go 1.435 images/red_square.jpg -5.6 0.6 stop n/a images/blue_square.jpg +```tsv +onset duration trial_type response_time stim_file +1.2 0.6 go 1.435 images/red_square.jpg +5.6 0.6 stop n/a images/blue_square.jpg ``` The `trial_type` column in the above example contains a limited number of distinct diff --git a/src/derivatives/common-data-types.md b/src/derivatives/common-data-types.md index 63f0809c55..084fca8eb4 100644 --- a/src/derivatives/common-data-types.md +++ b/src/derivatives/common-data-types.md @@ -328,11 +328,12 @@ A guide for using macros can be found at Contents of the `descriptions.tsv` file: -| desc_id | description | -| ------- | ----------------------------------------------------------------------------------------------- | -| Filt | low-pass filtered at 30Hz | -| FiltDs | low-pass filtered at 30Hz, downsampled to 250Hz | -| preproc | low-pass filtered at 30Hz, downsampled to 250Hz, and rereferenced to a common average reference | +```tsv +desc_id description +Filt low-pass filtered at 30Hz +FiltDs low-pass filtered at 30Hz, downsampled to 250Hz +preproc low-pass filtered at 30Hz, downsampled to 250Hz, and rereferenced to a common average 
reference +``` diff --git a/src/longitudinal-and-multi-site-studies.md b/src/longitudinal-and-multi-site-studies.md index b64ae016e8..48960366ef 100644 --- a/src/longitudinal-and-multi-site-studies.md +++ b/src/longitudinal-and-multi-site-studies.md @@ -65,7 +65,7 @@ A guide for using macros can be found at `sub-control01_sessions.tsv` content: -```Text +```tsv session_id acq_time systolic_blood_pressure ses-predrug 2009-06-15T13:45:30 120 ses-postdrug 2009-06-16T13:45:30 100 diff --git a/src/modality-specific-files/behavioral-experiments.md b/src/modality-specific-files/behavioral-experiments.md index c059383d0e..cfb8264ce3 100644 --- a/src/modality-specific-files/behavioral-experiments.md +++ b/src/modality-specific-files/behavioral-experiments.md @@ -78,7 +78,7 @@ A guide for using macros can be found at ## Example `_beh.tsv` -```Text +```tsv trial response response_time stim_file congruent red 1.435 images/word-red_color-red.jpg incongruent red 1.739 images/word-red_color-blue.jpg diff --git a/src/modality-specific-files/electroencephalography.md b/src/modality-specific-files/electroencephalography.md index fb15cb7710..cb765c9549 100644 --- a/src/modality-specific-files/electroencephalography.md +++ b/src/modality-specific-files/electroencephalography.md @@ -274,12 +274,12 @@ Examples of free-form text for field `description` See also the corresponding [`electrodes.tsv` example](#example-_electrodestsv). 
-```Text -name type units description reference status status_description -VEOG VEOG uV left eye VEOG-, VEOG+ good n/a -FDI EMG uV left first dorsal interosseous FDI-, FDI+ good n/a -Cz EEG uV n/a REF bad high frequency noise -UADC001 MISC n/a envelope of audio signal n/a good n/a +```tsv +name type units description reference status status_description +VEOG VEOG uV left eye VEOG-, VEOG+ good n/a +FDI EMG uV left first dorsal interosseous FDI-, FDI+ good n/a +Cz EEG uV n/a REF bad high frequency noise +UADC001 MISC n/a envelope of audio signal n/a good n/a ``` ## Electrodes description (`*_electrodes.tsv`) @@ -318,15 +318,15 @@ If electrodes are repositioned, it is RECOMMENDED to use multiple sessions to in See also the corresponding [`channels.tsv` example](#example-_channelstsv). -```Text -name x y z type material -VEOG+ n/a n/a n/a cup Ag/AgCl -VEOG- n/a n/a n/a cup Ag/AgCl -FDI+ n/a n/a n/a cup Ag/AgCl -FDI- n/a n/a n/a cup Ag/AgCl -GND -0.0707 0.0000 -0.0707 clip-on Ag/AgCl -Cz 0.0000 0.0714 0.0699 cup Ag/AgCl -REF -0.0742 -0.0200 -0.0100 cup Ag/AgCl +```tsv +name x y z type material +VEOG+ n/a n/a n/a cup Ag/AgCl +VEOG- n/a n/a n/a cup Ag/AgCl +FDI+ n/a n/a n/a cup Ag/AgCl +FDI- n/a n/a n/a cup Ag/AgCl +GND -0.0707 0.0000 -0.0707 clip-on Ag/AgCl +Cz 0.0000 0.0714 0.0699 cup Ag/AgCl +REF -0.0742 -0.0200 -0.0100 cup Ag/AgCl ``` The [`acq-