Fix up examples in src/schema/README.md to not use outdated schema paths #1698

Merged on Mar 13, 2024 (10 commits)
3 changes: 3 additions & 0 deletions readthedocs.yml
@@ -2,11 +2,14 @@ version: 2

build:
os: ubuntu-22.04
+apt_packages:
+- jq
tools:
python: "3.11"
jobs:
pre_build:
- bst -v export --output src/schema.json
+- tools/no-bad-schema-paths.sh src/schema.json # README.md might need fixing

mkdocs:
configuration: mkdocs.yml
37 changes: 21 additions & 16 deletions src/schema/README.md
@@ -136,18 +136,20 @@ with the object being referenced.
The following two prototypical examples are presented to clarify the semantics of
references (the cases in which they are used will be presented later):

-1. In `objects.metadata`:
+1. In `objects.enums`:
```YAML
_GeneticLevelEnum:
type: string
enum:
-- Genetic
-- Genomic
-- Epigenomic
-- Transcriptomic
-- Metabolomic
-- Proteomic
-
+- $ref: objects.enums.Genetic.value
+- $ref: objects.enums.Genomic.value
+- $ref: objects.enums.Epigenomic.value
+- $ref: objects.enums.Transcriptomic.value
+- $ref: objects.enums.Metabolomic.value
+- $ref: objects.enums.Proteomic.value
+```
+and in `objects.metadata`:
+```YAML
GeneticLevel:
name: GeneticLevel
display_name: Genetic Level
@@ -156,29 +158,29 @@ references (the cases in which they are used will be presented later):
Values MUST be one of `"Genetic"`, `"Genomic"`, `"Epigenomic"`,
`"Transcriptomic"`, `"Metabolomic"`, or `"Proteomic"`.
anyOf:
-- $ref: objects.metadata._GeneticLevelEnum
+- $ref: objects.enums._GeneticLevelEnum
- type: array
items:
-$ref: objects.metadata._GeneticLevelEnum
+$ref: objects.enums._GeneticLevelEnum
```
Here `_GeneticLevelEnum` is used to describe the valid values of `GeneticLevel`,
-and the references inside `GeneticLevel.anyOf` indicate that there may be a single
+(which are in turn references to individual values), and the references inside `GeneticLevel.anyOf` indicate that there may be a single
such value or a list of values.

-1. In `rules.datatypes.derivatives.common_derivatives`:
+1. In [`rules.files.deriv.preprocessed_data`](./rules/files/deriv/preprocessed_data.yaml):
```YAML
anat_nonparametric_common:
-$ref: rules.datatypes.anat.nonparametric
+$ref: rules.files.raw.anat.nonparametric
entities:
-$ref: rules.datatypes.anat.nonparametric.entities
+$ref: rules.files.raw.anat.nonparametric.entities
space: optional
description: optional
```
Here, the derivative datatype rule starts by copying the raw datatype rule
-`rules.datatypes.anat.nonparametric`.
+`rules.files.raw.anat.nonparametric`.
It then *overrides* the `entities` portion of that rule with a new object.
To *extend* the original `entities`, it again begins
-by referencing `rules.datatypes.anat.nonparametric.entities`,
+by referencing `rules.files.raw.anat.nonparametric.entities`,
and adding the new entities `space` and `description`.
and adding the new entities `space` and `description`.
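
As a reading aid for the two examples above, the following Python sketch shows one way an interpreter could resolve `$ref` entries, including the override and extend behaviour. It is only an illustration under stated assumptions: the helper names `lookup` and `dereference`, the `suffixes` key, and all sample values are hypothetical, and this is not presented as the implementation used by the schema tools.

```python
import copy


def lookup(schema, dotted):
    """Follow a dotted path such as 'rules.files.raw.anat.nonparametric'."""
    node = schema
    for key in dotted.split("."):
        node = node[key]
    return node


def dereference(node, schema):
    """Return a copy of `node` with every `$ref` replaced by its target.

    Keys that sit next to `$ref` are applied after the copy, so they
    override (or extend) the referenced object.
    """
    if isinstance(node, list):
        return [dereference(item, schema) for item in node]
    if not isinstance(node, dict):
        return node
    result = {}
    if "$ref" in node:
        target = dereference(lookup(schema, node["$ref"]), schema)
        if not isinstance(target, dict):
            # A reference to a plain value, e.g. objects.enums.Genetic.value
            return copy.deepcopy(target)
        result.update(copy.deepcopy(target))
    for key, value in node.items():
        if key != "$ref":
            result[key] = dereference(value, schema)
    return result


# Hypothetical, heavily trimmed schema; values are made up for illustration.
schema = {
    "objects": {"enums": {
        "Genetic": {"value": "Genetic"},
        "Genomic": {"value": "Genomic"},
        "_GeneticLevelEnum": {
            "type": "string",
            "enum": [
                {"$ref": "objects.enums.Genetic.value"},
                {"$ref": "objects.enums.Genomic.value"},
            ],
        },
    }},
    "rules": {"files": {
        "raw": {"anat": {"nonparametric": {
            "suffixes": ["T1w"],
            "entities": {"subject": "required", "session": "optional"},
        }}},
        "deriv": {"preprocessed_data": {"anat_nonparametric_common": {
            "$ref": "rules.files.raw.anat.nonparametric",
            "entities": {
                "$ref": "rules.files.raw.anat.nonparametric.entities",
                "space": "optional",
                "description": "optional",
            },
        }}},
    }},
}

enum = dereference(schema["objects"]["enums"]["_GeneticLevelEnum"], schema)
rule = dereference(
    schema["rules"]["files"]["deriv"]["preprocessed_data"]["anat_nonparametric_common"],
    schema,
)
print(enum["enum"])
print(rule["entities"])
```

In this sketch the derivative rule keeps everything copied from the raw rule, while its `entities` object is the raw `entities` plus `space` and `description`; the raw rule itself is left unmodified.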

### Expressions
@@ -229,7 +231,10 @@ which (currently) contains at the top level:
- `associations`: associated files, discovered by the inheritance principle
- `columns`: the columns in the current TSV file
- `json`: the contents of the current JSON file
- `gzip`: the contents of the current file GZIP header
- `nifti_header`: selected contents of the current NIfTI file's header
- `ome`: the contents of the current OME-XML metadata
- `tiff`: the contents of the current TIFF file's header

Some of these are strings, while others are nested objects.
These are to be populated by an *interpreter* of the schema,
29 changes: 29 additions & 0 deletions tools/no-bad-schema-paths.sh
@@ -0,0 +1,29 @@
#!/bin/bash

set -eu -o pipefail

schema_json=$(readlink -f "$1")

cd "$(dirname "$(readlink -f "$0")")/../src/schema"

# Create a temporary file and ensure it gets deleted on exit
tmpfile=$(mktemp)
trap 'rm -f "$tmpfile"' EXIT

grep -oE '(://)?([-_A-Za-z]+\.)+[-_A-Za-z]+' README.md \
| grep -v -e :// -e '\.\(md\|html\|json\|tsv\|yaml\)$' \
| grep -e '^\(meta\|objects\|rules\)' \
| grep -v 'objects.metadata.OtherObjectName' \
| sort | uniq | \
while IFS= read -r p; do
    v=$(jq ".$p" < "$schema_json" | grep -v '^null$' || echo "fail")
    if [ -z "$v" ] || [ "$v" = "fail" ]; then
        echo "$p: not reachable" >> "$tmpfile"
    fi
done

# Check if the temporary file is empty
if [ -s "$tmpfile" ]; then
    cat "$tmpfile" # Display the not reachable paths
    exit 1
fi
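
For readers less familiar with the `grep`/`jq` pipeline, here is a rough Python sketch of the same check: collect dotted tokens from the README text, drop URLs, file names and the placeholder path, keep those starting with `meta`, `objects` or `rules`, and report any that cannot be reached in the exported schema. The regular expression is transcribed from the script; the function names `candidate_paths` and `is_reachable`, the sample README text, and the sample schema are hypothetical and only for illustration.

```python
import re

# Same token pattern as the script's first grep: dotted words, optionally
# preceded by "://" so that URL hosts can be recognised and dropped.
PATH_RE = re.compile(r"(?:://)?(?:[-_A-Za-z]+\.)+[-_A-Za-z]+")
FILE_EXT_RE = re.compile(r"\.(md|html|json|tsv|yaml)$")


def candidate_paths(text):
    """Return the sorted, de-duplicated schema paths mentioned in `text`."""
    found = set()
    for match in PATH_RE.finditer(text):
        token = match.group(0)
        if "://" in token:                      # part of a URL
            continue
        if FILE_EXT_RE.search(token):           # a file name, not a path
            continue
        if not token.startswith(("meta", "objects", "rules")):
            continue
        if "objects.metadata.OtherObjectName" in token:  # placeholder path
            continue
        found.add(token)
    return sorted(found)


def is_reachable(schema, dotted):
    """True if every key of the dotted path exists and the value is not null."""
    node = schema
    for key in dotted.split("."):
        if not isinstance(node, dict) or key not in node:
            return False
        node = node[key]
    return node is not None


# Hypothetical inputs for illustration only.
readme = (
    "See `objects.enums._GeneticLevelEnum` and "
    "`rules.datatypes.anat.nonparametric`, "
    "https://example.org/page.html and schema.json."
)
schema = {
    "objects": {"enums": {"_GeneticLevelEnum": {"type": "string"}}},
    "rules": {"files": {}},
}

unreachable = [p for p in candidate_paths(readme) if not is_reachable(schema, p)]
for path in unreachable:
    print(f"{path}: not reachable")
```

As in the shell script, a non-empty list of unreachable paths would be the signal to fail the build.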