[ENH] Added the specification for using HED libraries in BIDS #1106

Merged
merged 18 commits into from
Aug 31, 2022
Changes from 3 commits
42 changes: 39 additions & 3 deletions src/99-appendices/03-hed.md
@@ -167,16 +167,16 @@ This allows for a proper validation of the HED annotations
(for example using the `bids-validator`).

Example: The following `dataset_description.json` file specifies that the
[`HED8.0.0.xml`](https://github.com/hed-standard/hed-specification/tree/master/hedxml/HED8.0.0.xml)
[`HED8.1.0.xml`](https://github.com/hed-standard/hed-specification/tree/master/hedxml/HED8.1.0.xml)
file from the `hedxml` directory of the
[`hed-specification`](https://github.com/hed-standard/hed-specification)
repository on GitHub should be used to validate the study event annotations.

```JSON
{
"Name": "A great experiment",
"BIDSVersion": "1.6.0",
"HEDVersion": "8.0.0"
"BIDSVersion": "1.7.0",
"HEDVersion": "8.1.0"
}
```

@@ -185,3 +185,39 @@ any present HED information will be validated using the latest version of the HE
which is bound to result in problems.
Hence, it is strongly RECOMMENDED that the `HEDVersion` field be included when using HED
in a BIDS dataset.
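
As a minimal sketch of the recommendation above, a tool could check for the `HEDVersion` field before validating any HED annotations. The function name `read_hed_version` and the warning text are hypothetical and are not part of BIDS or of any HED tool.

```python
import json
import warnings


def read_hed_version(description_text):
    """Return the HEDVersion entry of a dataset_description.json string.

    Hypothetical helper: warns and returns None when the field is absent,
    since validation would then fall back to the latest HED schema.
    """
    description = json.loads(description_text)
    hed_version = description.get("HEDVersion")
    if hed_version is None:
        warnings.warn(
            "HEDVersion is missing from dataset_description.json; "
            "including it is strongly RECOMMENDED when using HED.",
            UserWarning,
        )
    return hed_version


example = '{"Name": "A great experiment", "BIDSVersion": "1.7.0", "HEDVersion": "8.1.0"}'
print(read_hed_version(example))
```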

### Using HED library schemas

HED also allows you to use one or more specialized vocabularies along with
the base vocabulary. These specialized vocabularies are developed by
communities of users and are available in the GitHub
[hed-schema-library](https://github.com/hed-standard/hed-schema-library) repository.

Example: The following `dataset_description.json` file specifies that the
[`HED8.1.0.xml`](https://github.com/hed-standard/hed-specification/tree/master/hedxml/HED8.1.0.xml)
base schema should be used along with the
SCORE library for clinical neurological annotation and a test library
located at [HED_score_0.0.1.xml](https://github.com/hed-standard/hed-schema-library/blob/main/library_schemas/score/hedxml/HED_score_0.0.1.xml) and [HED_testlib_1.0.2.xml](https://github.com/hed-standard/hed-schema-library/blob/main/library_schemas/testlib/hedxml/HED_testlib_1.0.2.xml), respectively.

```JSON
{
"Name": "A great experiment",
"BIDSVersion": "1.7.0",
"HEDVersion": {
"base": "8.1.0",
"libraries": {
"sc": "score_0.0.1",
Collaborator

Is this an "official" identifier format that combines the library name and a version? If not, it may be worth separating the two, even if that is more verbose:

Suggested change
"sc": "score_0.0.1",
"sc": {"library": "score", "version": "0.0.1"},

or something similar, to make it explicit. That would also allow future expansion with optional extra information, for example a URL to a HED library that is not in the library of libraries (hed-schema-library) but is available from another library of libraries, or that is perhaps pointed to directly by that URL?

Contributor

Some version of what you suggest was the original plan (though URIs were never planned to be supported other than bids: schema paths within the dataset), but it was simplified during development after feedback from the HED community. @VisLab can provide more detail, which is escaping me.

Member Author

The suggestion above is one that we have considered (see hed-specification issue #156). The actual format is only proposed at this point in the HED specification, and we would appreciate opinions and are open to changes.

We have gone through a number of versions of this, and originally we thought we would support a file within the bids data set itself and arbitrary URLs. The HED working group consensus was to start with the standard libraries and see how it goes, since the purpose is to have standardized vocabularies to make meaningful comparisons across datasets.

On a related note:
The hed-python tools allow a schema group to be passed into its BIDS data set constructor to override the specification in dataset_description. Right now the hed-javascript public interface (which is what BIDS uses) only passes in the data set and constructs the schema group internally.

I am mentioning this because another possibility is to allow a schema group to be passed in as part of the public interface to the hed-javascript validator (from bids) which would override the internal specification extracted from the dataset_description. This is relevant because we are going to have to do a major version bump to support the libraries because of another interface change and we could add this in as an option and allow BIDS to decide. @sappelhoff. @rwblair?

Contributor

The specification of which schemas are to be used for validating a dataset is provided in a separate parameter, since hed-javascript cannot handle relative paths within the BIDS dataset directory and needs them resolved to absolute paths. However, the actual Schema and Schemas types used by the validator to represent schemas are considered implementation details (despite being returned by the validator.buildSchema function) and not part of the stable API. In theory, the object passed as the second argument to validateBidsDataset can be any object conforming to the defined API, regardless of what dataset_description.json says.

"ts": "testlib_1.0.2"
}
}
}
```
The `sc:` and `ts:` are user-chosen prefixes used to distinguish the sources
of the terms in the HED annotation. In the following HED annotation:

```Text
Data-feature, sc:Photomyogenic-response, sc:Wicket-spikes
```

The tag `Data-feature` is from HED8.1.0,
while `Photomyogenic-response` and `Wicket-spikes` are from HED_score_0.0.1.
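
The prefix mechanism described above can be illustrated with a rough Python sketch. It accepts both forms of `HEDVersion` (a plain string, or an object with `base` and `libraries`), splits each compact library entry such as `score_0.0.1` into a name and a version at the last underscore, and reports which schema each tag of a flat annotation would come from. The function names `parse_hed_version` and `tag_sources` are hypothetical, and the sketch assumes a simple comma-separated annotation without parentheses or values; it is not how the HED validators are implemented.

```python
def parse_hed_version(hed_version):
    """Return (base, libraries), where libraries maps prefix -> (name, version).

    Hypothetical helper: a plain string is treated as a base schema version
    with no libraries.
    """
    if isinstance(hed_version, str):
        return hed_version, {}
    base = hed_version["base"]
    libraries = {}
    for prefix, spec in hed_version.get("libraries", {}).items():
        # "score_0.0.1" -> ("score", "0.0.1"); split at the last underscore.
        name, _, version = spec.rpartition("_")
        libraries[prefix] = (name, version)
    return base, libraries


def tag_sources(annotation, hed_version):
    """Map each tag in a flat, comma-separated annotation to its schema."""
    base, libraries = parse_hed_version(hed_version)
    sources = {}
    for raw_tag in annotation.split(","):
        tag = raw_tag.strip()
        prefix, sep, short_tag = tag.partition(":")
        if sep and prefix in libraries:
            name, version = libraries[prefix]
            sources[short_tag] = f"HED_{name}_{version}"
        else:
            sources[tag] = f"HED{base}"
    return sources


hed_version = {
    "base": "8.1.0",
    "libraries": {"sc": "score_0.0.1", "ts": "testlib_1.0.2"},
}
print(tag_sources("Data-feature, sc:Wicket-spikes", hed_version))
```

Splitting the compact string into a name and a version gives the same information as the explicit `{"library": ..., "version": ...}` form suggested in the review thread, so a consumer could support either representation internally.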
10 changes: 9 additions & 1 deletion src/schema/objects/metadata.yaml
@@ -1022,7 +1022,15 @@ HEDVersion:
  description: |
    If HED tags are used:
    The version of the HED schema used to validate HED tags for study.
  type: string
    May include a single schema or a base schema and one or more library schemas.
  anyOf:
    - type: string
    - type: object
      properties:
        base:
          type: string
        libraries:
          type: object
Haematocrit:
  name: Haematocrit
  description: |
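
The `anyOf` rule added to `HEDVersion` in this hunk can be approximated by a hand-written check, shown here as a sketch rather than as the validator's actual logic. It mirrors only what the diff states: the value is either a string, or an object whose `base` (if present) is a string and whose `libraries` (if present) is an object. The function name `is_valid_hed_version` is hypothetical.

```python
def is_valid_hed_version(value):
    """Check a HEDVersion value against the anyOf shape shown in the diff.

    Hypothetical helper: neither "base" nor "libraries" is treated as
    required, because the schema fragment lists no required properties.
    """
    if isinstance(value, str):
        return True
    if isinstance(value, dict):
        base = value.get("base")
        libraries = value.get("libraries")
        if base is not None and not isinstance(base, str):
            return False
        if libraries is not None and not isinstance(libraries, dict):
            return False
        return True
    return False


print(is_valid_hed_version("8.1.0"))
print(is_valid_hed_version({"base": "8.1.0", "libraries": {"sc": "score_0.0.1"}}))
print(is_valid_hed_version({"base": 8}))
```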
4 changes: 2 additions & 2 deletions src/schema/rules/tabular_metadata.yaml
@@ -7,8 +7,8 @@ scans:
- .json
entities:
subject: required
session: optional # session is required if session is present in the dataset.
sessions: # This file may only exist if session is present in the dataset.
session: optional # session is required if session is present in the dataset.
sessions: # This file may only exist if session is present in the dataset.
suffixes:
- sessions
extensions: