[ENH] Update Nipoppy docs and add page about trackers (#164)
* update paths to scripts
* add note to clarify terms "session" and "session ID"
* attempt to fix nested list
* add page about trackers
* remove old tracking section from MRIQC page
* minor changes
* try to fix nested list rendering (again)
* add notes about `run_dicom_org.py` optional parameters
* fix/update MRIQC page sample command
* make link to digest a clickable link
* address Nikhil comments
* add links to manifest and doughnut descriptions
* add glossary
* reorder glossary and update "`session_id` vs `visit_id`"
* add recommendation that visits should be a timeline
* fix French spelling...
* add updates after speaking with Nikhil
* add mention of "subject ID" in `participant_id` entry

---------

Co-authored-by: Nikhil Bhagwat <[email protected]>
1 parent
6a46e21
commit dfffd7b
Showing 16 changed files with 171 additions and 74 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,52 @@ | ||
## Glossary | ||
|
||
This page lists some definitions for important/recurring terms used in the Nipoppy framework. | ||
|
||
### `participant_id` | ||
|
||
**Appears in**: `manifest.csv`, `doughnut.csv` | ||
|
||
: Unique identifier for the participant (i.e., subject ID), as provided by the study. | ||
|
||
### `datatype` | ||
|
||
**Appears in**: `manifest.csv` | ||
|
||
: A BIDS-compliant "data type" value (see the [BIDS specification website](https://bids-specification.readthedocs.io/en/stable/common-principles.html#definitions) for a comprehensive list). The most common data types for magnetic resonance imaging (MRI) data are `"anat"`, `"func"`, and `"dwi"`. | ||
|
||
### `visit` | ||
|
||
**Appears in**: `manifest.csv` | ||
|
||
: An identifier for a data collection event, not restricted to imaging data. | ||
|
||
See also: [`session` vs `visit`](#session-vs-visit) | ||
|
||
### `session` | ||
|
||
**Appears in**: `manifest.csv`, `doughnut.csv` | ||
|
||
: A BIDS-compliant session identifier. Consists of the `"ses-"` prefix followed by the [`session_id`](#session_id). | ||
|
||
#### [`session`](#session) vs [`visit`](#visit) | ||
|
||
Nipoppy uses `session` for imaging data, following the convention established by BIDS. The term `visit`, on the other hand, is used to refer to any data collection event (not necessarily imaging-related). In most cases, `session` and `visit` will be identical (or `session`s will be a subset of `visit`s). However, having two descriptors becomes particularly useful when imaging and non-imaging assessments do not use the same naming conventions. | ||
|
||
### `participant_dicom_dir` | ||
|
||
**Appears in**: `doughnut.csv` | ||
|
||
: The name of the directory in which the raw DICOM data (before the DICOM organization step) are found. Usually, this is the same as [`participant_id`](#participant_id), but depending on the study it could be different. | ||
|
||
### `dicom_id` | ||
|
||
**Appears in**: `doughnut.csv` | ||
|
||
: The [`participant_id`](#participant_id), stripped of any non-alphanumeric characters. For studies whose participant IDs contain only alphanumeric characters, this is identical to the [`participant_id`](#participant_id). | ||
|
||
### `bids_id` | ||
|
||
**Appears in**: `doughnut.csv` | ||
|
||
: A BIDS-compliant participant identifier. Obtained by adding the `"sub-"` prefix to the [`dicom_id`](#dicom_id), which itself is derived from the [`participant_id`](#participant_id). A participant's raw BIDS data and derived imaging data are stored in directories named after their `bids_id`. | ||
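
The relationship between `participant_id`, `dicom_id`, `bids_id`, and `session` described above can be sketched in a few lines of Python. This is only an illustration of the definitions in this glossary, not Nipoppy's own implementation: the function names are hypothetical, and the sketch assumes that "alphanumeric" means ASCII letters and digits.

```python
import re


def to_dicom_id(participant_id: str) -> str:
    """Strip every character that is not an ASCII letter or digit (assumed rule)."""
    return re.sub(r"[^A-Za-z0-9]", "", participant_id)


def to_bids_id(participant_id: str) -> str:
    """Add the "sub-" prefix to the dicom_id derived from the participant_id."""
    return f"sub-{to_dicom_id(participant_id)}"


def to_session(session_id: str) -> str:
    """Add the "ses-" prefix to the session_id."""
    return f"ses-{session_id}"


print(to_dicom_id("MNI_001"))  # MNI001
print(to_bids_id("MNI_001"))   # sub-MNI001
print(to_session("01"))        # ses-01
```

For a study whose participant IDs are already alphanumeric (e.g., `MNI001`), `to_dicom_id` returns its input unchanged, which matches the `dicom_id` entry above.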
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,66 @@ | ||
## Track data availability status | ||
|
||
--- | ||
|
||
Trackers check the availability of files created during the dataset processing workflow (specifically the BIDS raw data and imaging pipeline derivatives) and assign an availability status (`SUCCESS`, `FAIL`, `INCOMPLETE` or `UNAVAILABLE`). | ||
|
||
--- | ||
|
||
### Key directories and files | ||
|
||
- `<DATASET_ROOT>/bids` | ||
- `<DATASET_ROOT>/derivatives` | ||
- `<DATASET_ROOT>/derivatives/bagel.csv` | ||
|
||
### Running the tracker script | ||
|
||
The tracker uses the [`manifest.csv`](./configs.md#participant-manifest-manifestcsv) and [`doughnut.csv`](./workflow/dicom_org.md#procedure) files to determine the participant-session pairs to check. Each available tracker has an associated configuration file (typically called `<pipeline>_tracker.py`), where lists of expected paths for files produced by the pipeline are defined. | ||
|
||
For each participant-session pair being tracked, the tracker outputs a `"pipeline_complete"` status. Depending on the configuration for that particular pipeline, the tracker might also output phase and/or stage statuses (e.g., `"PHASE__func"`), which typically refer to sub-pipelines within the full pipeline that may or may not have been run during processing, depending on the input data and/or processing parameters. | ||
|
||
The tracker script updates the tabular `<DATASET_ROOT>/derivatives/bagel.csv` file (see [Understanding the `bagel.csv` output](#understanding-the-bagelcsv-output) for more information). | ||
|
||
> Sample command:
```bash
python nipoppy/trackers/run_tracker.py \
    --global_config <global_config_file> \
    --dash_schema nipoppy/trackers/bagel_schema.json \
    --pipelines fmriprep mriqc tractoflow heudiconv
```
|
||
Notes: | ||
- Currently available image processing pipelines are: `fmriprep`, `mriqc`, and `tractoflow`. See [Adding a tracker](#adding-a-tracker) for the steps to add a new tracker. | ||
- Use `--pipelines heudiconv` to track BIDS data availability. | ||
- An optional `--session_id` parameter can be specified to only track a specific session. By default, the trackers are run for all sessions. | ||
- Other optional arguments include `--run_id` and `--acq_label`, to help generate expected file paths for BIDS Apps. | ||
|
||
### Understanding the `bagel.csv` output | ||
|
||
A JSON schema for the `bagel.csv` file produced by the tracker script is available [here](https://github.com/neurobagel/digest/blob/main/schemas/bagel_schema.json). | ||
|
||
Here is an example of a `bagel.csv` file: | ||
|
||
| bids_id | participant_id | session | has_mri_data | pipeline_name | pipeline_version | pipeline_starttime | pipeline_complete | | ||
| ------- | -------------- | ------- | ------------ | ------------- | ---------------- | ------------------ | ----------------- | | ||
| sub-MNI001 | MNI001 | 1 | TRUE | freesurfer | 6.0.1 | 2022-05-24 13:43 | SUCCESS | | ||
| sub-MNI001 | MNI001 | 2 | TRUE | freesurfer | 6.0.1 | 2022-05-24 13:46 | SUCCESS | | ||
| sub-MNI001 | MNI001 | 3 | TRUE | freesurfer | 6.0.1 | UNAVAILABLE | INCOMPLETE | | ||
|
||
The imaging derivatives bagel has one row for each participant-session-pipeline combination. The pipeline status columns are `"pipeline_complete"` and any column whose name begins with `"PHASE__"` or `"STAGE__"`. The possible values for these columns are: | ||
- `"SUCCESS"`: All expected pipeline output files (as configured by pipeline tracker) are present. | ||
- `"FAIL"`: At least one expected pipeline output is missing. | ||
- `"INCOMPLETE"`: Pipeline has not been run for the subject session (output directory missing). | ||
- `"UNAVAILABLE"`: Relevant MRI modality for pipeline not available for subject session (determined by the `datatype` column in the dataset's manifest file). | ||
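
As a rough illustration of how the status columns might be inspected, the sketch below counts status values per status column using only the Python standard library. It assumes the file is comma-separated with the column names shown in the example table. The helper names (`status_columns`, `count_statuses`) are hypothetical and not part of Nipoppy, and the inline sample simply restates the example rows above.

```python
import csv
import io
from collections import Counter

# The example rows from the table above, written as CSV text (assumed on-disk layout).
SAMPLE_BAGEL = """\
bids_id,participant_id,session,has_mri_data,pipeline_name,pipeline_version,pipeline_starttime,pipeline_complete
sub-MNI001,MNI001,1,TRUE,freesurfer,6.0.1,2022-05-24 13:43,SUCCESS
sub-MNI001,MNI001,2,TRUE,freesurfer,6.0.1,2022-05-24 13:46,SUCCESS
sub-MNI001,MNI001,3,TRUE,freesurfer,6.0.1,UNAVAILABLE,INCOMPLETE
"""


def status_columns(fieldnames):
    """Return "pipeline_complete" plus any PHASE__/STAGE__ columns, in file order."""
    return [
        name
        for name in fieldnames
        if name == "pipeline_complete" or name.startswith(("PHASE__", "STAGE__"))
    ]


def count_statuses(csv_text):
    """Count how often each status value occurs in every status column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = status_columns(reader.fieldnames)
    counts = {name: Counter() for name in columns}
    for row in reader:
        for name in columns:
            counts[name][row[name]] += 1
    return counts


counts = count_statuses(SAMPLE_BAGEL)
print(dict(counts["pipeline_complete"]))  # {'SUCCESS': 2, 'INCOMPLETE': 1}
```

To run this on a real file, the contents of `<DATASET_ROOT>/derivatives/bagel.csv` could be read as text and passed to `count_statuses`.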
|
||
### Adding a tracker | ||
|
||
1. Create a new file in `nipoppy/trackers` called `<new_pipeline>_tracker.py`. | ||
2. Define a config dictionary `tracker_configs`, with a mandatory key `"pipeline_complete"` whose value is a function that takes as input the path to the subject result directory, as well as the session and run IDs, and outputs one of `"SUCCESS"`, `"FAIL"`, `"INCOMPLETE"`, or `"UNAVAILABLE"`. See the built-in [fMRIPrep tracker](https://github.com/neurodatascience/nipoppy/blob/main/nipoppy/trackers/fmriprep_tracker.py) for an example. | ||
3. Optionally add additional stages and phases to track. Again, refer to the [fMRIPrep tracker](https://github.com/neurodatascience/nipoppy/blob/main/nipoppy/trackers/fmriprep_tracker.py) or to any other pre-defined tracker configuration for an example. | ||
4. Modify `nipoppy/trackers/run_tracker.py` to add the new tracker as an option. | ||
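
A minimal sketch of what such a `<new_pipeline>_tracker.py` file could look like is given below. Only the `tracker_configs` name, the mandatory `"pipeline_complete"` key, and the status strings come from the steps above. Everything else is an assumption made for illustration: the argument order of the function, the `ses-<session_id>` subdirectory layout, and the expected output file names are all hypothetical. The built-in fMRIPrep tracker remains the authoritative example.

```python
from pathlib import Path

SUCCESS = "SUCCESS"
FAIL = "FAIL"
INCOMPLETE = "INCOMPLETE"

# Hypothetical outputs of the new pipeline, relative to the session directory.
EXPECTED_FILES = ["report.html", "stats/summary.tsv"]


def check_pipeline_complete(subject_dir, session_id, run_id):
    """Return a status string for one participant-session pair.

    Assumed layout: <subject_dir>/ses-<session_id>/<expected files>.
    run_id is accepted to match the described inputs but is unused in this sketch.
    "UNAVAILABLE" is not returned here because, as described above, that status
    depends on the `datatype` column of the manifest rather than on output files.
    """
    session_dir = Path(subject_dir) / f"ses-{session_id}"
    if not session_dir.is_dir():
        return INCOMPLETE  # output directory missing: pipeline has not been run
    if all((session_dir / name).is_file() for name in EXPECTED_FILES):
        return SUCCESS  # every expected output is present
    return FAIL  # at least one expected output is missing


# Mandatory key; additional phase/stage entries could be added alongside it.
tracker_configs = {"pipeline_complete": check_pipeline_complete}
```

Phase or stage checks would follow the same pattern, each mapping a column name (for example, one beginning with `"PHASE__"`) to a function that returns one of the status strings.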
|
||
### Visualizing availability status with the Neurobagel [`digest`](https://digest.neurobagel.org/) | ||
|
||
The `bagel.csv` file written by the tracker can be uploaded to [https://digest.neurobagel.org/](https://digest.neurobagel.org/) (as an "imaging CSV file") for interactive visualizations of processing status. | ||
|
||
![digest](../imgs/digest.png) |