Added documentation for word cloud option on summarize hed tags #358

Merged
merged 2 commits into from
Feb 26, 2024
3 changes: 1 addition & 2 deletions docs/_static/data/AOMIC_splitevents_rmdl.json
@@ -12,8 +12,7 @@
},
"stop_signal": {
"onset_source": ["stop_signal_delay"],
"duration": [0.5],
"copy_columns": []
"duration": [0.5]
}
},
"remove_parent_row": false
19 changes: 9 additions & 10 deletions docs/source/FileRemodelingQuickstart.md
@@ -277,8 +277,7 @@ The following example shows the remodeling operations to perform the splitting.
},
"stop_signal": {
"onset_source": ["stop_signal_delay"],
"duration": [0.5],
"copy_columns": []
"duration": [0.5]
}
},
"remove_parent_row": false
@@ -315,7 +314,7 @@ specifying the values of the following parameters.

* *onset_source*
* *duration*
* *copy_columns`*
* *copy_columns*

The *onset_source* is a list indicating how to calculate the onset for the new event
relative to the onset of the anchor event.
@@ -326,20 +325,20 @@ Column names are evaluated to the row values in the corresponding columns.
In our example, the response time and stop signal delay are calculated relative to the trial's onset,
so we only need to add the value from the respective column.
Note that these new events do not exist for every trial.
Rows where there was no stop signal have an *n/a* in the *stop_signal_delay* column.
Rows where there was no stop signal have an `n/a` in the *stop_signal_delay* column.
This is processed automatically, and remodeler does not create new events
when any items in the *onset_source* list is missing or *n/a*.
when any item in the *onset_source* list is missing or `n/a`.

The *duration* specifies the duration for the new events.
The AOMIC data did not measure the durations of the button presses,
so we set the duration of the response event to 0.
The AOMIC data report indicates that the stop signal lasted 500 ms.

The copy columns indicate which columns from the parent event should be copied to the
The *copy_columns* parameter is optional and indicates which columns from the parent event should be copied to the
newly-created event.
We would like to transfer the *response_accuracy* and the *response_hand* information to the response event.
*Note:* Copy columns is an optional key in the *new_events* dictionary.
If you do not want to carry over any column values from the parent event to the new events you can omit this key.
We would like to transfer the *response_accuracy* and the *response_hand* information to the *response* event.
Since no extra column values are to be transferred for *stop_signal*, columns other than *onset*, *duration*,
and *trial_type* are filled with `n/a`.
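
The splitting rules described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the remodeler's actual implementation: the helpers `_resolve` and `split_row` are hypothetical, the sample row values are made up, and the `response_time` column name is assumed. It shows onsets computed relative to the parent row, new events skipped when an *onset_source* value is `n/a`, and uncopied columns filled with `n/a`.

```python
def _resolve(items, parent):
    """Sum numbers and column values; return None if any value is missing or n/a."""
    total = 0.0
    for item in items:
        # Strings are treated as column names, anything else as a number.
        value = parent.get(item, "n/a") if isinstance(item, str) else item
        if value is None or value == "n/a":
            return None
        total += float(value)
    return total


def split_row(parent, new_events):
    """Return the new event rows derived from one parent row (illustrative only)."""
    rows = []
    for name, spec in new_events.items():
        offset = _resolve(spec["onset_source"], parent)
        if offset is None:
            continue  # no new event when an onset source is missing or n/a
        row = {column: "n/a" for column in parent}
        row["onset"] = float(parent["onset"]) + offset
        duration = _resolve(spec["duration"], parent)
        row["duration"] = "n/a" if duration is None else duration
        row["trial_type"] = name
        for column in spec.get("copy_columns", []):  # copy_columns is optional
            row[column] = parent[column]
        rows.append(row)
    return rows


# Hypothetical parent row: a trial with a response but no stop signal.
parent = {
    "onset": 10.0, "duration": 2.0, "trial_type": "go",
    "response_time": 0.45, "stop_signal_delay": "n/a",
    "response_accuracy": "correct", "response_hand": "right",
}
new_events = {
    "response": {
        "onset_source": ["response_time"],
        "duration": [0],
        "copy_columns": ["response_accuracy", "response_hand"],
    },
    "stop_signal": {"onset_source": ["stop_signal_delay"], "duration": [0.5]},
}

new_rows = split_row(parent, new_events)
for new_row in new_rows:
    print(new_row)
```

Only the *response* row is produced for this sample trial, because its *stop_signal_delay* is `n/a`.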


The final remodeling file can be found at:
@@ -438,7 +437,7 @@ There are three types of command line arguments:
and [**Named arguments with values**](./FileRemodelingTools.md#named-arguments).

The positional arguments, `data_dir` and `model_path` are not optional and must
be the first and second arguments to `run_remodel`.
be the first and second arguments to `run_remodel`, respectively.
The named arguments (with and without values) are optional.
They all have default values if omitted.
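
The argument ordering can be illustrated with a short Python sketch that assembles a `run_remodel` argument list. The helper `build_run_remodel_args` and all paths and task names below are hypothetical; only the `-ld`, `-t`, and `-nb` flags are taken from this documentation, and the sketch prints the command line rather than running the tool.

```python
import shlex


def build_run_remodel_args(data_dir, model_path, log_dir=None,
                           task_names=None, no_backup=False):
    """Assemble an argument list with the two positional arguments first."""
    args = [data_dir, model_path]      # positional: must be first and second
    if log_dir:
        args += ["-ld", log_dir]       # named argument with a value
    if task_names:
        args += ["-t", *task_names]    # named argument with one or more values
    if no_backup:
        args.append("-nb")             # named argument without a value
    return args


# Placeholder paths and task name, for illustration only.
args = build_run_remodel_args(
    "/data/my_dataset",
    "/data/my_dataset/my_model_rmdl.json",
    log_dir="/data/logs",
    task_names=["stopsignal"],
)
print("run_remodel " + shlex.join(args))
```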

40 changes: 35 additions & 5 deletions docs/source/FileRemodelingTools.md
@@ -237,7 +237,7 @@ The programs use a standard command-line argument list for specifying input as s
| Script name | Arguments | Purpose |
| ----------- | -------- | ------- |
|*run_remodel_backup* | *data_dir*<br/>*-bd -\\-backup-dir*<br/>*-bn -\\-backup-name*<br/>*-e -\\-extensions*<br/>*-f -\\-file-suffix*<br/>*-t -\\-task-names*<br/>*-v -\\-verbose*<br/>*-x -\\-exclude-dirs*| Create a backup of event files. |
|*run_remodel* | *data_dir*<br/>*model_path*<br/>*-b -\\-bids-format*<br/>*-bd -\\-backup-dir*<br/>*-bn -\\-backup-name*<br/>*-e -\\-extensions*<br/>*-f -\\-file-suffix*<br/>*-i -\\-individual-summaries*<br/>*-j -\\-json-sidecar*<br/>*-nb -\\-no-backup*<br/>*-ns -\\-no-summaries*<br/>*-nu -\\-no-update*<br/>*-r -\\-hed-version*<br/>*-s -\\-save-formats*<br/>*-t -\\-task-names*<br/>*-v -\\-verbose*<br/>*-w -\\-work-dir*<br/>*-x -\\-exclude-dirs* | Restructure or summarize the event files. |
|*run_remodel* | *data_dir*<br/>*model_path*<br/>*-b -\\-bids-format*<br/>*-bd -\\-backup-dir*<br/>*-bn -\\-backup-name*<br/>*-e -\\-extensions*<br/>*-f -\\-file-suffix*<br/>*-i -\\-individual-summaries*<br/>*-j -\\-json-sidecar*<br/>*-ld -\\-log-dir*<br/>*-nb -\\-no-backup*<br/>*-ns -\\-no-summaries*<br/>*-nu -\\-no-update*<br/>*-r -\\-hed-version*<br/>*-s -\\-save-formats*<br/>*-t -\\-task-names*<br/>*-v -\\-verbose*<br/>*-w -\\-work-dir*<br/>*-x -\\-exclude-dirs* | Restructure or summarize the event files. |
|*run_remodel_restore* | *data_dir*<br/>*-bd -\\-backup-dir*<br/>*-bn -\\-backup-name*<br/>*-t -\\-task-names*<br/>*-v -\\-verbose*<br/> | Restore a backup of event files. |

````
@@ -305,6 +305,13 @@ Users are free to use either form.
> This option is followed by the full path of the JSON sidecar with HED annotations to be
> applied during the processing of HED-related remodeling operations.

`-ld`, `--log-dir`
> This option is followed by the full path of a directory for writing log files.
> A log file is written if the remodeling tools raise an exception and the program terminates.
> Note that a log file is not written for issues gathered during operations such as `summarize_hed_validation`
> because reporting HED validation errors is a normal part of this operation.
> On the other hand, errors in the JSON remodeling file do raise an exception and are reported in the log.

`-nb`, `--no-backup`
> If present, no backup is used. Rather, operations are performed directly on the files.

@@ -333,7 +340,7 @@ Users are free to use either form.
> The name(s) of the tasks to be included (for BIDS-formatted files only).
> When a dataset includes multiple tasks, the event files are often structured
> differently for each task and thus require different transformation files.
> This option allows the backups and operations to be restricted to a individual tasks.
> This option allows the backups and operations to be restricted to an individual task.

> If this option is omitted, all tasks are used. This means that all `events.tsv` files are
> restored from a backup if the backup is used, the operations are performed on all `events.tsv` files, and summaries are combined over all tasks.
@@ -1067,7 +1074,7 @@ This method allows for small gaps between events and for events in which an
intermediate event in the group ends after later events.
If the *set_duration* parameter is false, the duration of the merged row is set to `n/a`.

If the data file has other columns besides `onset`, `duration` and column *column_name*,
If the data file has other columns besides `onset`, `duration` and *column_name*,
the values in the other columns must be considered during the merging process.
The *match_columns* is a list of the other columns whose values must agree with those
of the anchor row in order for a merge to occur. If *match_columns* is empty, the
@@ -1216,7 +1223,7 @@ based on the unique values in the combination of columns *response_accuracy* and
````
In this example there are two source columns and one destination column,
so each entry in *map_list* must be a list with three elements
two source values and one destination value).
two source values and one destination value.
Since all the values in *map_list* are strings,
the optional *integer_sources* list is not needed.
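
The length rule for *map_list* entries can be checked with a few lines of Python. This is only a sketch: the helper `find_bad_entries` is hypothetical, the parameter names *source_columns* and *destination_columns* are assumed, and the column names and values are placeholders rather than those of the example above.

```python
def find_bad_entries(source_columns, destination_columns, map_list):
    """Return the map_list entries whose length does not match the column count."""
    expected = len(source_columns) + len(destination_columns)
    return [entry for entry in map_list if len(entry) != expected]


# Two source columns and one destination column: each entry needs three elements.
source_columns = ["col_a", "col_b"]      # placeholder names
destination_columns = ["col_new"]        # placeholder name
map_list = [
    ["x", "left", "x_left"],
    ["x", "right", "x_right"],
    ["y", "left"],                       # too short: destination value missing
]

bad_entries = find_bad_entries(source_columns, destination_columns, map_list)
print(bad_entries)
```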

@@ -2035,8 +2042,9 @@ The *summarize_hed_tags* operation has the two required parameters
| *tags* | dict | Dictionary with category title keys and tags in that category as values. |
| *append_timecode* | bool | (**Optional**: Default false) If true, append a time code to filename. |
| *include_context* | bool | (**Optional**: Default true) If true, expand the event context to <br/>account for onsets and offsets. |
| *remove_types* | list | (**Optional**) A list of types such as <br/>`Condition-variable` and `Task` to remove. |
| *replace_defs* | bool | (**Optional**: Default true) If true, the `Def` tags are replaced with the<br/>contents of the definition (no `Def` or `Def-expand`). |
| *remove_types* | list | (**Optional**) A list of types such as `Condition-variable` and `Task` to remove. |
| *word_cloud* | dict | (**Optional**) If present, the operation produces a <br/>word cloud image in addition to the summaries. |
```

The *tags* dictionary has keys that specify how the user wishes the tags
@@ -2052,6 +2060,28 @@ to the event context in events intermediate between onsets and offsets.
If the optional parameter *replace_defs* is true, the tag counts include
tags contributed by contents of the definitions.

If the *word_cloud* parameter is provided but its value is empty, the default word cloud settings are used.
The following table lists the optional parameters used to control the appearance of the word cloud image.

```{admonition} Optional keys in the word cloud dictionary value.
:class: tip

| Parameter | Type | Description |
| ------------ | ---- | ----------- |
| *background_color* | str | The matplotlib name of the background color (default "black").|
| *contour_color* | str | The matplotlib name of the contour color if mask provided. |
| *contour_width* | float | Width of contour if mask provided (default 3). |
| *font_path* | str | The path of the system font to use if *set_font* is true. |
| *height* | int | Height in pixels of the image (default 300).|
| *mask_path* | str | The path of the mask image to use if *use_mask* is true<br/> and an image other than the brain is needed. |
| *max_font_size* | float | The maximum font size to use in the image (default 15). |
| *min_font_size* | float | The minimum font size to use in the image (default 8).|
| *prefer_horizontal* | float | Fraction of horizontal words in image (default 0.75). |
| *scale_adjustment* | float | Constant to add to log10 count transformation (default 7). |
| *set_font* | bool | If true, the system font given by *font_path* is used. |
| *use_mask* | bool | If true, a mask image is used to provide a contour around the words. |
| *width* | int | Width in pixels of image (default 400). |
```
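
As a rough illustration of how these keys might be used, the following Python sketch builds a *summarize_hed_tags* entry with a *word_cloud* dictionary and prints it as JSON. Several things here are assumptions rather than facts from the tables above: the surrounding `operation`/`description`/`parameters` layout, the *summary_name* and *summary_filename* parameter names, and all of the values, including the tag category. The key check uses only the names listed in the table.

```python
import json

# Keys listed in the word cloud table above.
DOCUMENTED_WORD_CLOUD_KEYS = {
    "background_color", "contour_color", "contour_width", "font_path",
    "height", "mask_path", "max_font_size", "min_font_size",
    "prefer_horizontal", "scale_adjustment", "set_font", "use_mask", "width",
}

# Illustrative settings; an empty dictionary would request the defaults.
word_cloud = {
    "height": 300,
    "width": 400,
    "background_color": "white",
    "prefer_horizontal": 0.75,
    "min_font_size": 8,
    "max_font_size": 15,
}

# Catch misspelled keys before the file is written.
unknown_keys = set(word_cloud) - DOCUMENTED_WORD_CLOUD_KEYS
if unknown_keys:
    raise ValueError(f"Unknown word cloud keys: {sorted(unknown_keys)}")

operation = {
    "operation": "summarize_hed_tags",
    "description": "Summarize HED tags and produce a word cloud.",
    "parameters": {
        "summary_name": "example_tag_summary",       # assumed parameter name
        "summary_filename": "example_tag_summary",   # assumed parameter name
        "tags": {"Sensory events": ["Sensory-event", "Visual-presentation"]},
        "word_cloud": word_cloud,
    },
}

remodel_json = json.dumps([operation], indent=4)
print(remodel_json)
```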

(summarize-hed-tags-example-anchor)=
#### Summarize HED tags example
20 changes: 12 additions & 8 deletions src/jupyter_notebooks/bids/validate_bids_dataset.ipynb
@@ -29,14 +29,15 @@
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": 7,
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Using HEDTOOLS version: {'date': '2024-01-11T08:20:12-0600', 'dirty': False, 'error': None, 'full-revisionid': 'df75b546be42d11be8fd2c0531883e09e5cb6fde', 'version': '0.4.0+109.gdf75b54'}\n",
"24\n"
"Using HEDTOOLS version: {'date': '2024-02-05T17:04:52-0600', 'dirty': True, 'error': None, 'full-revisionid': '7c05a5461ed273b9a39bddf27d99c2e767ca1aab', 'version': '0.4.0+147.g7c05a54.dirty'}\n",
"Number of issues: 0\n",
"No HED validation errors\n"
]
}
],
@@ -50,7 +51,8 @@
"## Set the dataset location and the check_for_warnings flag\n",
"check_for_warnings = True\n",
"dataset_dir = '../../../datasets/eeg_ds004105s_hed'\n",
"outfile = 'temp.txt'\n",
"# outfile = 'temp.txt'\n",
"outfile = ''\n",
"\n",
"## Validate the dataset\n",
"bids = BidsDataset(dataset_dir)\n",
@@ -60,16 +62,18 @@
"else:\n",
" issue_str = \"No HED validation errors\"\n",
"# print(issue_str)\n",
"print(f\"{len(issue_str)}\")\n",
"print(f\"Number of issues: {len(issue_list)}\")\n",
"if outfile and issue_list:\n",
" with open(outfile, 'w') as fp:\n",
" fp.write(issue_str)\n"
" fp.write(issue_str)\n",
"else:\n",
" print(f\"{issue_str}\")\n"
],
"metadata": {
"collapsed": false,
"ExecuteTime": {
"end_time": "2024-01-19T16:19:47.574743800Z",
"start_time": "2024-01-19T16:19:42.471331100Z"
"end_time": "2024-02-06T13:54:39.249794200Z",
"start_time": "2024-02-06T13:54:34.365938400Z"
}
}
}