From 35b4048f44412bb7e52669b29d08d5ba84552163 Mon Sep 17 00:00:00 2001
From: Kay Robbins <1189050+VisLab@users.noreply.github.com>
Date: Fri, 2 Feb 2024 14:18:40 -0600
Subject: [PATCH] Many corrections to the remodeling tools documentation

---
 docs/source/FileRemodelingTools.md | 88 ++++++++++++++++--------------
 docs/source/HedMatlabTools.md      |  5 ++
 2 files changed, 51 insertions(+), 42 deletions(-)
diff --git a/docs/source/FileRemodelingTools.md b/docs/source/FileRemodelingTools.md
index d9b0fbc..01fb569 100644
--- a/docs/source/FileRemodelingTools.md
+++ b/docs/source/FileRemodelingTools.md
@@ -711,8 +711,7 @@ from the data file if the columns exist.
         "operation": "remove_columns",
         "description": "Remove unwanted columns prior to analysis",
         "parameters": {
-            "remove_names": ["value", "sample"],
-            "ignore_missing": true
+            "remove_names": ["value", "sample"]
         }
     }
 ]
@@ -829,7 +828,7 @@ based on column values.
 | ------------ | ---- | ----------- | 
 | *column_name* | str | The name of the column to be factored.| 
 | *factor_values* | list | Column values to be included as factors. |
-| *factor_names* | list| Column names for created factors. |
+| *factor_names* | list| (**Optional**) Column names for created factors. |
 ```
 
 If *column_name* is not a column in the data file, a `ValueError` is raised.
@@ -841,8 +840,8 @@ If a specified value is missing in a particular file, the corresponding factor c
 If *factor_names* is empty, the newly created columns are of the 
 form *column_name.factor_value*.
 Otherwise, the newly created columns have names *factor_names*.
-If *factor_names* is not empty, then *factor_values* must also be specified and
-both lists must be of the same length.
+If *factor_names* is not empty, then *factor_values* must also be specified
+and both lists must be of the same length.
 
 (factor-column-example-anchor)=
 #### Factor column example
@@ -906,9 +905,9 @@ The [**HED search guide**](./HedSearchGuide.md) tutorial discusses the HED searc
 |  Parameter   | Type | Description | 
 | ------------ | ---- | ----------- | 
 | *queries* | list | A list of HED query strings. | 
-| *query_names* | list | A list of names for the resulting factor columns generated by the queries. |
-| *remove_types* | list | Structural HED tags to be removed (usually `Condition-variable` and `Task`). | 
-| *expand_context* | bool | (Optional) Expand the context and remove `Onse` and`Offset` tags before the query. | 
+| *query_names* | list | (**Optional**) A list of names for the factor columns generated by the queries. |
+| *remove_types* | list | (**Optional**) Structural HED tags to be removed (usually `Condition-variable` and `Task`). | 
+| *expand_context* | bool | (**Optional**: Default true) Expand the context and remove <br/>`Onset` and `Offset` tags before the query. | 
 
 ```
 The *query_names* list, which must be empty or the same length as *queries*,
@@ -916,7 +915,10 @@ contains the names of the factor columns produced by the search.
 If the *query_names* list is empty, the result columns are titled "query_1",
 "query_2", etc.
 
-The *remove_types* and *expand_context* are not yet implemented, and hence ignored in the current release.
+In most cases, *remove_types* should be set to `["Condition-variable", "Task"]`, with the effects of
+the experimental design captured using the *factor_hed_type* operation instead.
+If *expand_context* is set to *false*, the additional context provided by `Onset`, `Offset`, and `Duration`
+is ignored.
 
 (factor-hed-tags-example-anchor)=
 #### Factor HED tags example
@@ -936,7 +938,7 @@ The resulting factor columns are named *correct* and *incorrect*, respectively.
     "parameters": {
         "queries": ["correct-action", "incorrect-action"],
         "query_names": ["correct", "incorrect"],
-        "remove_types": [],
+        "remove_types": ["Condition-variable", "Task"],
         "expand_context": false
     }
 }]
@@ -986,8 +988,10 @@ For additional information on how to encode experimental designs using HED, see
 |  Parameter   | Type | Description | 
 | ------------ | ---- | ----------- | 
 | *type_tag* | str | HED tag used to find the factors (most commonly *Condition-variable*).| 
-| *type_values* | list | Values to factor for the *type_tag*.<br>If empty, all values of that *type_tag* are used. |
+| *type_values* | list | (**Optional**) Values to factor for the *type_tag*.<br>If omitted, all values of that *type_tag* are used. |
 ```
+The event context (as defined by onsets, offsets and durations) is always expanded and one-hot (0's and 1's)
+encoding is used for the factors.
 
 (factor-hed-type-example-anchor)=
 #### Factor HED type example
@@ -1006,8 +1010,7 @@ applies and 0's otherwise.
     "operation": "factor_hed_type",
     "description": "Factor based on the sex of the images being presented.",
     "parameters": {
-        "type_tag": "Condition-variable",
-        "type_values": []
+        "type_tag": "Condition-variable"
     }
 }]
 ```
@@ -1047,9 +1050,9 @@ duration updated to encompass the temporal extent of the merged events.
 | ------------ | ---- | ----------- | 
 | *column_name* | str | The name of the column which is the basis of the merge.| 
 | *event_code* | str, int, float | The value in *column_name* that triggers the merge. | 
-| *match_columns* | list | Columns whose values must match to collapse events.  |
 | *set_durations* | bool | If true, set durations based on merged events. |
-| *ignore_missing* | bool | If true, missing *column_name* or *match_columns* do not raise an error. |  
+| *ignore_missing* | bool | If true, missing *column_name* or *match_columns* do not raise an error. | 
+| *match_columns* | list | (**Optional**) Columns whose values must match to collapse events.  | 
 ```
 
 The first of the group of rows (each representing an event) to be merged is called the anchor
@@ -1088,9 +1091,9 @@ have the same values to be merged into a single event.
     "parameters": {
         "column_name": "trial_type",
         "event_code": "succesful_stop",
-        "match_columns": ["stop_signal_delay", "response_hand", "sex"],
         "set_durations": true,
-        "ignore_missing": true
+        "ignore_missing": true,
+        "match_columns": ["stop_signal_delay", "response_hand", "sex"]
     }
 }]
 ```
@@ -1161,7 +1164,7 @@ Remapping can be used to convert the column containing these codes into one or m
 | *destination_columns* | list | A list of *n* names of the destination columns for the map. |
 | *map_list* | list | A list of mappings. Each element is a list of *m* source <br/>column values followed by *n* destination values.<br/> Mapping source values are treated as strings. |  
 | *ignore_missing* | bool | If false, source column values not in the map generate "n/a"<br/> destination values instead of errors. |
-| *integer_sources* | list | [**Optional**] A list of source columns that are integers.<br/> The *integer_sources* must be a subset of *source_columns*. |
+| *integer_sources* | list | (**Optional**) A list of source columns that are integers.<br/> The *integer_sources* must be a subset of *source_columns*. |
 ```
 A column cannot be both a source and a destination,
 and all source columns must be present in the data files.
@@ -1169,7 +1172,7 @@ New columns are created for destination columns that are missing from a data fil
 
 The *remap_columns* operation only works for columns containing strings or integers,
 as it is meant for remapping categorical codes.
-You must specify the which source columns contain integers so that `n/a` values
+You must specify which source columns contain integers so that `n/a` values
 can be handled appropriately.
 
 The *map_list* parameter specifies how each unique combination of values from the source 
@@ -1490,6 +1493,7 @@ The results of executing the previous *reorder_columns* transformation on the
 
 The *split_rows* operation
 is often used to convert event files from trial-level encoding to event-level encoding.
+This operation is meant only for tabular files that have `onset` and `duration` columns.
 
 In **trial-level** encoding, all the events in a single trial
 (usually some variation of the cue-stimulus-response-feedback-ready sequence)
@@ -1515,7 +1519,6 @@ In this case a trial consists of a sequence of multiple events.
 
 ```
 
-
 The *split_rows* operation requires an *anchor_column*, which could be an existing
 column or a new column to be appended to the data.
 The purpose of the *anchor_column* is to hold the codes for the new events.
@@ -1651,7 +1654,7 @@ all summaries.
 | ------------ | ---- | ----------- | 
 | *summary_name* | str | A unique name used to identify this summary.| 
 | *summary_filename* | str | A unique file basename to use for saving this summary. |  
-| *append_timecode* | bool | (Optional) If True, append a time code to filename.<br/>False is the default. |
+| *append_timecode* | bool | (**Optional**: Default false) If true, append a time code to filename. |
 ```
 
 (summarize-column-names-example-anchor)=
@@ -1730,11 +1733,11 @@ The following table lists the parameters required for using the summary.
 | ------------ | ---- | ----------- | 
 | *summary_name* | str | A unique name used to identify this summary.| 
 | *summary_filename* | str | A unique file basename to use for saving this summary. |
-| *skip_columns* | list | A list of column names to omit from the summary.| 
-| *value_columns* | list | A list of columns to omit the listing unique values. |  
-| *append_timecode* | bool | (Optional) If True, append a time code to filename.<br/>False is the default.|  
-| *max_categorical* | int | (Optional) If given, the text summary shows top *max_categorical* values.<br/>Otherwise the text summary displays all categorical values.|  
-| *values_per_line* | bool | (Optional) If given, the text summary displays this <br/>number of values per line (default is 5).|   
+| *append_timecode* | bool | (**Optional**: Default false) If true, append a time code to filename. |  
+| *max_categorical* | int | (**Optional**: Default 50) If given, the text summary shows top *max_categorical* values.<br/>Otherwise the text summary displays all categorical values.|   
+| *skip_columns* | list | (**Optional**) A list of column names to omit from the summary.| 
+| *value_columns* | list | (**Optional**) A list of columns for which the listing of unique values is omitted. |  
+| *values_per_line* | int | (**Optional**: Default 5) The number of values the text summary <br/>displays per line. |   
 
 ```
 
@@ -1866,10 +1869,11 @@ The following table lists the parameters required for using the summary.
 | ------------ | ---- | ----------- | 
 | *summary_name* | str | A unique name used to identify this summary.| 
 | *summary_filename* | str | A unique file basename to use for saving this summary. |
-| *append_timecode* | bool | (Optional) If True, append a time code to filename.<br/>False is the default.|
+| *append_timecode* | bool | (**Optional**: Default false) If true, append a time code to filename. |
 ```
 
-The *summarize_definitions* is mainly meant for verifying consistency in unknown `Def-expand` tags.  This comes up where you have an assembled dataset, but no longer have the definitions stored (or never created them to begin with).
+The *summarize_definitions* operation is mainly meant for verifying consistency in unknown `Def-expand` tags.
+This situation arises when you have an assembled dataset but no longer have the definitions stored (or never created them to begin with).
 
 
 (summarize-definitions-example-anchor)=
@@ -2029,10 +2033,10 @@ The *summarize_hed_tags* operation has the two required parameters
 | *summary_name* | str | A unique name used to identify this summary.| 
 | *summary_filename* | str | A unique file basename to use for saving this summary. |
 | *tags* | dict | Dictionary with category title keys and tags in that category as values. |  
-| *append_timecode* | bool | (Optional) If True, append a time code to filename.<br/>False is the default.|  
-| *include_context* | bool | (Optional) If true, expand the event context to <br/>account for onsets and offsets. |  
-| *replace_defs* | bool | (Optional) If true, the `Def` tags are replaced with the<br/>contents of the definition (no `Def` or `Def-expand`). |
-| *remove_types* | list | (Optional) A list of types (such as `Condition-variable` and `Task` to remove. |
+| *append_timecode* | bool | (**Optional**: Default false) If true, append a time code to filename. |  
+| *include_context* | bool | (**Optional**: Default true) If true, expand the event context to <br/>account for onsets and offsets. |  
+| *replace_defs* | bool | (**Optional**: Default true) If true, the `Def` tags are replaced with the<br/>contents of the definition (no `Def` or `Def-expand`). |
+| *remove_types* | list | (**Optional**) A list of types such as `Condition-variable` and `Task` to remove. |
 ```
 
 The *tags* dictionary has keys that specify how the user wishes the tags 
@@ -2159,7 +2163,7 @@ This summary provides useful information about experimental design.
 | *summary_name* | str | A unique name used to identify this summary.| 
 | *summary_filename* | str | A unique file basename to use for saving this summary. |
 | *type_tag* | str | Tag to produce a summary for (most often *condition-variable*).|  
-| *append_timecode* | bool | (Optional) If True, append a time code to filename.<br/>False is the default.| 
+| *append_timecode* | bool | (**Optional**: Default false) If true, append a time code to filename.| 
 ```
 In addition to the two standard parameters (*summary_name* and *summary_filename*),
 the *type_tag* parameter is required.
@@ -2251,8 +2255,8 @@ If *check_for_warnings* is false, the summary will not report warnings.
 | ------------ | ---- | ----------- | 
 | *summary_name* | str | A unique name used to identify this summary.| 
 | *summary_filename* | str | A unique file basename to use for saving this summary. |
-| *append_timecode* | bool | (Optional) If True, append a time code to filename.<br/>False is the default.|  
-| *check_for_warnings* | bool | (Optional) If true, warnings are reported in addition to errors.<br/>False is the default.|  
+| *append_timecode* | bool | (**Optional**: Default false) If true, append a time code to filename. |  
+| *check_for_warnings* | bool | (**Optional**: Default false) If true, warnings are reported in addition to errors. |  
 ```
 The *summarize_hed_validation* is a HED operation and the calling program must provide a HED schema version
 and usually a JSON sidecar containing the HED annotations.
@@ -2622,13 +2626,13 @@ since the names specified in the first parameter are meant to represent the quer
 The check only takes place if `query_names` exists, since naming is handled automatically otherwise.
 
 ```python
-    @staticmethod
-    def validate_input_data(parameters):
-        errors = []
-        if parameters.get("query_names", False):
-            if len(parameters.get("query_names")) != len(parameters.get("queries")):
-                errors.append("The list in query_names, in the factor_hed_tags operation, should have the same number of items as queries.")
-        return errors
+@staticmethod
+def validate_input_data(parameters):
+    errors = []
+    if parameters.get("query_names", False):
+        if len(parameters.get("query_names")) != len(parameters.get("queries")):
+            errors.append("The list in query_names, in the factor_hed_tags operation, should have the same number of items as queries.")
+    return errors
 ```
 
 
diff --git a/docs/source/HedMatlabTools.md b/docs/source/HedMatlabTools.md
index 01b637d..53c6b63 100644
--- a/docs/source/HedMatlabTools.md
+++ b/docs/source/HedMatlabTools.md
@@ -595,10 +595,15 @@ Python may be installed in your user space or in system space for all users.
 - You may want to add the location of the Python executable to your PATH.
   (Most installers give you that option as part of the installation.)
 
+#### Installing in a virtual environment
 
+See the MathWorks discussion of [Python virtual environments with the Python interface](https://www.mathworks.com/support/search.html/answers/1750425-python-virtual-environments-with-python-interface.html?fq%5B%5D=asset_type_name:answer&page=1).
 (step-3-connect-python-to-matlab-anchor)=
 #### Step 3: Connect Python to Matlab
 
+
+`C:\Users\username\AppData\Local\Programs\Python\python -m venv C:\Users\username\py38`
+
 Setting the Python version uses the MATLAB `pyenv` function with the `'Version'` argument
 as illustrated by the following example.