From d7f4edb11ae4cddc157e8ddf7c65011f84fa60cf Mon Sep 17 00:00:00 2001 From: rly Date: Mon, 10 Jun 2024 00:39:55 -0700 Subject: [PATCH] Suggest improvements to tutorial text --- docs/tutorials/dataset.rst | 12 ++-- docs/tutorials/multiple_sessions.rst | 37 ++++++----- docs/tutorials/single_session.rst | 92 ++++++++++++++++------------ 3 files changed, 81 insertions(+), 60 deletions(-) diff --git a/docs/tutorials/dataset.rst b/docs/tutorials/dataset.rst index b3a42a6a6c..091f226844 100644 --- a/docs/tutorials/dataset.rst +++ b/docs/tutorials/dataset.rst @@ -1,8 +1,8 @@ Example Dataset Generation ========================== -Our tutorials focus on converting extracellular electrophysiology data in the SpikeGLX and Phy formats. -To get you started as quickly as possible, we’ve created a way to generate this Neuropixel-like dataset at the click of a button! +The NWB GUIDE tutorials focus on converting extracellular electrophysiology data in the SpikeGLX and Phy formats. +To get started as quickly as possible, you can use NWB GUIDE to generate a Neuropixels-like dataset at the click of a button! .. note:: The **SpikeGLX** data format stores electrophysiology recordings. @@ -17,7 +17,9 @@ Navigate to the **Settings** page using the button at the bottom of the main sid Press the Generate button on the Settings page to create the dataset. -The generated data will populate in the ``~/NWB_GUIDE/test_data`` directory, where ``~`` is the home directory of your system. This includes both a ``single_session_data`` and ``multi_session_dataset`` folder to accompany the following tutorials. +The dataset will be generated in a new ``~/NWB_GUIDE/test_data`` directory, where ``~`` is the `home directory `_ of your system. This includes both a ``single_session_data`` and ``multi_session_dataset`` folder to use in the following tutorials. + +The folder structure of the generated dataset is as follows: .. 
code-block:: bash @@ -52,6 +54,4 @@ The generated data will populate in the ``~/NWB_GUIDE/test_data`` directory, whe │ │ └── mouse2_Session2/ │ │ ... - - -Now you’re ready to start your first conversion using the NWB GUIDE! +Now you're ready to start your first conversion using the NWB GUIDE! diff --git a/docs/tutorials/multiple_sessions.rst b/docs/tutorials/multiple_sessions.rst index 5d798c49d1..511454aa7c 100644 --- a/docs/tutorials/multiple_sessions.rst +++ b/docs/tutorials/multiple_sessions.rst @@ -1,7 +1,7 @@ Managing Multiple Sessions ========================== -Now, let’s say that you’ve already run some of your experiments and now you want to convert them all at the same time. This is where a multi-session workflow will come in handy. +Now, let's imagine that you've already run multiple sessions of an experiment and now you want to convert them all to NWB at the same time. This is where a multi-session workflow will be useful. Begin a new conversion on the **Convert** page and provide a name for your pipeline. @@ -12,19 +12,23 @@ Update the Workflow page to indicate that you'll: #. Run on multiple sessions #. Locate the source data programmatically -#. Specify your dataset location ``~/NWB_GUIDE/test-data/multi_session_dataset``, where **~** is the home directory of your system. +#. Specify your dataset location ``~/NWB_GUIDE/test-data/multi_session_dataset``, where ``~`` is the home directory of your system. #. Skip dataset publication. +Leave the rest of the settings as they are. + .. figure:: ../assets/tutorials/multiple/workflow-page.png :align: center :alt: Workflow page with multiple sessions and locate data selected Data Formats ------------ + As before, specify **SpikeGLX Recording** and **Phy Sorting** as the data formats for this conversion. Locate Data ----------- + This page helps you automatically identify source data for multiple subjects / sessions as long as your files are organized consistently. .. 
figure:: ../assets/tutorials/multiple/pathexpansion-page.png @@ -34,33 +38,33 @@ This page helps you automatically identify source data for multiple subjects / s File locations are specified as **format strings** that define source data paths of each selected data format. .. note:: - Format strings are one component of NeuroConv's **path expansion language**, which has some nifty features for manually specifying complex paths. Complete documentation of the path expansion feature of NeuroConv can be found :path-expansion-guide:`here <>`. + Format strings are one component of NeuroConv's **path expansion language**, which has nifty features for manually specifying complex paths. Complete documentation of the path expansion feature can be found :path-expansion-guide:`here <>`. -While you don’t have to specify format strings for all of the pipeline’s data formats, we’re going to find all of our data here for this tutorial. You'll always be able to confirm or manually select the final paths on the Source Data page later in the workflow. +While you don't have to specify format strings for all of the pipeline's data formats, we're going to find all of our data here for this tutorial. You'll always be able to confirm or manually select the final paths on the Source Data page later in the workflow. Format strings are specified using two components: the **base directory**, which is the directory to search in, and the **format string path**, where the source data is within that directory. The base directory has been pre-populated based on your selection on the Workflow page. -To avoid specifying the format string path by hand, we can take advantage of **Autocomplete**. Click the **Autocomplete** button to open a pop-up form that will derive the format string from a single example path. +To avoid specifying the format string path by hand, click the **Autocomplete** button to open a pop-up form that will derive the format string from a single example path. .. 
figure:: ../assets/tutorials/multiple/pathexpansion-autocomplete-open.png :align: center :alt: Autocomplete modal on path expansion page -Provide an example source data path (for example, the ``multi_session_dataset/mouse1/mouse1_Session2/mouse1_Session2_phy`` file for Phy), followed by the Subject (``mouse1``) and Session ID (``Session1``) for this particular path. +Provide a source data path for Phy by either dragging and dropping the folder ``multi_session_dataset/mouse1/mouse1_Session2/mouse1_Session2_phy`` into the **Example Folder** box or clicking the box and selecting a folder. Then enter the Subject ID (``mouse1``) and Session ID (``Session1``) for this particular path. .. figure:: ../assets/tutorials/multiple/pathexpansion-autocomplete-filled.png :align: center :alt: Autocomplete modal completed -When you submit this form, you’ll notice that the Format String Path input has been auto-filled with a pattern for all the sessions. +When you submit this form, you'll notice that the Format String Path input has been auto-filled with a pattern for all of the sessions, and a list of all of the source data found is shown in the gray box. Confirm that this list contains all four Phy folders. .. figure:: ../assets/tutorials/multiple/pathexpansion-autocomplete-submitted.png :align: center :alt: Path expansion page with autocompleted format string -Repeat this process for SpikeGLX, where ``multi_session_dataset/mouse1/mouse1_Session2/mouse1_Session2_g0/mouse1_Session2_g0_imec0/mouse1_Session1_g0_t0.imec0.lf.bin`` will be the example source data path. +Repeat this process for SpikeGLX, where ``multi_session_dataset/mouse1/mouse1_Session2/mouse1_Session2_g0/mouse1_Session2_g0_imec0/mouse1_Session1_g0_t0.imec0.ap.bin`` will be the example source data path. .. 
figure:: ../assets/tutorials/multiple/pathexpansion-completed.png :align: center @@ -70,15 +74,16 @@ Advance to the next page when you have entered the data locations for both forma Subject Metadata ---------------- -On this page you’ll edit subject-level metadata across all related sessions. Unlike the previous few pages, you’ll notice that -Sex and Species both have gray asterisks next to their name; this means they are **loose requirements**, which aren’t currently required + +On this page, you can edit subject-level metadata that is the same for all sessions. Unlike the previous few pages, you'll notice that +Sex and Species both have gray asterisks next to their name; this means they are **loose requirements**, which aren't currently required but could later block progress if left unspecified. .. figure:: ../assets/tutorials/multiple/subject-page.png :align: center :alt: Blank subject table -In this case, we have two subjects with two sessions each. Let’s say that each of their sessions happened close enough in time that they can be identified using the same **age** entry: ``P29W`` for ``mouse1`` and ``P30W`` for ``mouse2``. +In this case, we have two subjects with two sessions each. Let's say that each of their sessions happened close enough in time that they can be identified using the same **age** entry: ``P29W`` for ``mouse1`` and ``P30W`` for ``mouse2``. We should also indicate the ``sex`` of each subject since this is a requirement for `uploading to the DANDI Archive `_. @@ -90,16 +95,18 @@ Advance to the next page when you have entered subject metadata for all subjects Source Data Information ----------------------- -Because we used the Locate Data page to programmatically identify our source data, this page should mostly be complete. You can use this opportunity to verify that the identified paths appear as expected for each session. 
+ +Because we used the Locate Data page to programmatically identify our source data, this page should mostly be complete. Verify that the identified paths appear as expected for each session by clicking the "Phy Sorting" header to expand the section for Phy data and examining the "Folder Path" value. Do the same for the SpikeGLX data. .. figure:: ../assets/tutorials/multiple/sourcedata-page.png :align: center :alt: Complete source data forms -One notable difference between this and the single-session workflow, however, is that the next few pages will allow you to toggle between sessions using the **session manager** sidebar on the left. +One notable difference between this and the single-session workflow is that the next few pages will allow you to toggle between sessions using the **session manager** sidebar on the left. Try this out. Under "Sessions", click "sub-mouse2" and "ses-Session1" to locate the source data for a different session from this subject. Session Metadata ---------------- + Aside from the session manager, the file metadata page in the multi-session workflow is nearly identical to the single-session version. .. figure:: ../assets/tutorials/multiple/metadata-nwbfile.png @@ -108,7 +115,7 @@ Aside from the session manager, the file metadata page in the multi-session work A complete General Metadata form -Acting as default metadata, the information supplied on the subject metadata page has pre-filled the Subject metadata for each session. +The information supplied on the Subject Metadata page has been used to fill in the Subject metadata for each session. .. figure:: ../assets/tutorials/multiple/metadata-subject-complete.png :align: center @@ -116,7 +123,7 @@ Acting as default metadata, the information supplied on the subject metadata pag A complete Subject metadata form -You'll notice that there's an **Edit Default Metadata** button at the top of the page. 
This feature allows you to specify a single default value for each property that is expected to be the same across all sessions. **Use this button to fill in general metadata for your sessions**, which will save you time and effort while ensuring your files still follow Best Practices. +You'll notice that there's an **Edit Default Metadata** button at the top of the page. This feature allows you to specify a single default value for each property that is expected to be the same across all sessions. **Use this button to fill in general metadata for your sessions**, such as the Institution, which will save you time and effort while ensuring your files still follow Best Practices. Finish the rest of the workflow as you would for a single session by completing a full conversion after you review the preview files with the NWB Inspector and Neurosift. diff --git a/docs/tutorials/single_session.rst b/docs/tutorials/single_session.rst index c1931803b3..06f69e9ece 100644 --- a/docs/tutorials/single_session.rst +++ b/docs/tutorials/single_session.rst @@ -1,9 +1,9 @@ Converting a Single Session =========================== -Let's imagine you've just completed an experimental session and you’d like to convert your data to NWB right away. +Let's imagine you've just completed an experimental session and you'd like to convert your data to NWB right away. -Upon launching the GUIDE, you'll begin on the **Convert** page. If you’re opening the application for the first time, there should be no pipelines listed on this page. +Upon launching the GUIDE, you'll begin on the **Convert** page. If you're opening the application for the first time, there should be no pipelines listed on this page. .. figure:: ../assets/tutorials/home-page.png :align: center @@ -17,49 +17,53 @@ Project Structure Project Setup ^^^^^^^^^^^^^ -The Project Setup page will have you define two pieces of information about your pipeline: the **name** and, optionally, the **output location** for your NWB files. 
We will not be specifying an output location in this tutorial—so your NWB files will be saved to the default location. +To begin, set the **name** for this tutorial pipeline to "Single Session Workflow". The red asterisk next to the "Name" field indicates that this is a required field. -You’ll notice that the name property has a red asterisk next to it, which identifies it as a required property. +You can also set the **output location** for your NWB files, but for this tutorial, leave it as the default location. -.. figure:: ../assets/tutorials/single/info-page.png +.. figure:: ../assets/tutorials/single/valid-name.png :align: center - :alt: Project Setup page with no name (invalid) + :alt: Project Setup page with valid name +Press "Next" to continue. -After specifying a unique project name, the colored background and error message will disappear, allowing you to advance to the next page. +Pipeline Workflow +^^^^^^^^^^^^^^^^^ -.. figure:: ../assets/tutorials/single/valid-name.png - :align: center - :alt: Project Setup page with valid name +On this page, you'll specify the type of **workflow** you'd like to follow for this conversion pipeline. -Workflow Configuration -^^^^^^^^^^^^^^^^^^^^^^ -On this page, you’ll specify the type of **workflow** you’d like to follow for this conversion pipeline. +First, you can set the time zone for the data. If the data was collected in a different time zone than your local time zone, you can search for and select that time zone. For this tutorial, leave it as the default time zone. -Since this is a single-session workflow, you’ll need to specify a **Subject ID** and **Session ID** to identify the data you’ll be converting. +For the next question "Will this pipeline be run on multiple sessions?", keep the "No" button selected. + +For a single-session workflow, you'll need to specify a **Subject ID** and **Session ID** to identify the data you'll be converting. Enter "sub1" for the Subject ID and "ses1" for the Session ID. .. 
figure:: ../assets/tutorials/single/workflow-page.png :align: center :alt: Workflow page -Additionally, we’ll turn off the option to upload to the DANDI Archive and approach this in a later tutorial. +For this tutorial, turn off the option to publish the data to the DANDI Archive. This will be covered in a later tutorial. + +For the last question "Would you like to customize low-level data storage options?", keep the "No" button selected. + +Press "Next" to continue. Data Formats ^^^^^^^^^^^^ -Next, you’ll specify the data formats you’re working with on the Data Formats page. The GUIDE supports 40+ total neurophysiology formats. A full registry of available formats is available :doc:`here `. + +Next, you'll specify the data formats you're working with. The GUIDE supports 40+ neurophysiology formats. A full registry of available formats is available :doc:`here `. .. figure:: ../assets/tutorials/single/formats-page.png :align: center - :alt: Date Formats page + :alt: Data Formats page -The tutorial we're working with uses the SpikeGLX and Phy formats, a common output for Neuropixels recordings and subsequent spike sorting. To specify that your pipeline will handle these files, you’ll press the “Add Format” button. +This tutorial uses the SpikeGLX format, a common output for Neuropixels recordings, and the Phy format, a common output of spike sorting and curation. To specify that your pipeline will handle these files, press the “Add Format” button. .. figure:: ../assets/tutorials/single/format-options.png :align: center :alt: Format pop-up on the Data Formats page -Then, select the relevant formats—in this case, **SpikeGLX Recording** and **Phy Sorting**—from the pop-up list. Use the search bar to filter for the format you need. - +Then, select the relevant formats—in this case, **SpikeGLX Recording** (not SpikeGLX Converter) and **Phy Sorting**—from the pop-up list. Use the search bar to filter for the format you need. .. 
figure:: ../assets/tutorials/single/search-behavior.png :align: center @@ -67,12 +71,11 @@ Then, select the relevant formats—in this case, **SpikeGLX Recording** and **P The selected formats will then display above the button. - .. figure:: ../assets/tutorials/single/interface-added.png :align: center :alt: Data Formats page with SpikeGLX Recording added to the list -Advance to the next page when you have **SpikeGLX Recording** and **Phy Sorting** selected. +Press "Next" after you have selected **SpikeGLX Recording** and **Phy Sorting**. .. figure:: ../assets/tutorials/single/all-interfaces-added.png :align: center @@ -83,31 +86,39 @@ Data Entry Source Data Information ^^^^^^^^^^^^^^^^^^^^^^^ + On this page, specify the **phy** folder and **.ap.bin** (SpikeGLX) file so that the GUIDE can find this source data to complete the conversion. -As discussed in the :doc:`Dataset Generation ` tutorial, these can be found in the ``~/NWB_GUIDE/test-data/single_session_data`` directory, where **~** is the home directory of your system. +As discussed in the :doc:`Dataset Generation ` tutorial, these can be found in the ``~/NWB_GUIDE/test-data/single_session_data`` directory, where ``~`` is the home directory of your system. If you just generated the dataset, this folder may still be open in your file navigator. -Within each data format accordion, you'll find a file selector that will accept relevant source data. You can either click this to navigate to your files or drag-and-drop into the GUIDE from your file navigator. +Click the **Phy Sorting** header to expand the section. Under "Folder Path", you can either drag-and-drop the **phy** folder into the box from your file navigator or click the box to navigate to and select the **phy** folder. .. figure:: ../assets/tutorials/single/sourcedata-page-specified.png :align: center :alt: Source Data page with source locations specified -Advance to the next page to extract metadata from the source data. 
+Next, click the **SpikeGLX Recording** header to expand the section. Under "File Path", you can either click the box to navigate to the **.ap.bin** file or drag-and-drop the **.ap.bin** file into the box from your file navigator. The **.ap.bin** file is located in the ``~/NWB_GUIDE/test-data/single_session_data/spikeglx/Session1_g0/Session1_g0_imec0`` folder. + +Press "Next" to extract metadata from these source data files and folders. Session Metadata ^^^^^^^^^^^^^^^^ -The file metadata page is a great opportunity to add rich annotations to the file, which will be read by anyone reusing your data in the future! -The Session Start Time in the **General Metadata** section is already specified because this field was automatically extracted from the SpikeGLX source data. +The file metadata page is a great opportunity to add rich annotations to the NWB file, which will be read by anyone reusing your data in the future! + +Click the **General Metadata** header to expand the section. + +The Session Start Time is already specified because this field was automatically extracted from the SpikeGLX source data. .. figure:: ../assets/tutorials/single/metadata-nwbfile.png :align: center :alt: Metadata page with invalid Subject information -While the **General Metadata** section is complete, take some time to fill out additional information such as the **Institutional Info** box and the **Experimenter** field. +The **General Metadata** header is underlined yellow because all required fields have been set, but some recommended fields are missing values, such as **Institution** and **Experiment Description**. These fields are not required, but they can be useful for future users of the data. + +Take a minute to fill out some of these fields, such as the fields in the **Institutional Info** box and the **Experimenter** field. -We also need to add the **Subject** information—as noted by the red accents around that item. 
Let’s say that our subject is a male mouse with an age of P25W, which represents 25 weeks old. +The **Subject** header is underlined red, indicating that required fields are missing values. Click the **Subject** header to expand the section. The subject's **sex**, **species**, and **age** are missing. Select "Male" for **sex**, "Mus musculus - House mouse" for **species**, and "P25W", which represents 25 weeks old, for **age**. .. figure:: ../assets/tutorials/single/metadata-subject-complete.png :align: center @@ -115,15 +126,13 @@ We also need to add the **Subject** information—as noted by the red accents ar The status of the Subject information will update in real-time as you fill out the form. - -This dataset will also have **Ecephys** metadata extracted from the SpikeGLX source data, though we aren't interested in modifying this information at the moment. +Click the **Ecephys** header to expand the section. Ecephys is shorthand for "extracellular electrophysiology". This section contains metadata about the probes and electrodes used. For the test SpikeGLX data, these metadata have been extracted from the SpikeGLX source data. You do not need to modify them in this tutorial. .. figure:: ../assets/tutorials/single/metadata-ecephys.png :align: center :alt: Ecephys metadata extracted from the SpikeGLX source data - -Let's leave this as-is and advance to the next page. This will trigger the conversion of your source data into a preview NWB file. +Press "Next" to trigger the conversion of a small part of your source data into a preview NWB file. File Conversion --------------- @@ -131,19 +140,22 @@ File Conversion Inspector Report ^^^^^^^^^^^^^^^^ -The Inspector Report page allows you to validate the preview file against the latest Best Practices and make suggestions to improve the content or representations. +This page shows the output of the NWB Inspector tool, which validated your preview NWB file against the latest NWB Best Practices. 
Red boxes represent errors, and yellow boxes represent best practice warnings that could be ignored. .. figure:: ../assets/tutorials/single/inspect-page.png :align: center :alt: NWB Inspector report -Advance to the next page when you are satisfied with the Inspector Report. +When you are satisfied with the Inspector Report, press "Next". Conversion Preview ^^^^^^^^^^^^^^^^^^ -On the Conversion Preview, Neurosift allows you to explore the structure of the NWB file and ensure the packaged data matches your expectations. -In particular, take a look at the lefthand metadata table and check that the information provided on the previous pages is present in the NWB file. +This page uses the Neurosift tool to allow you to explore the structure of your NWB file so that you can ensure the packaged data matches your expectations. + +In particular, take a look at the lefthand metadata table and check that the information you provided on the previous pages is present in the NWB file. + +Expand the yellow "acquisition" section and select "ElectricalSeriesAP" to view a plot of the test SpikeGLX data. .. figure:: ../assets/tutorials/single/preview-page.png :align: center @@ -151,12 +163,14 @@ In particular, take a look at the lefthand metadata table and check that the inf Neurosift can be useful for many other exploration tasks—but this will not be covered in this tutorial. -Advancing from this page will trigger the full conversion of your data to the NWB format, a process that may take some time depending on the dataset size. +The NWB file shown here is just a preview NWB file that was created using only a small part of the source data. Press "Run Conversion" to trigger the full conversion of your data to the NWB format. This conversion may take some time depending on the dataset size. Conversion Review ^^^^^^^^^^^^^^^^^ -Congratulations on finishing your first conversion of neurophysiology files using the NWB GUIDE! 
+Congratulations on finishing your first conversion of neurophysiology data to NWB using the NWB GUIDE! Click the file name ``sub-sub1_ses-ses1.nwb`` to view the location of the NWB file in your file navigator. + +If you had other data to add to the NWB file that are in formats not supported by NWB GUIDE, you can use PyNWB (Python) or MatNWB (MATLAB) to open the NWB file and add the data programmatically. See the documentation links at the bottom of the "Conversion Review" page for tutorials and more information. .. figure:: ../assets/tutorials/single/conversion-results-page.png :align: center