Merge main into watters branch #15

Closed
wants to merge 30 commits

Commits (30)
46ddf0a
vprobe locations
luiztauffer Dec 6, 2023
be698aa
Use set_probe and simplify interface by using BinaryRecording
alejoe91 Dec 7, 2023
7d00403
Fix WattersInterface
alejoe91 Dec 10, 2023
39e1b2f
Merge branch 'main' into vprobe-electrodes-locations
CodyCBakerPhD Dec 18, 2023
0b2b11a
fix session ID for automatic upload
CodyCBakerPhD Dec 18, 2023
867f698
Merge branch 'main' into fix_session_id
CodyCBakerPhD Dec 18, 2023
9d3d2cf
Merge branch 'main' into vprobe-electrodes-locations
CodyCBakerPhD Dec 18, 2023
3557448
Merge pull request #11 from catalystneuro/fix_session_id
CodyCBakerPhD Dec 18, 2023
c3eb205
Merge branch 'main' into vprobe-electrodes-locations
CodyCBakerPhD Dec 18, 2023
ecb63d6
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Dec 18, 2023
c030943
Nick's changes.
Dec 18, 2023
5f55560
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Dec 18, 2023
6a27f4c
Update session_id.
Dec 19, 2023
a826f30
fix circular import error
luiztauffer Dec 19, 2023
8d0700d
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Dec 19, 2023
60464d4
Merge pull request #8 from catalystneuro/vprobe-electrodes-locations
CodyCBakerPhD Dec 19, 2023
4db621d
resolve conflicts
CodyCBakerPhD Dec 19, 2023
b8a4b9f
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Dec 19, 2023
004b37f
Incorporate V-Probe coordinates.
Dec 19, 2023
fbf1ada
Merge with branch updates.
Dec 19, 2023
36c53bb
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Dec 19, 2023
1c110b2
Remove watters_convert_session.py that is no longer in use.
Dec 19, 2023
c5714fc
Merge auto-formatting changes.
Dec 19, 2023
055c715
Remove duplicate file timeseries_interfaces.py.
Dec 19, 2023
3b8fbe0
Merge pull request #13 from catalystneuro/nwatters
nwatters01 Dec 19, 2023
3245d46
Filename updates to make dandi uploading work.
Dec 19, 2023
4f2e291
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Dec 19, 2023
101a2da
Merge pull request #14 from catalystneuro/nwatters
CodyCBakerPhD Dec 19, 2023
7eb0f4f
Update timezone info and minor cleanups.
Dec 20, 2023
54b431b
Merge branch 'nwatters' into main
nwatters01 Dec 20, 2023
40 changes: 18 additions & 22 deletions README.md
@@ -40,27 +40,22 @@ Each conversion is organized in a directory of its own in the `src` directory:
└── src
├── jazayeri_lab_to_nwb
│ ├── watters
-│   ├── wattersbehaviorinterface.py
-│   ├── watters_convert_session.py
-│   ├── watters_metadata.yml
-│   ├── wattersnwbconverter.py
-│   ├── watters_requirements.txt
-│   ├── watters_notes.md
+│   ├── behavior_interface.py
+│   ├── main_convert_session.py
+│   ├── metadata.yml
+│   ├── nwb_converter.py
+│   ├── requirements.txt
+│   └── __init__.py
 │   └── another_conversion
 └── __init__.py

For example, for the conversion `watters` you can find a directory located in `src/jazayeri_lab_to_nwb/watters`. Inside each conversion directory you can find the following files:

-* `watters_convert_sesion.py`: this script defines the function to convert one full session of the conversion.
-* `watters_requirements.txt`: dependencies specific to this conversion.
-* `watters_metadata.yml`: metadata in yaml format for this specific conversion.
-* `wattersbehaviorinterface.py`: the behavior interface. Usually ad-hoc for each conversion.
-* `wattersnwbconverter.py`: the place where the `NWBConverter` class is defined.
-* `watters_notes.md`: notes and comments concerning this specific conversion.
+* `main_convert_session.py`: this script defines the function to convert one full session of the conversion.
+* `requirements.txt`: dependencies specific to this conversion.
+* `metadata.yml`: metadata in yaml format for this specific conversion.
+* `behavior_interface.py`: the behavior interface. Usually ad-hoc for each conversion.
+* `nwb_converter.py`: the place where the `NWBConverter` class is defined.

The directory might contain other files that are necessary for the conversion, but those are the central ones.

@@ -73,15 +68,16 @@ pip install -r src/jazayeri_lab_to_nwb/watters/watters_requirements.txt

You can run a specific conversion with the following command:
```
-python src/jazayeri_lab_to_nwb/watters/watters_convert_session.py
+python src/jazayeri_lab_to_nwb/watters/main_convert_session.py $SUBJECT $SESSION
```

### Watters working memory task data
-The conversion function for this experiment, `session_to_nwb`, is found in `src/watters/watters_convert_session.py`. The function takes three arguments:
-* `data_dir_path` points to the root directory for the data for a given session.
-* `output_dir_path` points to where the converted data should be saved.
+The conversion function for this experiment, `session_to_nwb`, is found in `src/watters/main_convert_session.py`. The function takes arguments:
+* `subject` subject name, either `'Perle'` or `'Elgar'`.
+* `session` session date in format `'YYYY-MM-DD'`.
 * `stub_test` indicates whether only a small portion of the data should be saved (mainly used by us for testing purposes).
-* `overwrite` indicates whether existing NWB files at the auto-generated output file paths should be overwritten.
+* `overwrite` indicates whether to overwrite NWB output files.
+* `dandiset_id` optional dandiset ID.

The function can be imported and run from a separate script, or you can run the file directly and specify the arguments in the `if __name__ == "__main__"` block at the bottom.
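The subject/session argument contract above can be checked up front before a long conversion run. A minimal illustrative sketch (the validator and its names are ours, not the repo's code):

```python
import datetime

VALID_SUBJECTS = ("Perle", "Elgar")


def validate_args(subject: str, session: str) -> None:
    """Check the subject/session arguments described above (illustrative)."""
    if subject not in VALID_SUBJECTS:
        raise ValueError(f"subject must be one of {VALID_SUBJECTS}, got {subject!r}")
    # session must be a date in 'YYYY-MM-DD' format
    datetime.date.fromisoformat(session)


validate_args("Perle", "2022-06-01")  # passes silently
```

A bad subject name or a date that is not `YYYY-MM-DD` raises `ValueError` before any data is touched.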

@@ -111,8 +107,8 @@ The function expects the raw data in `data_dir_path` to follow this structure:
└── spikeglx
...

-The conversion will try to automatically fetch metadata from the provided data directory. However, some information, such as the subject's name and age, must be specified by the user in the file `src/jazayeri_lab_to_nwb/watters/watters_metadata.yaml`. If any of the automatically fetched metadata is incorrect, it can also be overriden from this file.
+The conversion will try to automatically fetch metadata from the provided data directory. However, some information, such as the subject's name and age, must be specified by the user in the file `src/jazayeri_lab_to_nwb/watters/metadata.yml`. If any of the automatically fetched metadata is incorrect, it can also be overridden from this file.

The converted data will be saved in two files, one called `{session_id}_raw.nwb`, which contains the raw electrophysiology data from the Neuropixels and V-Probes, and one called `{session_id}_processed.nwb` with behavioral data, trial info, and sorted unit spiking.

-If you run into memory issues when writing the `{session_id}_raw.nwb` files, you may want to set `buffer_gb` to a value smaller than 1 (its default) in the `conversion_options` dicts for the recording interfaces, i.e. [here](https://github.com/catalystneuro/jazayeri-lab-to-nwb/blob/vprobe_dev/src/jazayeri_lab_to_nwb/watters/watters_convert_session.py#L49) and [here](https://github.com/catalystneuro/jazayeri-lab-to-nwb/blob/vprobe_dev/src/jazayeri_lab_to_nwb/watters/watters_convert_session.py#L71).
+If you run into memory issues when writing the `{session_id}_raw.nwb` files, you may want to set `buffer_gb` to a value smaller than 1 (its default) in the `conversion_options` dicts for the recording interfaces, i.e. [here](https://github.com/catalystneuro/jazayeri-lab-to-nwb/blob/vprobe_dev/src/jazayeri_lab_to_nwb/watters/main_convert_session.py#L189).
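As a sketch, lowering `buffer_gb` might look like the dict below. The interface names (`RecordingVP0`, `RecordingNP`) and the `iterator_opts` key are assumptions chosen to illustrate the shape of per-interface `conversion_options`; check `main_convert_session.py` for the actual keys used.

```python
# Hypothetical excerpt: shrink the write buffer for the raw recording
# interfaces so HDF5 writes use less memory (default buffer_gb is 1).
conversion_options = {
    "RecordingVP0": dict(stub_test=False, iterator_opts=dict(buffer_gb=0.5)),
    "RecordingNP": dict(stub_test=False, iterator_opts=dict(buffer_gb=0.5)),
}

# Every recording interface now requests a sub-1 GB write buffer
for name, opts in conversion_options.items():
    assert opts["iterator_opts"]["buffer_gb"] < 1.0
```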
11 changes: 6 additions & 5 deletions requirements.txt
@@ -1,5 +1,6 @@
-neuroconv==0.4.4
-spikeinterface==0.98.2
-nwbwidgets
-nwbinspector
-pre-commit
+neuroconv==0.4.7
+spikeinterface==0.99.1
+nwbwidgets==0.11.3
+nwbinspector==0.4.31
+pre-commit==3.6.0
+ndx-events==0.2.0
3 changes: 2 additions & 1 deletion setup.py
@@ -1,6 +1,7 @@
 # -*- coding: utf-8 -*-
 from pathlib import Path
-from setuptools import setup, find_packages
+
+from setuptools import find_packages, setup
 
 requirements_file_path = Path(__file__).parent / "requirements.txt"
 with open(requirements_file_path) as file:
56 changes: 56 additions & 0 deletions src/jazayeri_lab_to_nwb/watters/README.md
@@ -0,0 +1,56 @@
# Watters data conversion pipeline
NWB conversion scripts for Watters data to the [Neurodata Without Borders](https://nwb-overview.readthedocs.io/) data format.


## Usage
To run a specific conversion, you might first need to install some conversion-specific dependencies that are located in each conversion directory:
```
pip install -r src/jazayeri_lab_to_nwb/watters/requirements.txt
```

You can run a specific conversion with the following command:
```
python src/jazayeri_lab_to_nwb/watters/main_convert_session.py $SUBJECT $SESSION
```

### Watters working memory task data
The conversion function for this experiment, `session_to_nwb`, is found in `src/watters/main_convert_session.py`. The function takes arguments:
* `subject` subject name, either `'Perle'` or `'Elgar'`.
* `session` session date in format `'YYYY-MM-DD'`.
* `stub_test` indicates whether only a small portion of the data should be saved (mainly used by us for testing purposes).
* `overwrite` indicates whether to overwrite NWB output files.
* `dandiset_id` optional dandiset ID.

The function can be imported and run from a separate script, or you can run the file directly and specify the arguments in the `if __name__ == "__main__"` block at the bottom.

The function expects the raw data in `data_dir_path` to follow this structure:

data_dir_path/
├── data_open_source
│ ├── behavior
│ │ └── eye.h.times.npy, etc.
│ ├── task
│ └── trials.start_times.json, etc.
│ └── probes.metadata.json
├── raw_data
│ ├── spikeglx
│ └── */*/*.ap.bin, */*/*.lf.bin, etc.
│ ├── v_probe_0
│ └── raw_data.dat
│ └── v_probe_{n}
│ └── raw_data.dat
├── spike_sorting_raw
│ ├── np
│ ├── vp_0
│ └── vp_{n}
├── sync_pulses
├── mworks
├── open_ephys
└── spikeglx
...

The conversion will try to automatically fetch metadata from the provided data directory. However, some information, such as the subject's name and age, must be specified by the user in the file `src/jazayeri_lab_to_nwb/watters/metadata.yml`. If any of the automatically fetched metadata is incorrect, it can also be overridden from this file.

The converted data will be saved in two files, one called `{session_id}_raw.nwb`, which contains the raw electrophysiology data from the Neuropixels and V-Probes, and one called `{session_id}_processed.nwb` with behavioral data, trial info, and sorted unit spiking.

If you run into memory issues when writing the `{session_id}_raw.nwb` files, you may want to set `buffer_gb` to a value smaller than 1 (its default) in the `conversion_options` dicts for the recording interfaces, i.e. [here](https://github.com/catalystneuro/jazayeri-lab-to-nwb/blob/vprobe_dev/src/jazayeri_lab_to_nwb/watters/main_convert_session.py#L189).
4 changes: 0 additions & 4 deletions src/jazayeri_lab_to_nwb/watters/__init__.py
@@ -1,4 +0,0 @@
-from .wattersbehaviorinterface import WattersEyePositionInterface, WattersPupilSizeInterface
-from .watterstrialsinterface import WattersTrialsInterface
-from .wattersrecordinginterface import WattersDatRecordingInterface
-from .wattersnwbconverter import WattersNWBConverter
91 changes: 91 additions & 0 deletions src/jazayeri_lab_to_nwb/watters/display_interface.py
@@ -0,0 +1,91 @@
"""Class for converting data about display frames."""
import itertools
import json
from pathlib import Path
from typing import Optional

import numpy as np
import pandas as pd
from neuroconv.datainterfaces.text.timeintervalsinterface import TimeIntervalsInterface
from neuroconv.utils import FolderPathType
from pynwb import NWBFile


class DisplayInterface(TimeIntervalsInterface):
    """Class for converting data about display frames.

    All events that occur exactly once per display update are contained in this
    interface.
    """

    KEY_MAP = {
        "frame_object_positions": "object_positions",
        "frame_fixation_cross_scale": "fixation_cross_scale",
        "frame_closed_loop_gaze_position": "closed_loop_eye_position",
        "frame_task_phase": "task_phase",
        "frame_display_times": "start_time",
    }

    def __init__(self, folder_path: FolderPathType, verbose: bool = True):
        super().__init__(file_path=folder_path, verbose=verbose)

    def get_metadata(self) -> dict:
        metadata = super().get_metadata()
        metadata["TimeIntervals"] = dict(
            display=dict(
                table_name="display",
                table_description="data about each displayed frame",
            )
        )
        return metadata

    def get_timestamps(self) -> np.ndarray:
        return super(DisplayInterface, self).get_timestamps(column="start_time")

    def set_aligned_starting_time(self, aligned_starting_time: float) -> None:
        self.dataframe.start_time += aligned_starting_time

    def _read_file(self, file_path: FolderPathType):
        # Create dataframe with data for each frame
        trials = json.load(open(Path(file_path) / "trials.json", "r"))
        frames = {
            k_mapped: list(itertools.chain(*[d[k] for d in trials])) for k, k_mapped in DisplayInterface.KEY_MAP.items()
        }

        # Serialize object_positions data for hdf5 conversion to work
        frames["object_positions"] = [json.dumps(x) for x in frames["object_positions"]]

        return pd.DataFrame(frames)

    def add_to_nwbfile(self, nwbfile: NWBFile, metadata: Optional[dict] = None, tag: str = "display"):
        return super(DisplayInterface, self).add_to_nwbfile(
            nwbfile=nwbfile,
            metadata=metadata,
            tag=tag,
            column_descriptions=self.column_descriptions,
        )

    @property
    def column_descriptions(self):
        column_descriptions = {
            "object_positions": (
                "For each frame, a serialized list with one element for each "
                "object. Each element is an (x, y) position of the "
                "corresponding object, in coordinates of arena width."
            ),
            "fixation_cross_scale": (
                "For each frame, the scale of the central fixation cross. "
                "Fixation cross scale grows as the eye position deviates from "
                "the center of the fixation cross, to provide a cue to "
                "maintain good fixation."
            ),
            "closed_loop_eye_position": (
                "For each frame, the eye position in the closed-loop task "
                "engine. This was used for real-time eye position "
                "computations, such as saccade detection and reward delivery."
            ),
            "task_phase": "The phase of the task for each frame.",
            "start_time": "Time of display update for each frame.",
        }

        return column_descriptions
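The core of `_read_file` above — chain each trial's per-frame lists into flat per-frame columns, then JSON-serialize the ragged `object_positions` column so HDF5 writing works — can be exercised standalone. The trial values below are fabricated for illustration:

```python
import itertools
import json

import pandas as pd

# Subset of DisplayInterface.KEY_MAP, for brevity
KEY_MAP = {
    "frame_object_positions": "object_positions",
    "frame_display_times": "start_time",
}

# Two fabricated trials contributing 2 and 1 frames respectively
trials = [
    {"frame_object_positions": [[[0.1, 0.2]], [[0.3, 0.4]]],
     "frame_display_times": [0.0, 0.016]},
    {"frame_object_positions": [[[0.5, 0.6]]],
     "frame_display_times": [0.032]},
]

# Chain each trial's per-frame list into one flat per-frame column
frames = {
    mapped: list(itertools.chain(*[t[key] for t in trials]))
    for key, mapped in KEY_MAP.items()
}

# Nested lists are ragged; serialize so the HDF5 writer sees plain strings
frames["object_positions"] = [json.dumps(x) for x in frames["object_positions"]]

df = pd.DataFrame(frames)
print(df.shape)  # (3, 2): one row per displayed frame
```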
112 changes: 112 additions & 0 deletions src/jazayeri_lab_to_nwb/watters/get_session_paths.py
@@ -0,0 +1,112 @@
"""Function for getting paths to data on openmind."""

import collections
import pathlib

SUBJECT_NAME_TO_ID = {
    "Perle": "monkey0",
    "Elgar": "monkey1",
}

SessionPaths = collections.namedtuple(
    "SessionPaths",
    [
        "output",
        "raw_data",
        "data_open_source",
        "task_behavior_data",
        "sync_pulses",
        "spike_sorting_raw",
    ],
)


def _get_session_paths_openmind(subject, session):
    """Get paths to all components of the data on openmind."""
    subject_id = SUBJECT_NAME_TO_ID[subject]

    # Path to write output nwb files to
    output_path = f"/om/user/nwatters/nwb_data_multi_prediction/staging/sub-{subject}"

    # Path to the raw data. This is used for reading raw physiology data.
    raw_data_path = f"/om4/group/jazlab/nwatters/multi_prediction/phys_data/{subject}/" f"{session}/raw_data"

    # Path to task and behavior data.
    task_behavior_data_path = (
        "/om4/group/jazlab/nwatters/multi_prediction/datasets/data_nwb_trials/" f"{subject}/{session}"
    )

    # Path to open-source data. This is used for reading behavior and task data.
    data_open_source_path = (
        "/om4/group/jazlab/nwatters/multi_prediction/datasets/data_open_source/" f"Subjects/{subject_id}/{session}/001"
    )

    # Path to sync pulses. This is used for reading timescale transformations
    # between physiology and mworks data streams.
    sync_pulses_path = "/om4/group/jazlab/nwatters/multi_prediction/data_processed/" f"{subject}/{session}/sync_pulses"

    # Path to spike sorting. This is used for reading spike sorted data.
    spike_sorting_raw_path = (
        f"/om4/group/jazlab/nwatters/multi_prediction/phys_data/{subject}/" f"{session}/spike_sorting"
    )

    session_paths = SessionPaths(
        output=pathlib.Path(output_path),
        raw_data=pathlib.Path(raw_data_path),
        data_open_source=pathlib.Path(data_open_source_path),
        task_behavior_data=pathlib.Path(task_behavior_data_path),
        sync_pulses=pathlib.Path(sync_pulses_path),
        spike_sorting_raw=pathlib.Path(spike_sorting_raw_path),
    )

    return session_paths


def _get_session_paths_globus(subject, session):
    """Get paths to all components of the data in the globus repo."""
    subject_id = SUBJECT_NAME_TO_ID[subject]
    base_data_dir = f"/shared/catalystneuro/JazLab/{subject_id}/{session}/"

    # Path to write output nwb files to
    output_path = "~/conversion_nwb/jazayeri-lab-to-nwb"

    # Path to the raw data. This is used for reading raw physiology data.
    raw_data_path = f"{base_data_dir}/raw_data"

    # Path to task and behavior data.
    task_behavior_data_path = f"{base_data_dir}/processed_task_data"

    # Path to open-source data. This is used for reading behavior and task data.
    data_open_source_path = f"{base_data_dir}/data_open_source"

    # Path to sync pulses. This is used for reading timescale transformations
    # between physiology and mworks data streams.
    sync_pulses_path = f"{base_data_dir}/sync_pulses"

    # Path to spike sorting. This is used for reading spike sorted data.
    spike_sorting_raw_path = f"{base_data_dir}/spike_sorting"

    session_paths = SessionPaths(
        output=pathlib.Path(output_path),
        raw_data=pathlib.Path(raw_data_path),
        data_open_source=pathlib.Path(data_open_source_path),
        task_behavior_data=pathlib.Path(task_behavior_data_path),
        sync_pulses=pathlib.Path(sync_pulses_path),
        spike_sorting_raw=pathlib.Path(spike_sorting_raw_path),
    )

    return session_paths


def get_session_paths(subject, session, repo="openmind"):
    """Get paths to all components of the data.

    Returns:
        SessionPaths namedtuple.
    """
    if repo == "openmind":
        return _get_session_paths_openmind(subject=subject, session=session)
    elif repo == "globus":
        return _get_session_paths_globus(subject=subject, session=session)
    else:
        raise ValueError(f"Invalid repo {repo}")
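Because `SessionPaths` is a namedtuple, callers can iterate its fields generically, e.g. to verify that all input directories exist before starting a conversion. A minimal sketch with placeholder paths (the field names match the source above; the existence check itself is ours):

```python
import collections
import pathlib

SessionPaths = collections.namedtuple(
    "SessionPaths",
    [
        "output",
        "raw_data",
        "data_open_source",
        "task_behavior_data",
        "sync_pulses",
        "spike_sorting_raw",
    ],
)

base = pathlib.Path("/tmp/demo_session")  # placeholder root, not a real layout
paths = SessionPaths(
    output=base / "nwb",
    raw_data=base / "raw_data",
    data_open_source=base / "data_open_source",
    task_behavior_data=base / "task",
    sync_pulses=base / "sync_pulses",
    spike_sorting_raw=base / "spike_sorting",
)

# _asdict() exposes {field: path}, convenient for logging or validation;
# 'output' is excluded because it is created by the conversion itself
missing = [
    name for name, p in paths._asdict().items()
    if name != "output" and not p.exists()
]
print(missing)  # every input is missing under this placeholder root
```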