Aída's conversion directory #19

Draft
wants to merge 32 commits into
base: main
32 commits
13d08a5
Created new folder and update get_session_paths
aidapiccato Dec 19, 2023
981faf5
Everything up to line 280 in main_convert_session works. Need to gene…
aidapiccato Dec 19, 2023
c280948
reading in display and timeseries interface data
aidapiccato Jan 2, 2024
fd88d6d
Display interface is now functional
aidapiccato Jan 2, 2024
daebee0
Entire conversion pipeline seems to run!
aidapiccato Jan 3, 2024
258d7dd
Remove electrical series from processed data
aidapiccato Jan 3, 2024
1576758
Delete src/jazayeri_lab_to_nwb/piccato/logs directory
aidapiccato Jan 3, 2024
385ea8e
Upload to DANDI; adding to .gitignore; reformatting
aidapiccato Jan 3, 2024
6ea3956
Merge branch 'piccato' of github.com:catalystneuro/jazayeri-lab-to-nw…
aidapiccato Jan 3, 2024
4613fb3
Small formatting changes
aidapiccato Jan 3, 2024
8d5cee0
undoing accidental changes to watters directory
aidapiccato Jan 3, 2024
e435ead
Added stimulus set as a field in trials_interface; small changes when…
aidapiccato Jan 4, 2024
514751a
Included dandiset directory when writing to staging, removed automati…
aidapiccato Jan 7, 2024
4114dc7
Updated readme to reflect directory structure; otherwise mostly forma…
aidapiccato Jan 8, 2024
c3cee5b
Small changes to conversion
aidapiccato Jan 10, 2024
6fa2466
Tiny formatting changes to prepare for merge
aidapiccato Jan 10, 2024
e08dbb3
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jan 10, 2024
d3cfadc
made session_id a string
aidapiccato Jan 16, 2024
dfd9da6
Added option to exclude non-trial-structured data from file (to see i…
aidapiccato Jan 18, 2024
1daa703
Final small refactoring changes
aidapiccato Jan 22, 2024
bf8c061
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jan 22, 2024
05974c5
Updated neuroconv requirement
aidapiccato Jan 22, 2024
4a1b624
small
aidapiccato Jan 30, 2024
a422e07
Merge branch 'piccato' of github.com:catalystneuro/jazayeri-lab-to-nw…
aidapiccato Feb 6, 2024
f336f1f
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Feb 6, 2024
3c4be41
Merge branch 'main' into piccato
CodyCBakerPhD Feb 16, 2024
502e988
split up conversion
aidapiccato Feb 26, 2024
101abe6
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Feb 26, 2024
ffea6ab
added postprocessing data
aidapiccato Feb 26, 2024
0c4ebbf
editing timestamp
aidapiccato Mar 15, 2024
2064c95
merge conflicts
aidapiccato Mar 15, 2024
b23703b
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Mar 15, 2024
2 changes: 2 additions & 0 deletions .gitignore
@@ -147,3 +147,5 @@ dmypy.json

# NWB files
**.nwb

*.out
50 changes: 50 additions & 0 deletions src/jazayeri_lab_to_nwb/piccato/README.md
@@ -0,0 +1,50 @@
# Piccato data conversion pipeline
NWB conversion scripts for Piccato data to the [Neurodata Without Borders](https://nwb-overview.readthedocs.io/) data format.


## Usage
To run a specific conversion, you may first need to install conversion-specific dependencies, which are located in each conversion directory:

```
pip install -r src/jazayeri_lab_to_nwb/piccato/requirements.txt
```

You can run a specific conversion with the following command:
```
python src/jazayeri_lab_to_nwb/piccato/main_convert_session.py $SUBJECT $SESSION
```

### Piccato working and long-term memory task data
The conversion function for this experiment, `session_to_nwb`, is found in `src/jazayeri_lab_to_nwb/piccato/main_convert_session.py`. The function takes the following arguments:
* `subject` subject name (currently only `'elgar'`).
* `session` session date in format `'YYYY-MM-DD'`.
* `stub_test` indicates whether only a small portion of the data should be saved (mainly used by us for testing purposes).
* `overwrite` indicates whether to overwrite nwb output files.

The function can be imported and run from a separate script, or you can run the file directly and specify the arguments in the `if __name__ == "__main__"` block at the bottom.
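
Before calling `session_to_nwb` from your own script, it can help to check the two positional arguments described above. The following is a minimal sketch: `validate_session_args` and `KNOWN_SUBJECTS` are hypothetical names, not part of this repository, and the only assumptions are the subject name and date format listed in this README.

```python
from datetime import datetime
from typing import Tuple

# Assumption: only the subject named in this README is supported.
KNOWN_SUBJECTS = {"elgar"}


def validate_session_args(subject: str, session: str) -> Tuple[str, str]:
    """Check the subject name and the 'YYYY-MM-DD' session date.

    Hypothetical helper, not part of the conversion code.
    """
    if subject not in KNOWN_SUBJECTS:
        raise ValueError(f"Unknown subject {subject!r}")
    # strptime raises ValueError if the string is not a date in this format.
    parsed = datetime.strptime(session, "%Y-%m-%d")
    # Re-format so the returned string is always zero-padded.
    return subject, parsed.strftime("%Y-%m-%d")
```

The validated values could then be passed on as `session_to_nwb(subject=..., session=...)`.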

The function expects the raw data in `data_dir_path` to follow this structure:
```
data_dir_path/
├── behavior_task
│ ├── eye.h.json, eye.v.json, etc.
│ ├── trials.json
├── raw_data
│ ├── behavior
│ └── mworks
│ └── moog
│ ├── spikeglx
│ └── */*/*.ap.bin, */*/*.lf.bin, etc.
├── spike_sorting
│ ├── spikeglx
│ └── kilosort2_5_0
├── phys_metadata.json
├── sync_signals
└── spikeglx
└── transform
```
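The layout above can be sanity-checked before starting a long conversion. This is only a sketch: `missing_entries` is a hypothetical helper, and it checks just four top-level directories that appear in the tree, not the nested files.

```python
from pathlib import Path
from typing import List

# Top-level directories taken from the tree above. Nested contents are not
# checked in this sketch.
EXPECTED_ENTRIES = (
    "behavior_task",
    "raw_data",
    "spike_sorting",
    "sync_signals",
)


def missing_entries(data_dir_path: Path) -> List[str]:
    """Return expected top-level entries that are absent (hypothetical helper)."""
    return [
        name
        for name in EXPECTED_ENTRIES
        if not (Path(data_dir_path) / name).exists()
    ]
```

An empty list means all four directories were found; anything else names what is missing.
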
The conversion will try to automatically fetch metadata from the provided data directory. However, some information, such as the subject's name and age, must be specified by the user in the file `src/jazayeri_lab_to_nwb/piccato/metadata.yaml`. If any of the automatically fetched metadata is incorrect, it can also be overridden from this file.

The converted data will be saved in two files, one called `{session_id}_ecephys.nwb`, which contains the raw electrophysiology data from the Neuropixels and V-Probes, and one called `{session_id}_behavior+ecephys.nwb` with behavioral data, trial info, and sorted unit spiking.
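
As an illustration of the naming scheme, the two output paths could be built as follows. `output_nwb_paths` is a hypothetical helper and the session ID in the usage note is a placeholder; only the two file-name patterns come from this README.

```python
from pathlib import Path
from typing import Tuple


def output_nwb_paths(output_dir: Path, session_id: str) -> Tuple[Path, Path]:
    """Return (raw ecephys file, behavior+ecephys file) for one session.

    Hypothetical helper that mirrors the file names described above.
    """
    output_dir = Path(output_dir)
    raw_path = output_dir / f"{session_id}_ecephys.nwb"
    processed_path = output_dir / f"{session_id}_behavior+ecephys.nwb"
    return raw_path, processed_path
```

For example, `output_nwb_paths(Path("out"), "my_session")` gives `out/my_session_ecephys.nwb` and `out/my_session_behavior+ecephys.nwb`.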

If you run into memory issues when writing the `{session_id}_ecephys.nwb` files, you may want to set `buffer_gb` to a value smaller than 1 (its default) in the `conversion_options` dicts for the recording interfaces, i.e. [here](https://github.com/catalystneuro/jazayeri-lab-to-nwb/blob/vprobe_dev/src/jazayeri_lab_to_nwb/watters/main_convert_session.py#L189).
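
A sketch of what such a `conversion_options` dict might look like is shown below. The interface keys are hypothetical placeholders, and passing `buffer_gb` through an `iterator_opts` entry is an assumption about how the recording interfaces receive it; check the linked script for the keys and layout actually used.

```python
from typing import Dict, Iterable


def make_conversion_options(
    recording_keys: Iterable[str],
    buffer_gb: float = 0.5,
    stub_test: bool = False,
) -> Dict[str, dict]:
    """Build per-interface options with a reduced write buffer.

    Hypothetical helper. The "iterator_opts" layout is an assumption.
    """
    if buffer_gb <= 0:
        raise ValueError("buffer_gb must be positive")
    return {
        key: {
            "stub_test": stub_test,
            # Smaller buffers lower peak memory at the cost of more writes.
            "iterator_opts": {"buffer_gb": buffer_gb},
        }
        for key in recording_keys
    }


# Placeholder interface names; use the keys defined in your converter.
conversion_options = make_conversion_options(["RecordingA", "RecordingB"])
```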
Empty file.
86 changes: 86 additions & 0 deletions src/jazayeri_lab_to_nwb/piccato/display_interface.py
@@ -0,0 +1,86 @@
"""Class for converting data about display frames."""

import itertools
import json
from pathlib import Path
from typing import Optional

import numpy as np
import pandas as pd
from neuroconv.datainterfaces.text.timeintervalsinterface import (
    TimeIntervalsInterface,
)
from neuroconv.utils import FolderPathType
from pynwb import NWBFile


class DisplayInterface(TimeIntervalsInterface):
    """Class for converting data about display frames.

    All events that occur exactly once per display update are contained in this
    interface.
    """

    # Maps per-trial keys in trials.json to column names in the NWB table.
    KEY_MAP = {
        "frame_closed_loop_gaze_position": "closed_loop_eye_position",
        "frame_task_phase": "task_phase",
        "frame_display_times": "start_time",
    }

    def __init__(self, folder_path: FolderPathType, verbose: bool = True):
        super().__init__(file_path=folder_path, verbose=verbose)

    def get_metadata(self) -> dict:
        metadata = super().get_metadata()
        metadata["TimeIntervals"] = dict(
            display=dict(
                table_name="display",
                table_description="data about each displayed frame",
            )
        )
        return metadata

    def get_timestamps(self) -> np.ndarray:
        return super(DisplayInterface, self).get_timestamps(
            column="start_time"
        )

    def set_aligned_starting_time(self, aligned_starting_time: float) -> None:
        self.dataframe.start_time += aligned_starting_time

    def _read_file(self, file_path: FolderPathType):
        # Create dataframe with data for each frame
        with open(Path(file_path) / "trials.json", "r") as f:
            trials = json.load(f)
        # Concatenate the per-trial frame lists into one list per column.
        frames = {
            k_mapped: list(itertools.chain(*[d[k] for d in trials]))
            for k, k_mapped in DisplayInterface.KEY_MAP.items()
        }

        return pd.DataFrame(frames)

    def add_to_nwbfile(
        self,
        nwbfile: NWBFile,
        metadata: Optional[dict] = None,
        tag: str = "display",
    ):
        return super(DisplayInterface, self).add_to_nwbfile(
            nwbfile=nwbfile,
            metadata=metadata,
            tag=tag,
            column_descriptions=self.column_descriptions,
        )

    @property
    def column_descriptions(self):
        column_descriptions = {
            "closed_loop_eye_position": (
                "For each frame, the eye position in the closed-loop task "
                "engine. This was used for real-time eye position "
                "computations, such as saccade detection and reward delivery."
            ),
            "task_phase": "The phase of the task for each frame.",
            "start_time": "Time of display update for each frame.",
        }

        return column_descriptions
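
The flattening step in `_read_file` and the shift in `set_aligned_starting_time` can be illustrated without pandas or neuroconv. The sketch below uses toy trial data invented for the example; `flatten_frames` is a hypothetical stand-in for the dict comprehension in the class, and the key map is a two-entry subset of `KEY_MAP`.

```python
import itertools
from typing import Dict, List

# Subset of DisplayInterface.KEY_MAP, kept short for the example.
KEY_MAP = {
    "frame_task_phase": "task_phase",
    "frame_display_times": "start_time",
}


def flatten_frames(trials: List[dict], key_map: Dict[str, str]) -> Dict[str, list]:
    """Concatenate per-trial frame lists into one list per output column."""
    return {
        mapped: list(itertools.chain.from_iterable(t[key] for t in trials))
        for key, mapped in key_map.items()
    }


# Toy data: two trials with two frames and one frame respectively.
trials = [
    {"frame_task_phase": ["fixation", "stimulus"], "frame_display_times": [0.0, 0.5]},
    {"frame_task_phase": ["response"], "frame_display_times": [1.0]},
]
frames = flatten_frames(trials, KEY_MAP)

# Equivalent of set_aligned_starting_time: add a constant offset to every frame.
aligned_start_times = [t + 10.0 for t in frames["start_time"]]
```

Each output column has one entry per displayed frame across all trials, which is why every per-trial list for a given trial must have the same length.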
84 changes: 84 additions & 0 deletions src/jazayeri_lab_to_nwb/piccato/get_session_paths.py
@@ -0,0 +1,84 @@
"""Function for getting paths to data on openmind."""

import collections
import pathlib

SUBJECT_NAME_TO_ID = {
    "elgar": "elgar",
}

OM_PATH = "/om2/user/apiccato/phys_preprocessing_open_source/phys_data"
DANDISET_ID = "000767"
SessionPaths = collections.namedtuple(
    "SessionPaths",
    [
        "output",
        "ecephys_data",
        "behavior_task_data",
        "session_data",
        "sync_pulses",
        "spike_sorting_raw",
        "postprocessed_data",
    ],
)


def _get_session_paths_openmind(subject, session):
    """Get paths to all components of the data on openmind."""
    # subject_id = SUBJECT_NAME_TO_ID[subject]

    # Path to write output nwb files to
    output_path = pathlib.Path(
        f"/om2/user/apiccato/nwb_data/staging/{DANDISET_ID}/sub-{subject}"
    )

    # Path to the raw data. This is used for reading raw physiology data.
    ecephys_data_path = pathlib.Path(
        f"{OM_PATH}/{subject}/{session}/raw_data/"
    )

    # Path to task and behavior data.
    behavior_task_data_path = pathlib.Path(
        f"{OM_PATH}/{subject}/{session}/behavior_task"
    )

    # Path to sync pulses. This is used for reading timescale transformations
    # between physiology and mworks data streams.
    sync_pulses_path = pathlib.Path(
        f"{OM_PATH}/{subject}/{session}/sync_signals"
    )

    # Path to spike sorting. This is used for reading spike sorted data.
    spike_sorting_raw_path = pathlib.Path(
        f"{OM_PATH}/{subject}/{session}/spike_sorting"
    )

    session_path = pathlib.Path(f"{OM_PATH}/{subject}/{session}/")

    postprocessed_data_path = pathlib.Path(
        f"{OM_PATH}/{subject}/{session}/kilosort2_5_0"
    )

    session_paths = SessionPaths(
        output=output_path,
        ecephys_data=ecephys_data_path,
        session_data=session_path,
        behavior_task_data=pathlib.Path(behavior_task_data_path),
        sync_pulses=sync_pulses_path,
        spike_sorting_raw=spike_sorting_raw_path,
        postprocessed_data=postprocessed_data_path,
    )

    return session_paths


def get_session_paths(subject, session, repo="openmind"):
    """Get paths to all components of the data.

    Returns:
        SessionPaths namedtuple.
    """
    if repo == "openmind":
        return _get_session_paths_openmind(subject=subject, session=session)
    else:
        raise ValueError(f"Invalid repo {repo}")
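
To show how the returned namedtuple resolves for one session, here is a reduced, self-contained sketch. `build_session_paths` is a hypothetical variant that takes the root directory as an argument instead of the hard-coded `OM_PATH`, it keeps only three of the seven fields, and the root and session date in the example are placeholders.

```python
import collections
import pathlib

# Reduced version of SessionPaths with three of the seven fields.
SessionPaths = collections.namedtuple(
    "SessionPaths",
    ["session_data", "behavior_task_data", "sync_pulses"],
)


def build_session_paths(root, subject, session):
    """Build session paths under an arbitrary root (hypothetical helper)."""
    session_path = pathlib.Path(root) / subject / session
    return SessionPaths(
        session_data=session_path,
        behavior_task_data=session_path / "behavior_task",
        sync_pulses=session_path / "sync_signals",
    )


# Placeholder root and session date, for illustration only.
paths = build_session_paths("phys_data", "elgar", "2023-12-19")
```

Fields are then accessed by name, for example `paths.behavior_task_data`, which keeps call sites readable compared with indexing into a plain tuple.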