Support remfile and file-like in nwb rec extractor #2169
Conversation
Thanks @magland! You can add streaming tests here: https://github.com/SpikeInterface/spikeinterface/blob/main/src/spikeinterface/extractors/tests/test_nwb_s3_extractor.py
Thanks @alejoe91, I added some tests. Let's see if they pass!
You need to add file to the kwargs for pickling to work.
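To illustrate the point about kwargs and pickling, here is a hypothetical sketch (DummyNwbExtractor and _rebuild are made-up stand-ins, not the actual SpikeInterface code): extractors store their constructor arguments in self._kwargs so the object can be rebuilt when unpickled, and if the new file argument is left out of _kwargs, the reconstructed object silently loses it.

```python
# Hypothetical sketch, not the actual SpikeInterface implementation.
import pickle

def _rebuild(cls, kwargs):
    # Reconstruct the extractor from its stored constructor arguments
    return cls(**kwargs)

class DummyNwbExtractor:
    def __init__(self, file_path=None, file=None):
        self.file_path = file_path
        self.file = file
        # `file` must be included here, or the pickle round-trip drops it
        self._kwargs = dict(file_path=file_path, file=file)

    def __reduce__(self):
        # Rebuild from the stored kwargs instead of the instance dict
        return (_rebuild, (type(self), self._kwargs))

original = DummyNwbExtractor(file_path="sub-01.nwb")
restored = pickle.loads(pickle.dumps(original))
print(restored.file_path)  # sub-01.nwb
```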
Concerning this:
Also allows passing in a file-like object. This is important for dendro for embargoed datasets because we need to pass in a file-like object that is capable of renewing its remote url periodically as the presigned download url expires.
I am curious: why can't you get away with just re-initializing the object when the path changes?
# Condition / loop here to check whether the path changed
try:
    file_path = (
        "https://dandi-api-staging-dandisets.s3.amazonaws.com/blobs/5f4/b7a/5f4b7a1f-7b95-4ad8-9579-4df6025371cc"
    )
    file = remfile.File(file_path)
    # File path should be correct here
    rec = NwbRecordingExtractor(file_path=file_path)
In other words, why is just modifying the url (the file_path) not enough?
Thanks @h-mayorquin
It's because the recording object needs to be passed into other functions which may take a long time to execute. For example, where we don't have control over some_long_analysis.
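A sketch of why re-initialization isn't enough (RenewingFile and some_long_analysis are illustrative names, not from the codebase): once the recording is handed to long-running third-party code, nothing outside can swap in a fresh presigned URL, so renewal has to live inside the file-like object itself.

```python
# Illustrative sketch only; a real implementation would fetch bytes over HTTP.
import time

class RenewingFile:
    """Minimal file-like object that refreshes an expiring URL on read."""
    def __init__(self, get_presigned_url, ttl_sec=3600):
        self._get_url = get_presigned_url
        self._ttl = ttl_sec
        self.renew_count = 0
        self._renew()

    def _renew(self):
        self.url = self._get_url()       # e.g. ask the API for a fresh URL
        self._expires_at = time.time() + self._ttl
        self.renew_count += 1

    def read(self, size=-1):
        if time.time() >= self._expires_at:
            self._renew()                # transparent to the caller
        # ... a real implementation would fetch `size` bytes from self.url ...
        return b""

def some_long_analysis(file_like):
    # Stand-in for hours of processing that keeps reading from the file
    for _ in range(3):
        file_like.read(1024)

f = RenewingFile(lambda: "https://example.com/presigned", ttl_sec=0)
some_long_analysis(f)
print(f.renew_count)  # one initial renewal plus one per read, since ttl_sec=0
```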
Thank you very much!
if stream_mode == "fsspec":
    # only add stream_cache_path to kwargs if it was passed as an argument
    if stream_cache_path is not None:
        stream_cache_path = str(Path(self.stream_cache_path).absolute())

self.extra_requirements.extend(["pandas", "pynwb", "hdmf"])
self._electrical_series = electrical_series

# set serializability bools
# TODO: correct spelling of self._serializablility throughout SI
ooops
solved here #2238
I just added a couple docstring changes/clarifications. Feel free to use if you want.
Co-authored-by: Zach McKenzie <[email protected]>
Thanks, I accepted those commits. I wonder if we want to squash before merging. Maybe you don't care about a large number of commits, IDK.
Doesn't matter to me, whatever @alejoe91 and @samuelgarcia want when they merge.
@magland I would also like to highlight your comment about our lack of separation between positional and keyword-only arguments. I think it would make the API more flexible in the future and avoid the sub-optimal ordering you have pointed out.
Yes, it's a potentially big problem: if users do not use named keyword arguments in their scripts and you later want to add parameters to a function, it may break their code unless you add the new ones at the end, and that's not always desirable. I recommend that
This came up in #1800 when the parameter order was swapped; just linking an example of when this caused a problem, in support of using keyword-only arguments.
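The keyword-only idea above can be sketched as follows (SafeExtractor is an illustrative stand-in, not the actual extractor signature): placing a bare * in the definition makes everything after it keyword-only, so parameters added later cannot silently shift positional calls.

```python
# Illustrative sketch; real extractors have many more parameters.
class SafeExtractor:
    def __init__(self, file_path=None, *, stream_mode=None, file=None):
        self.file_path = file_path
        self.stream_mode = stream_mode
        self.file = file

ok = SafeExtractor("data.nwb", stream_mode="remfile")   # fine
try:
    SafeExtractor("data.nwb", "remfile")                # positional: rejected
except TypeError as e:
    print("rejected:", e)
```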
@magland is this ready to merge?
I can actually give it a try this afternoon as I need to do some reading from dandi. |
Hi, I tried to test this yesterday but hit a couple of bugs. I will fix the ones related to spikeinterface, but there are some files I can't load with remfile.

For asset path:
Of dandiset id:
The url of the blob:

This works:

import fsspec
import h5py
from pynwb import NWBHDF5IO

# URL of the file
file_url = 'https://dandiarchive.s3.amazonaws.com/blobs/413/cf0/413cf0f3-3498-485a-b099-84bc36d43ca6'
fs = fsspec.filesystem("http")
fsspec_file = fs.open(file_url, "rb")
# Use h5py to open the remote file
file = h5py.File(fsspec_file, 'r')
nwbfile = NWBHDF5IO(file=file, mode='r', load_namespaces=True).read()
nwbfile

But this does not:
The output is:

---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
dev_notebook.ipynb Cell 35 line 17
     14 cached_file = caching_file_system.open(path=file_url, mode="rb")
     15 # Use h5py to open the cached file
---> 17 file = h5py.File(cached_file, 'r')
     19 nwbfile = NWBHDF5IO(file=file, mode='r', load_namespaces=True).read()
     20 nwbfile

File ~/miniconda3/envs/neuroconv_env/lib/python3.10/site-packages/h5py/_hl/files.py:562, in File.__init__(...)
    553 fapl = make_fapl(driver, libver, rdcc_nslots, rdcc_nbytes, rdcc_w0,
    554                  locking, page_buf_size, min_meta_keep, min_raw_keep,
    555                  alignment_threshold=alignment_threshold,
    556                  alignment_interval=alignment_interval,
    557                  meta_block_size=meta_block_size,
    558                  **kwds)
    559 fcpl = make_fcpl(track_order=track_order, fs_strategy=fs_strategy,
    560                  fs_persist=fs_persist, fs_threshold=fs_threshold,
    561                  fs_page_size=fs_page_size)
--> 562 fid = make_fid(name, mode, userblock_size, fapl, fcpl, swmr=swmr)
    564 if isinstance(libver, tuple):
    565     self._libver = libver

File ~/miniconda3/envs/neuroconv_env/lib/python3.10/site-packages/h5py/_hl/files.py:235, in make_fid(name, mode, userblock_size, fapl, fcpl, swmr)
    233 if swmr and swmr_support:
    234     flags |= h5f.ACC_SWMR_READ
--> 235 fid = h5f.open(name, flags, fapl=fapl)
    236 elif mode == 'r+':
    237     fid = h5f.open(name, h5f.ACC_RDWR, fapl=fapl)

File h5py/_objects.pyx:54, in h5py._objects.with_phil.wrapper()
File h5py/_objects.pyx:55, in h5py._objects.with_phil.wrapper()
File h5py/h5f.pyx:106, in h5py.h5f.open()

OSError: Unable to open file (file signature not found)
Note that remfile shares this problem with fsspec when the caching filesystem is used, so that might offer a clue. That is, the following produces a very similar error:

import fsspec
import h5py
from pathlib import Path
from fsspec.implementations.cached import CachingFileSystem
from pynwb import NWBHDF5IO

file_url = 'https://dandiarchive.s3.amazonaws.com/blobs/413/cf0/413cf0f3-3498-485a-b099-84bc36d43ca6'
caching_file_system = CachingFileSystem(
    fs=fsspec.filesystem("http"),
    cache_storage=str(Path.cwd()),
)
cached_file = caching_file_system.open(path=file_url, mode="rb")
# Use h5py to open the cached file
file = h5py.File(cached_file, 'r')
nwbfile = NWBHDF5IO(file=file, mode='r', load_namespaces=True).read()
nwbfile

I am using h5py 3.10.0 and the latest version of requests.
I tried replacing

nwbfile = NWBHDF5IO(file, 'r', load_namespaces=True).read()

in your script with

nwbfile = NWBHDF5IO(file=file, mode='r', load_namespaces=True).read()

and it seemed to work. Then I did the following timing test:

import remfile
import time
import fsspec
import h5py
from pynwb import NWBHDF5IO

# URL of the file
file_url = 'https://dandiarchive.s3.amazonaws.com/blobs/413/cf0/413cf0f3-3498-485a-b099-84bc36d43ca6'

for mode in ['fsspec', 'remfile']:
    if mode == 'fsspec':
        print('fsspec mode.......')
        timer = time.time()
        fs = fsspec.filesystem("http")
        fsspec_file = fs.open(file_url, "rb")
        # Use h5py to open the remote file
        file = h5py.File(fsspec_file, 'r')
        print(file.keys())
        nwbfile = NWBHDF5IO(file=file, mode='r', load_namespaces=True).read()
        print(nwbfile)
        elapsed_sec = time.time() - timer
        print(f'Elapsed time for fsspec mode: {elapsed_sec:.2f} sec')
    elif mode == 'remfile':
        print('remfile mode.......')
        timer = time.time()
        rfile = remfile.File(file_url, verbose=True)
        file = h5py.File(rfile, 'r')
        print(file.keys())
        nwbfile = NWBHDF5IO(file=file, mode='r', load_namespaces=True).read()
        print(nwbfile)
        elapsed_sec = time.time() - timer
        print(f'Elapsed time for remfile mode: {elapsed_sec:.2f} sec')

And I got:

fsspec mode.......
<KeysViewHDF5 ['acquisition', 'analysis', 'file_create_date', 'general', 'identifier', 'intervals', 'processing', 'session_description', 'session_start_time', 'specifications', 'stimulus', 'timestamps_reference_time', 'units']>
/home/magland/miniconda3/envs/dev/lib/python3.8/site-packages/hdmf/spec/namespace.py:531: UserWarning: Ignoring cached namespace 'hdmf-common' version 1.5.1 because version 1.8.0 is already loaded.
warn("Ignoring cached namespace '%s' version %s because version %s is already loaded."
/home/magland/miniconda3/envs/dev/lib/python3.8/site-packages/hdmf/spec/namespace.py:531: UserWarning: Ignoring cached namespace 'core' version 2.5.0 because version 2.6.0-alpha is already loaded.
warn("Ignoring cached namespace '%s' version %s because version %s is already loaded."
/home/magland/miniconda3/envs/dev/lib/python3.8/site-packages/hdmf/spec/namespace.py:531: UserWarning: Ignoring cached namespace 'hdmf-experimental' version 0.2.0 because version 0.5.0 is already loaded.
warn("Ignoring cached namespace '%s' version %s because version %s is already loaded."
root pynwb.file.NWBFile at 0x140179901711552
Fields:
acquisition: {
ElectricalSeriesAp <class 'pynwb.ecephys.ElectricalSeries'>,
ElectricalSeriesLf <class 'pynwb.ecephys.ElectricalSeries'>,
OriginalVideoBodyCamera <class 'pynwb.image.ImageSeries'>,
OriginalVideoLeftCamera <class 'pynwb.image.ImageSeries'>,
OriginalVideoRightCamera <class 'pynwb.image.ImageSeries'>
}
devices: {
NeuropixelsProbe <class 'pynwb.device.Device'>
}
electrode_groups: {
NeuropixelsShank <class 'pynwb.ecephys.ElectrodeGroup'>
}
electrodes: electrodes <class 'hdmf.common.table.DynamicTable'>
experiment_description: IBL aims to understand the neural basis of decision-making in the mouse by gathering a whole-brain activity map composed of electrophysiological recordings pooled from multiple laboratories. We have systematically recorded from nearly all major brain areas with Neuropixels probes, using a grid system for unbiased sampling and replicating each recording site in at least two laboratories. These data have been used to construct a brain-wide map of activity at single-spike cellular resolution during a decision-making task. In addition to the map, this data set contains other information gathered during the task: sensory stimuli presented to the mouse; mouse decisions and response times; and mouse pose information from video recordings and DeepLabCut analysis.
file_create_date: [datetime.datetime(2023, 2, 17, 3, 15, 23, 896756, tzinfo=tzutc())]
identifier: c33e2740-5475-463e-bd16-d1c38da37463
institution: University College London
intervals: {
trials <class 'pynwb.epoch.TimeIntervals'>
}
lab: Hausser
processing: {
behavior <class 'pynwb.base.ProcessingModule'>
}
protocol: _iblrig_tasks_ephysChoiceWorld6.6.1
related_publications: ['https://doi.org/10.6084/m9.figshare.21400815.v6, https://doi.org/10.1101/2020.01.17.909838']
session_description: The full description of the session/task protocol can be found in Appendix 2 of International Brain Laboratory, et al. "Standardized and reproducible measurement of decision-making in mice." Elife 10 (2021); e63711.
session_id: 1d4a7bd6-296a-48b9-b20e-bd0ac80750a5
session_start_time: 2022-07-21 16:08:53.428769+01:00
subject: subject abc.IblSubject at 0x140179901711936
Fields:
date_of_birth: 2021-05-15 00:00:00+01:00
description: Mice were housed under a 12/12 h light/dark cycle (normal or inverted depending on the laboratory) with food and water 112 available ad libitum, except during behavioural training days. Electrophysiological recordings and behavioural training were performed during either the dark or light phase of the subject cycle depending on the laboratory. Subjects were obtained from either the Jackson Laboratory or Charles River.
expected_water_ml: 1.1400000000000001
last_water_restriction: 2021-06-01T17:42:13
remaining_water_ml: 1.1400000000000001
sex: M
species: Mus musculus
strain: C57BL/6
subject_id: PL015
url: https://openalyx.internationalbrainlab.org/subjects/PL015
weight: 0.030100000000000002 kg
timestamps_reference_time: 2022-07-21 16:08:53.428769+01:00
trials: trials <class 'pynwb.epoch.TimeIntervals'>
units: units <class 'pynwb.misc.Units'>
Elapsed time for fsspec mode: 393.38 sec
remfile mode.......
Loading 1 chunks starting at 0 (0.1024 million bytes)
Loading 1 chunks starting at 455542 (0.1024 million bytes)
Loading 1 chunks starting at 455838 (0.1024 million bytes)
<KeysViewHDF5 ['acquisition', 'analysis', 'file_create_date', 'general', 'identifier', 'intervals', 'processing', 'session_description', 'session_start_time', 'specifications', 'stimulus', 'timestamps_reference_time', 'units']>
Loading 2 chunks starting at 455839 (0.2048 million bytes)
Loading 2 chunks starting at 455643 (0.2048 million bytes)
...
root pynwb.file.NWBFile at 0x140179901580528
Fields:
acquisition: {
ElectricalSeriesAp <class 'pynwb.ecephys.ElectricalSeries'>,
ElectricalSeriesLf <class 'pynwb.ecephys.ElectricalSeries'>,
OriginalVideoBodyCamera <class 'pynwb.image.ImageSeries'>,
OriginalVideoLeftCamera <class 'pynwb.image.ImageSeries'>,
OriginalVideoRightCamera <class 'pynwb.image.ImageSeries'>
}
devices: {
NeuropixelsProbe <class 'pynwb.device.Device'>
}
electrode_groups: {
NeuropixelsShank <class 'pynwb.ecephys.ElectrodeGroup'>
}
electrodes: electrodes <class 'hdmf.common.table.DynamicTable'>
experiment_description: IBL aims to understand the neural basis of decision-making in the mouse by gathering a whole-brain activity map composed of electrophysiological recordings pooled from multiple laboratories. We have systematically recorded from nearly all major brain areas with Neuropixels probes, using a grid system for unbiased sampling and replicating each recording site in at least two laboratories. These data have been used to construct a brain-wide map of activity at single-spike cellular resolution during a decision-making task. In addition to the map, this data set contains other information gathered during the task: sensory stimuli presented to the mouse; mouse decisions and response times; and mouse pose information from video recordings and DeepLabCut analysis.
file_create_date: [datetime.datetime(2023, 2, 17, 3, 15, 23, 896756, tzinfo=tzutc())]
identifier: c33e2740-5475-463e-bd16-d1c38da37463
institution: University College London
intervals: {
trials <class 'pynwb.epoch.TimeIntervals'>
}
lab: Hausser
processing: {
behavior <class 'pynwb.base.ProcessingModule'>
}
protocol: _iblrig_tasks_ephysChoiceWorld6.6.1
related_publications: ['https://doi.org/10.6084/m9.figshare.21400815.v6, https://doi.org/10.1101/2020.01.17.909838']
session_description: The full description of the session/task protocol can be found in Appendix 2 of International Brain Laboratory, et al. "Standardized and reproducible measurement of decision-making in mice." Elife 10 (2021); e63711.
session_id: 1d4a7bd6-296a-48b9-b20e-bd0ac80750a5
session_start_time: 2022-07-21 16:08:53.428769+01:00
subject: subject abc.IblSubject at 0x140179904705152
Fields:
date_of_birth: 2021-05-15 00:00:00+01:00
description: Mice were housed under a 12/12 h light/dark cycle (normal or inverted depending on the laboratory) with food and water 112 available ad libitum, except during behavioural training days. Electrophysiological recordings and behavioural training were performed during either the dark or light phase of the subject cycle depending on the laboratory. Subjects were obtained from either the Jackson Laboratory or Charles River.
expected_water_ml: 1.1400000000000001
last_water_restriction: 2021-06-01T17:42:13
remaining_water_ml: 1.1400000000000001
sex: M
species: Mus musculus
strain: C57BL/6
subject_id: PL015
url: https://openalyx.internationalbrainlab.org/subjects/PL015
weight: 0.030100000000000002 kg
timestamps_reference_time: 2022-07-21 16:08:53.428769+01:00
trials: trials <class 'pynwb.epoch.TimeIntervals'>
units: units <class 'pynwb.misc.Units'>
Elapsed time for remfile mode: 44.25 sec

So the difference in timing is 393 sec for fsspec vs 44 sec for remfile! I'll also note that the pynwb overhead is significant: you can begin to read the h5py file after just a second or two. That's why the nwb recording extractors I am using do not use pynwb at all, but go directly to the electrical series objects in the h5py file. I think in a separate PR we should explore using that method for the SI nwb recording extractor.
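The "go directly to the electrical series with h5py" idea can be sketched as follows (read_traces is an illustrative helper, and the demo file is synthetic; the group path follows the NWB layout acquisition/&lt;series_name&gt;/data):

```python
# Hedged sketch: slicing an electrical series dataset with h5py alone,
# skipping pynwb's namespace loading entirely.
import h5py
import numpy as np

def read_traces(h5_file, series_name, start_frame, end_frame):
    # h5py slices lazily, so only the requested frames are fetched
    dset = h5_file[f"acquisition/{series_name}/data"]
    return dset[start_frame:end_frame, :]

# In-memory HDF5 file standing in for a (remote) NWB file
f = h5py.File("demo.nwb", "w", driver="core", backing_store=False)
f.create_dataset(
    "acquisition/ElectricalSeriesAp/data",
    data=np.arange(20, dtype=np.int16).reshape(10, 2),  # 10 frames, 2 channels
)
traces = read_traces(f, "ElectricalSeriesAp", 2, 5)
print(traces.shape)  # (3, 2)
```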
@magland

import fsspec
import h5py
from pathlib import Path
from fsspec.implementations.cached import CachingFileSystem
from pynwb import NWBHDF5IO

file_url = 'https://dandiarchive.s3.amazonaws.com/blobs/413/cf0/413cf0f3-3498-485a-b099-84bc36d43ca6'
caching_file_system = CachingFileSystem(
    fs=fsspec.filesystem("http"),
    cache_storage=str(Path.cwd()),
)
cached_file = caching_file_system.open(path=file_url, mode="rb")
# Use h5py to open the cached file
file = h5py.File(cached_file, 'r')

Can you point to the extractors that you are using?
https://github.com/scratchrealm/pc-spike-sorting/blob/main/common/NwbRecording.py

It doesn't have all the features of the SI version, but it has the ones I need for spike sorting.
Thanks a bunch, I will check them out.
Thanks a lot Jeremy and Ramon for fighting with this NWB streaming.
Wow, thanks @h-mayorquin! The network activity plot is pretty convincing. I just created an issue on pynwb to update their docs to suggest remfile as the preferred method.
Adds "remfile" stream mode option for NwbRecordingExtractor.
Also allows passing in a file-like object. This is important for dendro for embargoed datasets because we need to pass in a file-like object that is capable of renewing its remote url periodically as the presigned download url expires.
In terms of parameter order, I put the new parameter "file" after electrical_series_name because somebody may be calling this without naming the arguments (since * is not used in the constructor definition).
This is untested (are there unit tests for nwb extractor?)
@alejoe91 @samuelgarcia @luiztauffer