Merge branch 'dev' into streaming_add_remfile2
bendichter authored Nov 27, 2023
2 parents d34a526 + 9fafde2 commit 2a70ca9
Showing 10 changed files with 97 additions and 104 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -3,6 +3,7 @@
 ## PyNWB 2.6.0 (Upcoming)

 ### Enhancements and minor changes
+- For `NWBHDF5IO()`, change the default of arg `load_namespaces` from `False` to `True`. @bendichter [#1748](https://github.com/NeurodataWithoutBorders/pynwb/pull/1748)
 - Add `NWBHDF5IO.can_read()`. @bendichter [#1703](https://github.com/NeurodataWithoutBorders/pynwb/pull/1703)
 - Add `pynwb.get_nwbfile_version()`. @bendichter [#1703](https://github.com/NeurodataWithoutBorders/pynwb/pull/1703)

81 changes: 37 additions & 44 deletions docs/gallery/advanced_io/linking_data.py
@@ -6,57 +6,50 @@
 PyNWB supports linking between files using external links.
-"""
+Example Use Case: Integrating data from multiple files
+---------------------------------------------------------
-####################
-# Example Use Case: Integrating data from multiple files
-# ---------------------------------------------------------
-#
-# NWBContainer classes (e.g., :py:class:`~pynwb.base.TimeSeries`) support the integration of data stored in external
-# HDF5 files with NWB data files via external links. To make things more concrete, let's look at the following use
-# case. We want to simultaneously record multiple data streams during data acquisition. Using the concept of external
-# links allows us to save each data stream to an external HDF5 file during data acquisition and to
-# afterwards link the data into a single NWB:N file. In this case, each recording becomes represented by a
-# separate file-system object that can be set as read-only once the experiment is done. In the following
-# we are using :py:meth:`~pynwb.base.TimeSeries` as an example, but the same approach works for other
-# NWBContainers as well.
-#
+NWBContainer classes (e.g., :py:class:`~pynwb.base.TimeSeries`) support the integration of data stored in external
+HDF5 files with NWB data files via external links. To make things more concrete, let's look at the following use
+case. We want to simultaneously record multiple data streams during data acquisition. Using the concept of external
+links allows us to save each data stream to an external HDF5 file during data acquisition and to
+afterwards link the data into a single NWB file. In this case, each recording becomes represented by a
+separate file-system object that can be set as read-only once the experiment is done. In the following
+we are using :py:meth:`~pynwb.base.TimeSeries` as an example, but the same approach works for other
+NWBContainers as well.
-####################
-# .. tip::
-#
-#    The same strategies we use here for creating External Links also apply to Soft Links.
-#    The main difference between soft and external links is that soft links point to other
-#    objects within the same file while external links point to objects in external files.
-#
+.. tip::
-####################
-# .. tip::
-#
-#    In the case of :py:meth:`~pynwb.base.TimeSeries`, the uncorrected timestamps generated by the acquisition
-#    system can be stored (or linked) in the *sync* group. In the NWB:N format, hardware-recorded time data
-#    must then be corrected to a common time base (e.g., timestamps from all hardware sources aligned) before
-#    it can be included in the *timestamps* of the *TimeSeries*. This means, in the case
-#    of :py:meth:`~pynwb.base.TimeSeries` we need to be careful that we are not including data with incompatible
-#    timestamps in the same file when using external links.
-#
+    The same strategies we use here for creating External Links also apply to Soft Links.
+    The main difference between soft and external links is that soft links point to other
+    objects within the same file while external links point to objects in external files.
-####################
-# .. warning::
-#
-#    External links can become stale/break. Since external links are pointing to data in other files
-#    external links may become invalid any time files are modified on the file system, e.g., renamed,
-#    moved or access permissions are changed.
-#
+.. tip::
-####################
-# Creating test data
-# ---------------------------
-#
-# In the following we are creating two :py:meth:`~pynwb.base.TimeSeries` each written to a separate file.
-# We then show how we can integrate these files into a single NWBFile.
+    In the case of :py:meth:`~pynwb.base.TimeSeries`, the uncorrected timestamps generated by the acquisition
+    system can be stored (or linked) in the *sync* group. In the NWB format, hardware-recorded time data
+    must then be corrected to a common time base (e.g., timestamps from all hardware sources aligned) before
+    it can be included in the *timestamps* of the *TimeSeries*. This means, in the case
+    of :py:meth:`~pynwb.base.TimeSeries` we need to be careful that we are not including data with incompatible
+    timestamps in the same file when using external links.
+.. warning::
+    External links can become stale/break. Since external links are pointing to data in other files
+    external links may become invalid any time files are modified on the file system, e.g., renamed,
+    moved or access permissions are changed.
+Creating test data
+---------------------------
+In the following we are creating two :py:meth:`~pynwb.base.TimeSeries` each written to a separate file.
+We then show how we can integrate these files into a single NWBFile.
+"""

 # sphinx_gallery_thumbnail_path = 'figures/gallery_thumbnails_linking_data.png'

 from datetime import datetime
 from uuid import uuid4

2 changes: 1 addition & 1 deletion docs/gallery/general/extensions.py
@@ -254,7 +254,7 @@ def __init__(self, **kwargs):
 # explicitly specify this. This behavior is enabled by the *load_namespaces*
 # argument to the :py:class:`~pynwb.NWBHDF5IO` constructor.

-with NWBHDF5IO("cache_spec_example.nwb", mode="r", load_namespaces=True) as io:
+with NWBHDF5IO("cache_spec_example.nwb", mode="r") as io:
     nwbfile = io.read()

 ####################
17 changes: 6 additions & 11 deletions src/pynwb/__init__.py
@@ -4,7 +4,6 @@
 import os.path
 from pathlib import Path
 from copy import deepcopy
-from warnings import warn
 import h5py

 from hdmf.spec import NamespaceCatalog
@@ -244,8 +243,9 @@ def can_read(path: str):
              'doc': 'the mode to open the HDF5 file with, one of ("w", "r", "r+", "a", "w-", "x")',
              'default': 'r'},
             {'name': 'load_namespaces', 'type': bool,
-             'doc': 'whether or not to load cached namespaces from given path - not applicable in write mode',
-             'default': False},
+             'doc': ('whether or not to load cached namespaces from given path - not applicable in write mode '
+                     'or when `manager` is not None or when `extensions` is not None'),
+             'default': True},
             {'name': 'manager', 'type': BuildManager, 'doc': 'the BuildManager to use for I/O', 'default': None},
             {'name': 'extensions', 'type': (str, TypeMap, list),
              'doc': 'a path to a namespace, a TypeMap, or a list consisting paths to namespaces and TypeMaps',
@@ -261,15 +261,10 @@ def __init__(self, **kwargs):
             popargs('path', 'mode', 'manager', 'extensions', 'load_namespaces',
                     'file', 'comm', 'driver', 'herd_path', kwargs)
         # Define the BuildManager to use
-        if load_namespaces:
-            if manager is not None:
-                warn("loading namespaces from file - ignoring 'manager'")
-            if extensions is not None:
-                warn("loading namespaces from file - ignoring 'extensions' argument")
-            # namespaces are not loaded when creating an NWBHDF5IO object in write mode
-            if 'w' in mode or mode == 'x':
-                raise ValueError("cannot load namespaces from file when writing to it")
+        if mode in 'wx' or manager is not None or extensions is not None:
+            load_namespaces = False
+
         if load_namespaces:
             tm = get_type_map()
             super().load_namespaces(tm, path, file=file_obj, driver=driver)
             manager = BuildManager(tm)
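
A note on the new condition in `src/pynwb/__init__.py`: `mode in 'wx'` is a substring test against the string `'wx'`, not a membership test over a collection of modes. The sketch below restates only that condition as a standalone function, so it can be run without PyNWB. The names `resolve_load_namespaces`, `resolve_load_namespaces_strict`, and `WRITE_MODES` are hypothetical and not part of the PyNWB API, and the choice of which modes belong in `WRITE_MODES` is an assumption based on the mode list in the docstring.

```python
# Hypothetical, standalone restatement of the condition added in this commit.
# It does not import or call PyNWB.

def resolve_load_namespaces(mode, manager=None, extensions=None, load_namespaces=True):
    """Mirror the diff: a substring check of `mode` against the string 'wx'."""
    if mode in 'wx' or manager is not None or extensions is not None:
        return False
    return load_namespaces


# Assumed alternative: an explicit tuple of the modes that create a new file.
WRITE_MODES = ('w', 'w-', 'x')


def resolve_load_namespaces_strict(mode, manager=None, extensions=None, load_namespaces=True):
    """Same logic, but with tuple membership instead of a substring check."""
    if mode in WRITE_MODES or manager is not None or extensions is not None:
        return False
    return load_namespaces


if __name__ == "__main__":
    # Compare both variants over the modes listed in the docstring.
    for mode in ('r', 'r+', 'a', 'w', 'x', 'w-'):
        print(mode, resolve_load_namespaces(mode), resolve_load_namespaces_strict(mode))
```

Because `'w-' in 'wx'` evaluates to `False`, the substring form leaves `load_namespaces` enabled for mode `'w-'`, while the tuple form disables it. Whether that difference matters in practice depends on how the parent class handles that mode, which this diff does not show.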
1 change: 1 addition & 0 deletions src/pynwb/validate.py
@@ -156,6 +156,7 @@ def validate(**kwargs):
                     file=sys.stderr,
                 )
         else:
+            io_kwargs.update(load_namespaces=False)
             namespaces_to_validate = [CORE_NAMESPACE]

         if namespace is not None:
1 change: 0 additions & 1 deletion tests/back_compat/test_import_structure.py
@@ -82,7 +82,6 @@ def test_outer_import_structure(self):
             "spec",
             "testing",
             "validate",
-            "warn",
         ]
         for member in expected_structure:
             self.assertIn(member=member, container=current_structure)
26 changes: 18 additions & 8 deletions tests/back_compat/test_read.py
@@ -31,6 +31,16 @@ class TestReadOldVersions(TestCase):
                                "- expected an array of shape '[None]', got non-array data 'one publication'")],
     }

+    def get_io(self, path):
+        """Get an NWBHDF5IO object for the given path."""
+        with warnings.catch_warnings():
+            warnings.filterwarnings(
+                "ignore",
+                message=r"Ignoring cached namespace .*",
+                category=UserWarning,
+            )
+            return NWBHDF5IO(str(path), 'r')
+
     def test_read(self):
         """Test reading and validating all NWB files in the same folder as this file.
@@ -43,7 +53,7 @@ def test_read(self):
             with self.subTest(file=f.name):
                 with warnings.catch_warnings(record=True) as warnings_on_read:
                     warnings.simplefilter("always")
-                    with NWBHDF5IO(str(f), 'r', load_namespaces=True) as io:
+                    with self.get_io(f) as io:
                         errors = validate(io)
                         io.read()
                     for w in warnings_on_read:
@@ -69,28 +79,28 @@ def test_read(self):
     def test_read_timeseries_no_data(self):
         """Test that a TimeSeries written without data is read with data set to the default value."""
         f = Path(__file__).parent / '1.5.1_timeseries_no_data.nwb'
-        with NWBHDF5IO(str(f), 'r') as io:
+        with self.get_io(f) as io:
             read_nwbfile = io.read()
             np.testing.assert_array_equal(read_nwbfile.acquisition['test_timeseries'].data, TimeSeries.DEFAULT_DATA)

     def test_read_timeseries_no_unit(self):
         """Test that an ImageSeries written without unit is read with unit set to the default value."""
         f = Path(__file__).parent / '1.5.1_timeseries_no_unit.nwb'
-        with NWBHDF5IO(str(f), 'r') as io:
+        with self.get_io(f) as io:
             read_nwbfile = io.read()
             self.assertEqual(read_nwbfile.acquisition['test_timeseries'].unit, TimeSeries.DEFAULT_UNIT)

     def test_read_imageseries_no_data(self):
         """Test that an ImageSeries written without data is read with data set to the default value."""
         f = Path(__file__).parent / '1.5.1_imageseries_no_data.nwb'
-        with NWBHDF5IO(str(f), 'r') as io:
+        with self.get_io(f) as io:
             read_nwbfile = io.read()
             np.testing.assert_array_equal(read_nwbfile.acquisition['test_imageseries'].data, ImageSeries.DEFAULT_DATA)

     def test_read_imageseries_no_unit(self):
         """Test that an ImageSeries written without unit is read with unit set to the default value."""
         f = Path(__file__).parent / '1.5.1_imageseries_no_unit.nwb'
-        with NWBHDF5IO(str(f), 'r') as io:
+        with self.get_io(f) as io:
             read_nwbfile = io.read()
             self.assertEqual(read_nwbfile.acquisition['test_imageseries'].unit, ImageSeries.DEFAULT_UNIT)

@@ -100,7 +110,7 @@ def test_read_imageseries_non_external_format(self):
         f = Path(__file__).parent / fbase
         expected_warning = self.expected_warnings[fbase][0]
         with self.assertWarnsWith(UserWarning, expected_warning):
-            with NWBHDF5IO(str(f), 'r') as io:
+            with self.get_io(f) as io:
                 read_nwbfile = io.read()
                 self.assertEqual(read_nwbfile.acquisition['test_imageseries'].format, "tiff")

@@ -110,13 +120,13 @@ def test_read_imageseries_nonmatch_starting_frame(self):
         f = Path(__file__).parent / fbase
         expected_warning = self.expected_warnings[fbase][0]
         with self.assertWarnsWith(UserWarning, expected_warning):
-            with NWBHDF5IO(str(f), 'r') as io:
+            with self.get_io(f) as io:
                 read_nwbfile = io.read()
                 np.testing.assert_array_equal(read_nwbfile.acquisition['test_imageseries'].starting_frame, [1, 2, 3])

     def test_read_subject_no_age__reference(self):
         """Test that reading a Subject without an age__reference set with NWB schema 2.5.0 sets the value to None"""
         f = Path(__file__).parent / '2.2.0_subject_no_age__reference.nwb'
-        with NWBHDF5IO(str(f), 'r') as io:
+        with self.get_io(f) as io:
             read_nwbfile = io.read()
             self.assertIsNone(read_nwbfile.subject.age__reference)
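
The `get_io` helper in `tests/back_compat/test_read.py` relies on a scoped warning filter: inside `warnings.catch_warnings()`, an `"ignore"` filter whose `message` regex matches the start of the warning text suppresses only those warnings, and the filter is removed again when the block exits. The sketch below reproduces that pattern with the standard library alone. `open_noisy` and `open_quiet` are hypothetical stand-ins (not PyNWB functions), and the warning texts are made up for illustration.

```python
import warnings


def open_noisy():
    """Hypothetical stand-in for a constructor that emits two UserWarnings."""
    warnings.warn("Ignoring cached namespace 'core' (illustrative text)", UserWarning)
    warnings.warn("some other warning", UserWarning)
    return "handle"


def open_quiet():
    """Call open_noisy() while ignoring only the 'Ignoring cached namespace' warnings."""
    with warnings.catch_warnings():
        # filterwarnings() inserts at the front of the filter list, so this rule
        # is consulted before any filter installed by an enclosing context.
        warnings.filterwarnings(
            "ignore",
            message=r"Ignoring cached namespace .*",
            category=UserWarning,
        )
        return open_noisy()


# Outer context mirrors test_read(): record everything that is not ignored.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    handle = open_quiet()

messages = [str(w.message) for w in caught]
print(messages)
```

As in the helper, the filter only covers warnings raised while the inner `with` block is active, that is, during construction of the returned object. Warnings raised later, for example during a subsequent `read()`, are still seen by whatever filters the caller has installed.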
2 changes: 1 addition & 1 deletion tests/read_dandi/test_read_dandi.py
@@ -47,7 +47,7 @@ def read_first_nwb_asset():
     s3_url = first_asset.get_content_url(follow_redirects=1, strip_query=True)

     try:
-        with NWBHDF5IO(path=s3_url, load_namespaces=True, driver="ros3") as io:
+        with NWBHDF5IO(path=s3_url, driver="ros3") as io:
             io.read()
     except Exception as e:
         print(traceback.format_exc())
12 changes: 8 additions & 4 deletions tests/unit/test_file.py
@@ -527,6 +527,7 @@ def test_subject_age_duration(self):


 class TestCacheSpec(TestCase):
+    """Test whether the file can be written and read when caching the spec."""

     def setUp(self):
         self.path = 'unittest_cached_spec.nwb'
@@ -535,18 +536,20 @@ def tearDown(self):
         remove_test_file(self.path)

     def test_simple(self):
-        nwbfile = NWBFile(' ', ' ',
+        nwbfile = NWBFile('sess_desc', 'identifier',
                           datetime.now(tzlocal()),
                           file_create_date=datetime.now(tzlocal()),
                           institution='University of California, San Francisco',
                           lab='Chang Lab')
         with NWBHDF5IO(self.path, 'w') as io:
             io.write(nwbfile)
-        with NWBHDF5IO(self.path, 'r', load_namespaces=True) as reader:
+        with NWBHDF5IO(self.path, 'r') as reader:
             nwbfile = reader.read()
+            assert nwbfile.session_description == "sess_desc"


 class TestNoCacheSpec(TestCase):
+    """Test whether the file can be written and read when not caching the spec."""

     def setUp(self):
         self.path = 'unittest_cached_spec.nwb'
@@ -555,16 +558,17 @@ def tearDown(self):
         remove_test_file(self.path)

     def test_simple(self):
-        nwbfile = NWBFile(' ', ' ',
+        nwbfile = NWBFile('sess_desc', 'identifier',
                           datetime.now(tzlocal()),
                           file_create_date=datetime.now(tzlocal()),
                           institution='University of California, San Francisco',
                           lab='Chang Lab')
         with NWBHDF5IO(self.path, 'w') as io:
             io.write(nwbfile, cache_spec=False)

-        with NWBHDF5IO(self.path, 'r', load_namespaces=True) as reader:
+        with NWBHDF5IO(self.path, 'r') as reader:
             nwbfile = reader.read()
+            assert nwbfile.session_description == "sess_desc"


 class TestTimestampsRefDefault(TestCase):