Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

default load_namespaces=True #1748

Merged
merged 26 commits into from
Nov 27, 2023
Merged
Show file tree
Hide file tree
Changes from 21 commits
Commits
Show all changes
26 commits
Select commit Hold shift + click to select a range
0345a46
default load_namespaces=True
bendichter Jul 28, 2023
792451d
Update CHANGELOG.md
bendichter Jul 28, 2023
382f0fb
change logic around to ignore load_namespaces when it cannot be used
bendichter Jul 28, 2023
acb1c8d
Merge remote-tracking branch 'origin/load_namespaces_true' into load_…
bendichter Jul 28, 2023
26faa0c
flake8
bendichter Jul 29, 2023
955fb8d
Merge branch 'dev' into load_namespaces_true
bendichter Jul 29, 2023
2c6a415
Update __init__.py
rly Jul 29, 2023
7f1a1e3
Merge branch 'dev' into load_namespaces_true
bendichter Jul 29, 2023
9d5ab33
fix back-compatibility test
bendichter Jul 31, 2023
41ba0db
Merge remote-tracking branch 'origin/load_namespaces_true' into load_…
bendichter Jul 31, 2023
6c941a0
flake8
bendichter Jul 31, 2023
f8e35d3
flake8
bendichter Jul 31, 2023
df2a32e
remove warning from test
bendichter Aug 1, 2023
e7023c2
Merge branch 'dev' into load_namespaces_true
oruebel Sep 6, 2023
8087153
rmv useless test
bendichter Sep 21, 2023
8a95beb
Merge remote-tracking branch 'origin/load_namespaces_true' into load_…
bendichter Sep 21, 2023
28ba0d1
Merge branch 'dev' into load_namespaces_true
bendichter Sep 21, 2023
6c4d772
Merge branch 'dev' into load_namespaces_true
rly Oct 4, 2023
a833c9e
Merge branch 'dev' into load_namespaces_true
rly Oct 4, 2023
8b86a20
Update tests, validate script for default load ns
rly Oct 4, 2023
ac71956
Update CHANGELOG.md
rly Oct 4, 2023
2384a69
Update docs/gallery/advanced_io/linking_data.py
rly Oct 4, 2023
a0daffa
Update docs/gallery/advanced_io/linking_data.py
rly Oct 4, 2023
930eef6
Update test_file.py
rly Oct 4, 2023
a8cf307
Merge branch 'dev' into load_namespaces_true
rly Oct 22, 2023
ad0e08f
Merge branch 'dev' into load_namespaces_true
rly Nov 27, 2023
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 5 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,10 @@
# PyNWB Changelog

## PyNWB 2.6.0 (Upcoming)

### Enhancements and minor changes
- For `NWBHDF5IO()`, change the default of arg `load_namespaces` from `False` to `True`. @bendichter [#1748](https://github.com/NeurodataWithoutBorders/pynwb/pull/1748)

## PyNWB 2.5.0 (August 18, 2023)

### Enhancements and minor changes
Expand Down
81 changes: 37 additions & 44 deletions docs/gallery/advanced_io/linking_data.py
Original file line number Diff line number Diff line change
Expand Up @@ -6,57 +6,50 @@

PyNWB supports linking between files using external links.

"""
Example Use Case: Integrating data from multiple files
---------------------------------------------------------

####################
# Example Use Case: Integrating data from multiple files
# ---------------------------------------------------------
#
# NBWContainer classes (e.g., :py:class:`~pynwb.base.TimeSeries`) support the integration of data stored in external
# HDF5 files with NWB data files via external links. To make things more concrete, let's look at the following use
# case. We want to simultaneously record multiple data streams during data acquisition. Using the concept of external
# links allows us to save each data stream to an external HDF5 files during data acquisition and to
# afterwards link the data into a single NWB:N file. In this case, each recording becomes represented by a
# separate file-system object that can be set as read-only once the experiment is done. In the following
# we are using :py:meth:`~pynwb.base.TimeSeries` as an example, but the same approach works for other
# NWBContainers as well.
#
NBWContainer classes (e.g., :py:class:`~pynwb.base.TimeSeries`) support the integration of data stored in external
HDF5 files with NWB data files via external links. To make things more concrete, let's look at the following use
case. We want to simultaneously record multiple data streams during data acquisition. Using the concept of external
links allows us to save each data stream to an external HDF5 files during data acquisition and to
afterwards link the data into a single NWB:N file. In this case, each recording becomes represented by a
rly marked this conversation as resolved.
Show resolved Hide resolved
separate file-system object that can be set as read-only once the experiment is done. In the following
we are using :py:meth:`~pynwb.base.TimeSeries` as an example, but the same approach works for other
NWBContainers as well.

####################
# .. tip::
#
# The same strategies we use here for creating External Links also apply to Soft Links.
# The main difference between soft and external links is that soft links point to other
# objects within the same file while external links point to objects in external files.
#
.. tip::

####################
# .. tip::
#
# In the case of :py:meth:`~pynwb.base.TimeSeries`, the uncorrected timestamps generated by the acquisition
# system can be stored (or linked) in the *sync* group. In the NWB:N format, hardware-recorded time data
# must then be corrected to a common time base (e.g., timestamps from all hardware sources aligned) before
# it can be included in the *timestamps* of the *TimeSeries*. This means, in the case
# of :py:meth:`~pynwb.base.TimeSeries` we need to be careful that we are not including data with incompatible
# timestamps in the same file when using external links.
#
The same strategies we use here for creating External Links also apply to Soft Links.
The main difference between soft and external links is that soft links point to other
objects within the same file while external links point to objects in external files.

####################
# .. warning::
#
# External links can become stale/break. Since external links are pointing to data in other files
# external links may become invalid any time files are modified on the file system, e.g., renamed,
# moved or access permissions are changed.
#
.. tip::

####################
# Creating test data
# ---------------------------
#
# In the following we are creating two :py:meth:`~pynwb.base.TimeSeries` each written to a separate file.
# We then show how we can integrate these files into a single NWBFile.
In the case of :py:meth:`~pynwb.base.TimeSeries`, the uncorrected timestamps generated by the acquisition
system can be stored (or linked) in the *sync* group. In the NWB:N format, hardware-recorded time data
rly marked this conversation as resolved.
Show resolved Hide resolved
must then be corrected to a common time base (e.g., timestamps from all hardware sources aligned) before
it can be included in the *timestamps* of the *TimeSeries*. This means, in the case
of :py:meth:`~pynwb.base.TimeSeries` we need to be careful that we are not including data with incompatible
timestamps in the same file when using external links.


.. warning::

External links can become stale/break. Since external links are pointing to data in other files
external links may become invalid any time files are modified on the file system, e.g., renamed,
moved or access permissions are changed.


Creating test data
---------------------------

In the following we are creating two :py:meth:`~pynwb.base.TimeSeries` each written to a separate file.
We then show how we can integrate these files into a single NWBFile.
"""

# sphinx_gallery_thumbnail_path = 'figures/gallery_thumbnails_linking_data.png'

from datetime import datetime
from uuid import uuid4

Expand Down
2 changes: 1 addition & 1 deletion docs/gallery/general/extensions.py
Original file line number Diff line number Diff line change
Expand Up @@ -254,7 +254,7 @@ def __init__(self, **kwargs):
# explicitly specify this. This behavior is enabled by the *load_namespaces*
# argument to the :py:class:`~pynwb.NWBHDF5IO` constructor.

with NWBHDF5IO("cache_spec_example.nwb", mode="r", load_namespaces=True) as io:
with NWBHDF5IO("cache_spec_example.nwb", mode="r") as io:
nwbfile = io.read()

####################
Expand Down
17 changes: 6 additions & 11 deletions src/pynwb/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,6 @@
import os.path
from pathlib import Path
from copy import deepcopy
from warnings import warn
import h5py

from hdmf.spec import NamespaceCatalog
Expand Down Expand Up @@ -206,8 +205,9 @@ class NWBHDF5IO(_HDF5IO):
'doc': 'the mode to open the HDF5 file with, one of ("w", "r", "r+", "a", "w-", "x")',
'default': 'r'},
{'name': 'load_namespaces', 'type': bool,
'doc': 'whether or not to load cached namespaces from given path - not applicable in write mode',
'default': False},
'doc': ('whether or not to load cached namespaces from given path - not applicable in write mode '
'or when `manager` is not None or when `extensions` is not None'),
'default': True},
{'name': 'manager', 'type': BuildManager, 'doc': 'the BuildManager to use for I/O', 'default': None},
{'name': 'extensions', 'type': (str, TypeMap, list),
'doc': 'a path to a namespace, a TypeMap, or a list consisting paths to namespaces and TypeMaps',
Expand All @@ -223,15 +223,10 @@ def __init__(self, **kwargs):
popargs('path', 'mode', 'manager', 'extensions', 'load_namespaces',
'file', 'comm', 'driver', 'herd_path', kwargs)
# Define the BuildManager to use
if load_namespaces:
if manager is not None:
warn("loading namespaces from file - ignoring 'manager'")
if extensions is not None:
warn("loading namespaces from file - ignoring 'extensions' argument")
# namespaces are not loaded when creating an NWBHDF5IO object in write mode
if 'w' in mode or mode == 'x':
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here was correct.

raise ValueError("cannot load namespaces from file when writing to it")
if mode in 'wx' or manager is not None or extensions is not None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this introduced a bug.

load_namespaces = False

if load_namespaces:
tm = get_type_map()
super().load_namespaces(tm, path, file=file_obj, driver=driver)
manager = BuildManager(tm)
Expand Down
1 change: 1 addition & 0 deletions src/pynwb/validate.py
Original file line number Diff line number Diff line change
Expand Up @@ -156,6 +156,7 @@
file=sys.stderr,
)
else:
io_kwargs.update(load_namespaces=False)

Check warning on line 159 in src/pynwb/validate.py

View check run for this annotation

Codecov / codecov/patch

src/pynwb/validate.py#L159

Added line #L159 was not covered by tests
namespaces_to_validate = [CORE_NAMESPACE]

if namespace is not None:
Expand Down
1 change: 0 additions & 1 deletion tests/back_compat/test_import_structure.py
Original file line number Diff line number Diff line change
Expand Up @@ -82,7 +82,6 @@ def test_outer_import_structure(self):
"spec",
"testing",
"validate",
"warn",
]
for member in expected_structure:
self.assertIn(member=member, container=current_structure)
26 changes: 18 additions & 8 deletions tests/back_compat/test_read.py
Original file line number Diff line number Diff line change
Expand Up @@ -31,6 +31,16 @@ class TestReadOldVersions(TestCase):
"- expected an array of shape '[None]', got non-array data 'one publication'")],
}

def get_io(self, path):
"""Get an NWBHDF5IO object for the given path."""
with warnings.catch_warnings():
warnings.filterwarnings(
"ignore",
message=r"Ignoring cached namespace .*",
category=UserWarning,
)
return NWBHDF5IO(str(path), 'r')

def test_read(self):
"""Test reading and validating all NWB files in the same folder as this file.

Expand All @@ -43,7 +53,7 @@ def test_read(self):
with self.subTest(file=f.name):
with warnings.catch_warnings(record=True) as warnings_on_read:
warnings.simplefilter("always")
with NWBHDF5IO(str(f), 'r', load_namespaces=True) as io:
with self.get_io(f) as io:
errors = validate(io)
io.read()
for w in warnings_on_read:
Expand All @@ -69,28 +79,28 @@ def test_read(self):
def test_read_timeseries_no_data(self):
"""Test that a TimeSeries written without data is read with data set to the default value."""
f = Path(__file__).parent / '1.5.1_timeseries_no_data.nwb'
with NWBHDF5IO(str(f), 'r') as io:
with self.get_io(f) as io:
read_nwbfile = io.read()
np.testing.assert_array_equal(read_nwbfile.acquisition['test_timeseries'].data, TimeSeries.DEFAULT_DATA)

def test_read_timeseries_no_unit(self):
"""Test that an ImageSeries written without unit is read with unit set to the default value."""
f = Path(__file__).parent / '1.5.1_timeseries_no_unit.nwb'
with NWBHDF5IO(str(f), 'r') as io:
with self.get_io(f) as io:
read_nwbfile = io.read()
self.assertEqual(read_nwbfile.acquisition['test_timeseries'].unit, TimeSeries.DEFAULT_UNIT)

def test_read_imageseries_no_data(self):
"""Test that an ImageSeries written without data is read with data set to the default value."""
f = Path(__file__).parent / '1.5.1_imageseries_no_data.nwb'
with NWBHDF5IO(str(f), 'r') as io:
with self.get_io(f) as io:
read_nwbfile = io.read()
np.testing.assert_array_equal(read_nwbfile.acquisition['test_imageseries'].data, ImageSeries.DEFAULT_DATA)

def test_read_imageseries_no_unit(self):
"""Test that an ImageSeries written without unit is read with unit set to the default value."""
f = Path(__file__).parent / '1.5.1_imageseries_no_unit.nwb'
with NWBHDF5IO(str(f), 'r') as io:
with self.get_io(f) as io:
read_nwbfile = io.read()
self.assertEqual(read_nwbfile.acquisition['test_imageseries'].unit, ImageSeries.DEFAULT_UNIT)

Expand All @@ -100,7 +110,7 @@ def test_read_imageseries_non_external_format(self):
f = Path(__file__).parent / fbase
expected_warning = self.expected_warnings[fbase][0]
with self.assertWarnsWith(UserWarning, expected_warning):
with NWBHDF5IO(str(f), 'r') as io:
with self.get_io(f) as io:
read_nwbfile = io.read()
self.assertEqual(read_nwbfile.acquisition['test_imageseries'].format, "tiff")

Expand All @@ -110,13 +120,13 @@ def test_read_imageseries_nonmatch_starting_frame(self):
f = Path(__file__).parent / fbase
expected_warning = self.expected_warnings[fbase][0]
with self.assertWarnsWith(UserWarning, expected_warning):
with NWBHDF5IO(str(f), 'r') as io:
with self.get_io(f) as io:
read_nwbfile = io.read()
np.testing.assert_array_equal(read_nwbfile.acquisition['test_imageseries'].starting_frame, [1, 2, 3])

def test_read_subject_no_age__reference(self):
"""Test that reading a Subject without an age__reference set with NWB schema 2.5.0 sets the value to None"""
f = Path(__file__).parent / '2.2.0_subject_no_age__reference.nwb'
with NWBHDF5IO(str(f), 'r') as io:
with self.get_io(f) as io:
read_nwbfile = io.read()
self.assertIsNone(read_nwbfile.subject.age__reference)
2 changes: 1 addition & 1 deletion tests/read_dandi/test_read_dandi.py
Original file line number Diff line number Diff line change
Expand Up @@ -47,7 +47,7 @@ def read_first_nwb_asset():
s3_url = first_asset.get_content_url(follow_redirects=1, strip_query=True)

try:
with NWBHDF5IO(path=s3_url, load_namespaces=True, driver="ros3") as io:
with NWBHDF5IO(path=s3_url, driver="ros3") as io:
io.read()
except Exception as e:
print(traceback.format_exc())
Expand Down
4 changes: 2 additions & 2 deletions tests/unit/test_file.py
Original file line number Diff line number Diff line change
Expand Up @@ -542,7 +542,7 @@ def test_simple(self):
lab='Chang Lab')
with NWBHDF5IO(self.path, 'w') as io:
io.write(nwbfile)
with NWBHDF5IO(self.path, 'r', load_namespaces=True) as reader:
with NWBHDF5IO(self.path, 'r') as reader:
nwbfile = reader.read()


Expand All @@ -563,7 +563,7 @@ def test_simple(self):
with NWBHDF5IO(self.path, 'w') as io:
io.write(nwbfile, cache_spec=False)

with NWBHDF5IO(self.path, 'r', load_namespaces=True) as reader:
with NWBHDF5IO(self.path, 'r') as reader:
nwbfile = reader.read()


Expand Down
58 changes: 24 additions & 34 deletions tests/validation/test_validate.py
Original file line number Diff line number Diff line change
Expand Up @@ -199,64 +199,54 @@ class TestValidateFunction(TestCase):
# 1.0.3_nwbfile.nwb has cached "core" specification
# 1.1.2_nwbfile.nwb has cached "core" and "hdmf-common" specificaitions

def get_io(self, path):
"""Get an NWBHDF5IO object for the given path, ignoring the warning about ignoring cached namespaces."""
with warnings.catch_warnings():
warnings.filterwarnings(
"ignore",
message=r"Ignoring cached namespace .*",
category=UserWarning,
)
return NWBHDF5IO(str(path), 'r')

def test_validate_io_no_cache(self):
"""Test that validating a file with no cached spec against the core namespace succeeds."""
with NWBHDF5IO('tests/back_compat/1.0.2_nwbfile.nwb', 'r') as io:
with self.get_io('tests/back_compat/1.0.2_nwbfile.nwb') as io:
errors = validate(io)
self.assertEqual(errors, [])

def test_validate_io_no_cache_bad_ns(self):
"""Test that validating a file with no cached spec against a specified, unknown namespace fails."""
with NWBHDF5IO('tests/back_compat/1.0.2_nwbfile.nwb', 'r') as io:
with self.get_io('tests/back_compat/1.0.2_nwbfile.nwb') as io:
with self.assertRaisesWith(KeyError, "\"'notfound' not a namespace\""):
validate(io, 'notfound')

def test_validate_io_cached(self):
"""Test that validating a file with cached spec against its cached namespace succeeds."""
with NWBHDF5IO('tests/back_compat/1.1.2_nwbfile.nwb', 'r') as io:
with self.get_io('tests/back_compat/1.1.2_nwbfile.nwb') as io:
errors = validate(io)
self.assertEqual(errors, [])

def test_validate_io_cached_extension(self):
"""Test that validating a file with cached spec against its cached namespaces succeeds."""
with warnings.catch_warnings(record=True):
warnings.filterwarnings(
"ignore",
message=r"Ignoring cached namespace .*",
category=UserWarning,
)
with NWBHDF5IO('tests/back_compat/2.1.0_nwbfile_with_extension.nwb', 'r', load_namespaces=True) as io:
errors = validate(io)
self.assertEqual(errors, [])
with self.get_io('tests/back_compat/2.1.0_nwbfile_with_extension.nwb') as io:
errors = validate(io)
self.assertEqual(errors, [])

def test_validate_io_cached_extension_pass_ns(self):
"""Test that validating a file with cached extension spec against the extension namespace succeeds."""
with warnings.catch_warnings(record=True):
warnings.filterwarnings(
"ignore",
message=r"Ignoring cached namespace .*",
category=UserWarning,
)
with NWBHDF5IO('tests/back_compat/2.1.0_nwbfile_with_extension.nwb', 'r', load_namespaces=True) as io:
errors = validate(io, 'ndx-testextension')
self.assertEqual(errors, [])
with self.get_io('tests/back_compat/2.1.0_nwbfile_with_extension.nwb') as io:
errors = validate(io, 'ndx-testextension')
self.assertEqual(errors, [])

def test_validate_io_cached_core_with_io(self):
"""
For back-compatability, test that validating a file with cached extension spec against the core
namespace succeeds when using the `io` + `namespace` keywords.
"""
with warnings.catch_warnings(record=True):
warnings.filterwarnings(
"ignore",
message=r"Ignoring cached namespace .*",
category=UserWarning,
)
with NWBHDF5IO(
path='tests/back_compat/2.1.0_nwbfile_with_extension.nwb', mode='r', load_namespaces=True
) as io:
results = validate(io=io, namespace="core")
self.assertEqual(results, [])
with self.get_io(path='tests/back_compat/2.1.0_nwbfile_with_extension.nwb') as io:
results = validate(io=io, namespace="core")
self.assertEqual(results, [])

def test_validate_file_cached_extension(self):
"""
Expand Down Expand Up @@ -310,13 +300,13 @@ def test_validate_file_cached_no_cache_bad_ns(self):

def test_validate_io_cached_bad_ns(self):
"""Test that validating a file with cached spec against a specified, unknown namespace fails."""
with NWBHDF5IO('tests/back_compat/1.1.2_nwbfile.nwb', 'r') as io:
with self.get_io('tests/back_compat/1.1.2_nwbfile.nwb') as io:
with self.assertRaisesWith(KeyError, "\"'notfound' not a namespace\""):
validate(io, 'notfound')

def test_validate_io_cached_hdmf_common(self):
"""Test that validating a file with cached spec against the hdmf-common namespace fails."""
with NWBHDF5IO('tests/back_compat/1.1.2_nwbfile.nwb', 'r') as io:
with self.get_io('tests/back_compat/1.1.2_nwbfile.nwb') as io:
# TODO this error should not be different from the error when using the validate script above
msg = "builder must have data type defined with attribute 'data_type'"
with self.assertRaisesWith(ValueError, msg):
Expand Down