Merge branch 'dev' into provenance-tracking
stephprince authored Sep 19, 2024
2 parents 7cdec94 + 17adccf commit b6d3152
Showing 29 changed files with 620 additions and 165 deletions.
2 changes: 1 addition & 1 deletion .github/PULL_REQUEST_TEMPLATE.md
@@ -12,6 +12,6 @@ Show how to reproduce the new behavior (can be a bug fix or a new feature)
- [ ] Did you update CHANGELOG.md with your changes?
- [ ] Have you checked our [Contributing](https://github.com/NeurodataWithoutBorders/pynwb/blob/dev/docs/CONTRIBUTING.rst) document?
- [ ] Have you ensured the PR clearly describes the problem and the solution?
- - [ ] Is your contribution compliant with our coding style? This can be checked by running `flake8` from the source directory.
+ - [ ] Is your contribution compliant with our coding style? This can be checked by running `ruff check . && codespell` from the source directory.
- [ ] Have you checked to ensure that there aren't other open [Pull Requests](https://github.com/NeurodataWithoutBorders/pynwb/pulls) for the same change?
- [ ] Have you included the relevant issue number using "Fix #XXX" notation where XXX is the issue number? By including "Fix #XXX" you allow GitHub to close issue #XXX when the PR is merged.
9 changes: 5 additions & 4 deletions .github/workflows/run_all_tests.yml
@@ -38,8 +38,9 @@ jobs:
- { name: windows-python3.12 , test-tox-env: py312 , build-tox-env: build-py312 , python-ver: "3.12", os: windows-latest }
- { name: windows-python3.12-upgraded , test-tox-env: py312-upgraded , build-tox-env: build-py312-upgraded , python-ver: "3.12", os: windows-latest }
- { name: windows-python3.12-prerelease, test-tox-env: py312-prerelease, build-tox-env: build-py312-prerelease, python-ver: "3.11", os: windows-latest }
+ # minimum versions of dependencies do not have wheels or cannot be built on macos-arm64
- { name: macos-python3.8-minimum , test-tox-env: py38-minimum , build-tox-env: build-py38-minimum , python-ver: "3.8" , os: macos-13 }
- - { name: macos-python3.9 , test-tox-env: py39 , build-tox-env: build-py39 , python-ver: "3.9" , os: macos-13 }
+ - { name: macos-python3.9 , test-tox-env: py39 , build-tox-env: build-py39 , python-ver: "3.9" , os: macos-latest }
- { name: macos-python3.10 , test-tox-env: py310 , build-tox-env: build-py310 , python-ver: "3.10", os: macos-latest }
- { name: macos-python3.11 , test-tox-env: py311 , build-tox-env: build-py311 , python-ver: "3.11", os: macos-latest }
- { name: macos-python3.11-opt , test-tox-env: py311-optional , build-tox-env: build-py311 , python-ver: "3.11", os: macos-latest }
@@ -98,6 +99,7 @@ jobs:
- { name: windows-gallery-python3.8-minimum , test-tox-env: gallery-py38-minimum , python-ver: "3.8" , os: windows-latest }
- { name: windows-gallery-python3.12-upgraded , test-tox-env: gallery-py312-upgraded , python-ver: "3.12", os: windows-latest }
- { name: windows-gallery-python3.12-prerelease, test-tox-env: gallery-py312-prerelease, python-ver: "3.12", os: windows-latest }
+ # minimum versions of dependencies do not have wheels or cannot be built on macos-arm64
- { name: macos-gallery-python3.8-minimum , test-tox-env: gallery-py38-minimum , python-ver: "3.8" , os: macos-13 }
- { name: macos-gallery-python3.12-upgraded , test-tox-env: gallery-py312-upgraded , python-ver: "3.12", os: macos-latest }
- { name: macos-gallery-python3.12-prerelease , test-tox-env: gallery-py312-prerelease, python-ver: "3.12", os: macos-latest }
@@ -201,7 +203,7 @@ jobs:
include:
- { name: conda-linux-python3.12-ros3 , python-ver: "3.12", os: ubuntu-latest }
- { name: conda-windows-python3.12-ros3, python-ver: "3.12", os: windows-latest }
- - { name: conda-macos-python3.12-ros3 , python-ver: "3.12", os: macos-13 } # This is due to DANDI not supporting osx-arm64. Will support macos-latest when this changes.
+ - { name: conda-macos-python3.12-ros3 , python-ver: "3.12", os: macos-latest }
steps:
- name: Cancel non-latest runs
uses: styfle/[email protected]
@@ -248,7 +250,7 @@ jobs:
include:
- { name: conda-linux-gallery-python3.12-ros3 , python-ver: "3.12", os: ubuntu-latest }
- { name: conda-windows-gallery-python3.12-ros3, python-ver: "3.12", os: windows-latest }
- - { name: conda-macos-gallery-python3.12-ros3 , python-ver: "3.12", os: macos-13 } # This is due to DANDI not supporting osx-arm64. Will support macos-latest when this changes.
+ - { name: conda-macos-gallery-python3.12-ros3 , python-ver: "3.12", os: macos-latest }
steps:
- name: Cancel non-latest runs
uses: styfle/[email protected]
@@ -273,7 +275,6 @@ jobs:

- name: Install run dependencies
run: |
- pip install matplotlib
pip install .
pip list
5 changes: 3 additions & 2 deletions .github/workflows/run_tests.yml
@@ -24,6 +24,7 @@ jobs:
- { name: linux-python3.12-upgraded , test-tox-env: py312-upgraded , build-tox-env: build-py312-upgraded , python-ver: "3.12", os: ubuntu-latest , upload-wheels: true }
- { name: windows-python3.8-minimum , test-tox-env: py38-minimum , build-tox-env: build-py38-minimum , python-ver: "3.8" , os: windows-latest }
- { name: windows-python3.12-upgraded , test-tox-env: py312-upgraded , build-tox-env: build-py312-upgraded , python-ver: "3.12", os: windows-latest }
+ # minimum versions of dependencies do not have wheels or cannot be built on macos-arm64
- { name: macos-python3.8-minimum , test-tox-env: py38-minimum , build-tox-env: build-py38-minimum , python-ver: "3.8" , os: macos-13 }
steps:
- name: Cancel non-latest runs
@@ -63,7 +64,7 @@ jobs:
- name: Upload distribution as a workspace artifact
if: ${{ matrix.upload-wheels }}
- uses: actions/upload-artifact@v3
+ uses: actions/upload-artifact@v4
with:
name: distributions
path: dist
@@ -282,7 +283,7 @@ jobs:
python-version: '3.12'

- name: Download wheel and source distributions from artifact
- uses: actions/download-artifact@v3
+ uses: actions/download-artifact@v4
with:
name: distributions
path: dist
3 changes: 3 additions & 0 deletions .gitignore
@@ -77,3 +77,6 @@ tests/coverage/htmlcov

# Version
_version.py

+ .core_typemap_version
+ core_typemap.pkl
24 changes: 22 additions & 2 deletions CHANGELOG.md
@@ -1,15 +1,35 @@
# PyNWB Changelog

- ## PyNWB 2.8.1 (Upcoming)
+ ## PyNWB 2.8.3 (Upcoming)

+ ### Performance
+ - Cache global type map to speed import 3X. @sneakers-the-rat [#1931](https://github.com/NeurodataWithoutBorders/pynwb/pull/1931)

+ ## PyNWB 2.8.2 (September 9, 2024)

+ ### Enhancements and minor changes
+ - Added support for numpy 2.0. @mavaylon1 [#1956](https://github.com/NeurodataWithoutBorders/pynwb/pull/1956)
+ - Made `get_cached_namespaces_to_validate` a public function. @stephprince [#1961](https://github.com/NeurodataWithoutBorders/pynwb/pull/1961)

+ ### Documentation and tutorial enhancements
+ - Added pre-release pull request instructions to the release process documentation. @stephprince [#1928](https://github.com/NeurodataWithoutBorders/pynwb/pull/1928)
+ - Added a section on how to use the `family` driver in `h5py` for splitting data across multiple files. @oruebel [#1949](https://github.com/NeurodataWithoutBorders/pynwb/pull/1949)

+ ### Bug fixes
+ - Fixed the `can_read` method to return False if no NWB file version can be found. @stephprince [#1934](https://github.com/NeurodataWithoutBorders/pynwb/pull/1934)
+ - Changed `epoch_tags` to be an NWBFile property instead of a constructor argument. @stephprince [#1935](https://github.com/NeurodataWithoutBorders/pynwb/pull/1935)
+ - Exposed an option to not cache the spec in `NWBHDF5IO.export`. @rly [#1959](https://github.com/NeurodataWithoutBorders/pynwb/pull/1959)

+ ## PyNWB 2.8.1 (July 3, 2024)

### Documentation and tutorial enhancements
- Simplified the introduction to NWB tutorial. @rly [#1914](https://github.com/NeurodataWithoutBorders/pynwb/pull/1914)
- Simplified the ecephys and ophys tutorials. [#1915](https://github.com/NeurodataWithoutBorders/pynwb/pull/1915)
- Add comments to `src/pynwb/io/file.py` to improve developer documentation. @rly [#1925](https://github.com/NeurodataWithoutBorders/pynwb/pull/1925)

### Bug fixes
- Fixed use of `channel_conversion` in `TimeSeries` `get_data_in_units`. @rohanshah [#1923](https://github.com/NeurodataWithoutBorders/pynwb/pull/1923)


## PyNWB 2.8.0 (May 28, 2024)

### Enhancements and minor changes
6 changes: 3 additions & 3 deletions README.rst
@@ -49,10 +49,10 @@ Overall Health
:target: https://github.com/neurodatawithoutborders/pynwb/blob/dev/license.txt
:alt: PyPI - License

- **Conda**
+ **Conda Feedstock**

- .. image:: https://circleci.com/gh/conda-forge/pynwb-feedstock.svg?style=shield
-    :target: https://circleci.com/gh/conda-forge/pynwb-feedstock
+ .. image:: https://dev.azure.com/conda-forge/feedstock-builds/_apis/build/status/pynwb-feedstock?branchName=main
+    :target: https://dev.azure.com/conda-forge/feedstock-builds/_build/latest?definitionId=5703&branchName=main
:alt: Conda Feedstock Status

NWB Format API
2 changes: 2 additions & 0 deletions docs/gallery/advanced_io/plot_iterative_write.py
@@ -1,4 +1,6 @@
"""
+ .. _iterative_write:
Iterative Data Write
====================

@@ -13,7 +13,7 @@
HDF5 files with NWB data files via external links. To make things more concrete, let's look at the following use
case. We want to simultaneously record multiple data streams during data acquisition. Using the concept of external
links allows us to save each data stream to an external HDF5 file during data acquisition and to
- afterwards link the data into a single NWB file. In this case, each recording becomes represented by a
+ afterward link the data into a single NWB file. In this case, each recording is represented by a
separate file-system object that can be set as read-only once the experiment is done. In the following
we use :py:meth:`~pynwb.base.TimeSeries` as an example, but the same approach works for other
NWBContainers as well.
@@ -42,7 +42,7 @@
Creating test data
- ---------------------------
+ ^^^^^^^^^^^^^^^^^^
In the following, we create two :py:meth:`~pynwb.base.TimeSeries`, each written to a separate file.
We then show how we can integrate these files into a single NWBFile.
@@ -61,7 +61,7 @@
# Create the base data
start_time = datetime(2017, 4, 3, 11, tzinfo=tzlocal())
data = np.arange(1000).reshape((100, 10))
- timestamps = np.arange(100)
+ timestamps = np.arange(100, dtype=float)
filename1 = "external1_example.nwb"
filename2 = "external2_example.nwb"
filename3 = "external_linkcontainer_example.nwb"
@@ -105,12 +105,12 @@

#####################
# Linking to select datasets
- # --------------------------
+ # ^^^^^^^^^^^^^^^^^^^^^^^^^^
#

####################
# Step 1: Create the new NWBFile
- # ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+ # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# Create the first file
nwbfile4 = NWBFile(
@@ -122,7 +122,7 @@

####################
# Step 2: Get the dataset you want to link to
- # ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+ # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Now let's open our test files and retrieve our timeseries.
#
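####################
# For illustration, a minimal sketch of this step (it assumes, as above, that each
# test file holds a single :py:meth:`~pynwb.base.TimeSeries` in its acquisition group):

io1 = NWBHDF5IO(filename1, mode="r")  # keep the IO open while we link to its data
nwbfile1 = io1.read()
timeseries_1 = list(nwbfile1.acquisition.values())[0]  # grab the only TimeSeries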

@@ -134,7 +134,7 @@

####################
# Step 3: Create the object you want to link to the data
- # ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+ # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#
# To link to the dataset, we can simply assign the data object (here ``timeseries_1.data``) to a new ``TimeSeries``, as shown below.
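####################
# A minimal sketch of such a link (the name and unit below are illustrative):

timeseries_1_linked = TimeSeries(
    name="test_linked_timeseries",
    data=timeseries_1.data,              # assigning the existing dataset creates a link, not a copy
    timestamps=timeseries_1.timestamps,  # timestamps can be shared the same way
    unit="SIunit",
)
nwbfile4.add_acquisition(timeseries_1_linked)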

@@ -167,7 +167,7 @@

####################
# Step 4: Write the data
- # ^^^^^^^^^^^^^^^^^^^^^^^
+ # ~~~~~~~~~~~~~~~~~~~~~~~~
#
with NWBHDF5IO(filename4, "w") as io4:
# Use link_data=True to specify default behavior to link rather than copy data
@@ -185,7 +185,7 @@

####################
# Linking to whole Containers
- # ---------------------------
+ # ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
#
# Appending to files and linking is made possible by passing around the same
# :py:class:`~hdmf.build.manager.BuildManager`. You can get a manager to pass around
@@ -203,7 +203,7 @@

####################
# Step 1: Get the container object you want to link to
- # ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+ # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Now let's open our test files and retrieve our timeseries.
#

@@ -219,7 +219,7 @@

####################
# Step 2: Add the container to another NWBFile
- # ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+ # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# To integrate both :py:meth:`~pynwb.base.TimeSeries` into a single file, we simply create a new
# :py:meth:`~pynwb.file.NWBFile` and add our existing :py:meth:`~pynwb.base.TimeSeries` to it. PyNWB's
# :py:class:`~pynwb.NWBHDF5IO` backend then automatically detects that the TimeSeries have already
@@ -247,7 +247,7 @@
# ------------------------------
#
# Using the :py:func:`~pynwb.file.NWBFile.copy` method allows us to easily create a shallow copy
- # of a whole NWB:N file with links to all data in the original file. For example, we may want to
+ # of a whole NWB file with links to all data in the original file. For example, we may want to
# store processed data in a new file separate from the raw data, while still being able to access
# the raw data. See the :ref:`scratch` tutorial for a detailed example.
#
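####################
# A minimal sketch of this pattern (the output filename is illustrative, and we
# assume ``filename4`` from the steps above):

raw_io = NWBHDF5IO(filename4, mode="r")
raw_nwbfile = raw_io.read()
proc_nwbfile = raw_nwbfile.copy()  # shallow copy: the data remain linked, not duplicated

# sharing the manager lets PyNWB resolve the links back to the raw file on write
with NWBHDF5IO("processed_example.nwb", mode="w", manager=raw_io.manager) as io:
    io.write(proc_nwbfile)
raw_io.close()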
@@ -259,5 +259,128 @@
# External links are convenient but to share data we may want to hand a single file with all the
# data to our collaborator rather than having to collect all relevant files. To do this,
# :py:class:`~hdmf.backends.hdf5.h5tools.HDF5IO` (and in turn :py:class:`~pynwb.NWBHDF5IO`)
- # provide the convenience function :py:meth:`~hdmf.backends.hdf5.h5tools.HDF5IO.copy_file`,
- # which copies an HDF5 file and resolves all external links.
+ # provide the convenience function :py:meth:`~hdmf.backends.hdf5.h5tools.HDF5IO.export`,
+ # which can copy the file and resolve all external links.
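####################
# A hedged sketch of such an export (``link_data=False`` asks the backend to copy
# the linked datasets into the new file; the output filename is illustrative):

with NWBHDF5IO(filename4, mode="r") as src_io:
    with NWBHDF5IO("merged_example.nwb", mode="w") as export_io:
        export_io.export(src_io=src_io, write_args={"link_data": False})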


####################
# Automatically splitting large data across multiple HDF5 files
# -------------------------------------------------------------------
#
# For extremely large datasets it can be useful to split data across multiple files, e.g., in cases where
# the file system does not allow for large files. While we can achieve this by writing different
# components (e.g., :py:meth:`~pynwb.base.TimeSeries`) to different files as described above,
# this option does not allow splitting data from single datasets. An alternative option is to use the
# ``family`` driver in ``h5py`` to automatically split the NWB file into a collection of many HDF5 files.
# The ``family`` driver stores the file on disk as a series of fixed-length chunks (each in its own file).
# In practice, to write very large arrays, we can combine this approach with :ref:`iterative_write` to
# avoid having to load all data into memory. In the example shown here we use a manual approach to
# iterative write by using :py:class:`~hdmf.backends.hdf5.h5_utils.H5DataIO` to create an empty dataset and
# then filling in the data afterward.

####################
# Step 1: Create the NWBFile as usual
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

from pynwb import NWBFile
from pynwb.base import TimeSeries
from datetime import datetime
from uuid import uuid4
from hdmf.backends.hdf5 import H5DataIO
import numpy as np

# Create an NWBFile object
nwbfile = NWBFile(session_description='example file family',
                  identifier=str(uuid4()),
                  session_start_time=datetime.now().astimezone())

# Create the data as an empty dataset so that we can write to it later
data = H5DataIO(maxshape=(None, 10),  # make the first dimension expandable
                dtype=np.float32,     # create the data as float32
                shape=(0, 10),        # initial data shape used to initialize the empty dataset
                chunks=(1000, 10))

# Create a TimeSeries object
time_series = TimeSeries(name='example_timeseries',
                         data=data,
                         starting_time=0.0,
                         rate=1.0,
                         unit='mV')

# Add the TimeSeries to the NWBFile
nwbfile.add_acquisition(time_series)

####################
# Step 2: Open the new file with the ``family`` driver and write
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
# Here we need to open the file with ``h5py`` first to set up the driver, and then we can use
# that file with :py:class:`~pynwb.NWBHDF5IO`. This is required because :py:class:`~pynwb.NWBHDF5IO`
# currently does not support passing the ``memb_size`` option required by the ``family`` driver.

import h5py
from pynwb import NWBHDF5IO

# Define the size of the individual files, determining the number of files to create
# chunk_size = 1 * 1024**3 # 1GB per file
chunk_size = 1024**2 # 1MB just for testing

# filename pattern
filename_pattern = 'family_nwb_file_%d.nwb'

# Create the HDF5 file using the family driver
with h5py.File(name=filename_pattern, mode='w', driver='family', memb_size=chunk_size) as f:

    # Use NWBHDF5IO to write the NWBFile to the HDF5 file
    with NWBHDF5IO(file=f, mode='w') as io:
        io.write(nwbfile)

    # Write new data iteratively to the file
    for i in range(10):
        start_index = i * 1000
        stop_index = start_index + 1000
        data.dataset.resize((stop_index, 10))  # Resize the dataset
        data.dataset[start_index:stop_index, :] = i  # Set the additional values

####################
# .. note::
#
#    Alternatively, we could also have used the :ref:`iterative_write` features to write the data
#    iteratively, directly as part of the ``io.write`` call, instead of manually afterward.
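####################
# A sketch of that generator-based alternative (``DataChunkIterator`` comes from
# HDMF; the generator and names below are illustrative):

from hdmf.data_utils import DataChunkIterator

def iter_data():
    for i in range(10):
        yield np.full((1000, 10), fill_value=float(i), dtype=np.float32)

iter_ts = TimeSeries(name="example_iter_timeseries",
                     data=DataChunkIterator(data=iter_data(), maxshape=(None, 10)),
                     starting_time=0.0, rate=1.0, unit="mV")
# a TimeSeries like this would be added to the NWBFile before the initial io.write call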

####################
# Step 3: Read a file written with the family driver
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
#


# Open the HDF5 file using the family driver
with h5py.File(name=filename_pattern, mode='r', driver='family', memb_size=chunk_size) as f:
    # Use NWBHDF5IO to read the NWBFile from the HDF5 file
    with NWBHDF5IO(file=f, manager=None, mode='r') as io:
        nwbfile = io.read()
        print(nwbfile)


####################
# .. note::
#
#    The filename you provide when using the ``family`` driver must contain a printf-style integer format
#    code (e.g., ``%d``), which will be replaced by the file sequence number.
#
# .. note::
#
#    The ``memb_size`` parameter must be set on both write and read. As such, reading the file requires
#    the user to know the ``memb_size`` that was used for writing.
#
# .. warning::
#
#    The DANDI archive may not support NWB files that are split in this fashion.
#
# .. note::
#
#    Other file drivers, e.g., ``split`` or ``multi``, could be used in a similar fashion.
#    However, not all HDF5 drivers are supported by the high-level API of ``h5py``
#    and as such may require a more complex setup via the low-level HDF5 API in ``h5py``.
#

