[Feature] Add support for SQLiteStore for ZarrIO #66

Closed
wants to merge 88 commits into from
88 commits
9d26290
Add support for using select user-defined zarr stores
oruebel Jan 5, 2023
0995543
Update resolution of references to work also for file-based Zarr stores
oruebel Jan 5, 2023
1a94f17
Update test_io_zarr.py to allow file-based Zarr stores
oruebel Jan 5, 2023
88d5dfb
Add SQLite test draft
oruebel Jan 5, 2023
7076b82
Updated changelog
oruebel Jan 5, 2023
2bb8d78
Merge branch 'dev' into add/alternate_stores
oruebel Jan 6, 2023
d9a3a75
Merge branch 'dev' into add/alternate_stores
oruebel Jan 6, 2023
1dca7df
Add ZarrIO.file property to ease implementation of tests
oruebel Jan 6, 2023
022cf46
Refactored ZarrIO tests for consistency and to run all backends via d…
oruebel Jan 7, 2023
0a59d9c
Update NWBZarrIO to support the new path options from ZarrIO
oruebel Jan 7, 2023
9b77f40
Minor changes to tests and comments
oruebel Jan 7, 2023
6a6b8f0
Update test_io_convert.py to test with all supported zarr.storage bac…
oruebel Jan 7, 2023
4c11f8c
Added docs on how to integrate new backends stores with ZarrIO
oruebel Jan 7, 2023
9a28380
Clarify the docs to integrate stores
oruebel Jan 7, 2023
6fc99cb
Update storage docs to add missing reserved links and groups
oruebel Jan 7, 2023
c85052c
Add DEFAULT_SPEC_LOC_DIR and SUPPORTED_ZARR_STORES module variable of…
oruebel Jan 7, 2023
a03bc32
Minor fixes to TempStore tests
oruebel Jan 7, 2023
e3e06a3
Add Mixin and test cases to test conversion between Zarr and Zarr
oruebel Jan 7, 2023
40c23ca
Update ZarrIO tutorial to describe using custom data stores
oruebel Jan 7, 2023
03b16d4
Update Changelog
oruebel Jan 7, 2023
55daea8
Attempt to fix Windows tests
oruebel Jan 7, 2023
fd00185
Add note on why we set dir on TempStore
oruebel Jan 7, 2023
954cf4f
Remove commented code
oruebel Jan 8, 2023
18191d5
Fix bad test setup
oruebel Jan 8, 2023
bc4386f
Set store paths in child classes
oruebel Jan 8, 2023
b18c41e
Added some more details to integrating_data_stores.rst
oruebel Jan 8, 2023
731deeb
Added SQLiteStore support and tests (some link tests still failing)
oruebel Jan 8, 2023
9c231d4
Fix flake8
oruebel Jan 8, 2023
29dd7f6
Update path calculation for links to fix SQLite linking
oruebel Jan 8, 2023
b4af48c
Fix failing test case
oruebel Jan 8, 2023
c4e4c8b
Move resources readme to avoid including it in the docs
oruebel Jan 8, 2023
39bd120
Filter warnings in NWB conversion tutorial
oruebel Jan 8, 2023
be3d156
Updated changelog
oruebel Jan 8, 2023
87f7c54
Update changelog to add PR links
oruebel Jan 8, 2023
8e54adb
Attempt to fix file access conflict in test suite for Windows
oruebel Jan 8, 2023
998409c
Attempt to fix Permission issues on Windows
oruebel Jan 8, 2023
1ea405a
Attempt to fix Permission issues on Windows
oruebel Jan 8, 2023
e64cd06
Avoid explicit use of zarr.open and use file from ZarrIO.file instead
oruebel Jan 8, 2023
9da70dd
Attempt to catch permission issues on Windows during tests
oruebel Jan 8, 2023
07a9b8c
Attempt to catch permission issues on Windows during tests
oruebel Jan 8, 2023
82ba7ca
Add __del__ and __exit__ to ensure stores are closed on exit/delete
oruebel Jan 9, 2023
df0891b
Catch error on multiple close of SQLite store
oruebel Jan 9, 2023
be1fc14
Fix tests accessing closed SQLite store
oruebel Jan 9, 2023
ba8d74a
Make sure io is deleted on TestZarrWriter tests
oruebel Jan 9, 2023
7f19f45
Do not catch permission error on tests to ease debugging on windows CI
oruebel Jan 9, 2023
79c9b4d
Close stores opened to resolve references on ZarrIO close
oruebel Jan 9, 2023
b131006
Add docs for tracking opened stores
oruebel Jan 9, 2023
c9d7261
Merge branch 'dev' into add/alternate_stores
oruebel Jan 11, 2023
7595087
Merge branch 'dev' into add/alternate_stores
oruebel Jan 11, 2023
8d61358
Minor text fixes
rly Jan 17, 2023
c96aad6
Minor text fixes
rly Jan 17, 2023
4a316c8
Minor text fixes
rly Jan 17, 2023
dcbae13
Minor text fixes
rly Jan 17, 2023
2f36fd7
Minor text fixes
rly Jan 17, 2023
5dbff7e
Minor text edits
rly Jan 17, 2023
220ad36
Increase HDMF version to 3.5
oruebel Jan 17, 2023
86cfef2
Removed filepath param from get_builder_exists_on_disk
oruebel Jan 17, 2023
b94bee8
Merge branch 'add/alternate_stores' into add/sqlstore
oruebel Jan 17, 2023
c14ca94
Fix merge error in docs
oruebel Jan 17, 2023
d629f9d
Update integrate new store docs
oruebel Jan 17, 2023
6298a67
Merge branch 'add/alternate_stores' into add/sqlstore
oruebel Jan 17, 2023
92ee3e0
Fix bad documentation of class members of mixins
oruebel Jan 17, 2023
10c983f
Remove references to SQLite store
oruebel Jan 18, 2023
077875d
Add missing message to assert in MixinTestCaseConvert
oruebel Jan 18, 2023
f373727
Consistently close file in test when explicitly opened
oruebel Jan 18, 2023
2bce016
Simplify test to reuse IO object
oruebel Jan 18, 2023
5e8fb0a
Fix bugs from bad merge
oruebel Jan 18, 2023
ec3adf2
Fix bugs from bad merge
oruebel Jan 18, 2023
c518569
Fix error from bad merge in CHANGELOG
oruebel Jan 18, 2023
94c7aef
Merge branch 'dev' into add/sqlstore
oruebel Jan 18, 2023
c9faccc
Fix flake8 error due to merge conflict resolve
oruebel Jan 18, 2023
5414adf
Remove unused warning filter
oruebel Jan 18, 2023
d10e03f
Remove unused warning filter
oruebel Jan 18, 2023
6eef41a
Changed SUPPORTED_ZARR_STORES back to tuple
oruebel Jan 18, 2023
d63d439
Updated changelog
oruebel Jan 18, 2023
ebdca10
Fix flake8 for tests
oruebel Jan 18, 2023
63c2035
Merge branch 'dev' into add/sqlstore
oruebel Jan 18, 2023
ad56654
Close EXPORT_PATHS and WRITE_PATHS stores to try and fix Windows tests
oruebel Jan 18, 2023
aac135b
Fix flake8
oruebel Jan 18, 2023
f0d86d0
Attempt to close stores to fix Windows tests
oruebel Jan 18, 2023
d36e738
Do full tearDown and setUp in iteration of conversion tests
oruebel Jan 18, 2023
47d0a90
Remove unnecessary test_stimple test case
oruebel Jan 18, 2023
cf3c571
Added note on closing stores to load_namespaces
oruebel Jan 18, 2023
595c9cb
Attempt to close store in test_cache_spec
oruebel Jan 18, 2023
64525fe
Merge branch 'dev' into add/sqlstore
oruebel Jan 18, 2023
fb601ba
Merge branch 'dev' into add/sqlstore
oruebel Aug 30, 2023
2ad1394
Merge branch 'dev' into add/sqlstore
oruebel Oct 1, 2023
4a29f9a
Update CHANGELOG.md
oruebel Oct 1, 2023
20 changes: 16 additions & 4 deletions CHANGELOG.md
@@ -1,5 +1,17 @@
# HDMF-ZARR Changelog

## 0.5.0 (Future)

### New Features
* Added support for using ``SQLiteStore`` Zarr storage backend with ``ZarrIO``
@oruebel [#66](https://github.com/hdmf-dev/hdmf-zarr/pull/66)

### Internal changes
* Added ``ZarrIO.__opened_stores_references`` to track Zarr stores opened to resolve
references in order to be able to close those stores as part of ``ZarrIO.close()``
@oruebel [#66](https://github.com/hdmf-dev/hdmf-zarr/pull/66)


## 0.4.0 (Upcoming)

### Enhancements
@@ -24,6 +36,10 @@
``NestedDirectoryStore`` Zarr storage backends with ``ZarrIO`` and ``NWBZarrIO``.
@oruebel [#62](https://github.com/hdmf-dev/hdmf-zarr/pull/62)

### API Changes
* Removed unused ``filepath`` argument from ``ZarrIO.get_builder_exists_on_disk``
[#62](https://github.com/hdmf-dev/hdmf-zarr/pull/62)

### Minor enhancements
* Updated handling of references on read to simplify future integration of file-based Zarr
stores (e.g., ZipStore or database stores). @oruebel [#62](https://github.com/hdmf-dev/hdmf-zarr/pull/62)
@@ -37,14 +53,10 @@
* Updated tests to handle upcoming changes to ``HDMFIO``. @rly
[#102](https://github.com/hdmf-dev/hdmf-zarr/pull/102)


### Docs
* Added developer documentation on how to integrate new storage backends with ZarrIO. @oruebel
[#62](https://github.com/hdmf-dev/hdmf-zarr/pull/62)

### API Changes
* Removed unused ``filepath`` argument from ``ZarrIO.get_builder_exists_on_disk`` [#62](https://github.com/hdmf-dev/hdmf-zarr/pull/62)

### Bug fixes
* Fixed error in nightly CI. @rly [#93](https://github.com/hdmf-dev/hdmf-zarr/pull/93)

File renamed without changes.
5 changes: 5 additions & 0 deletions docs/source/integrating_data_stores.rst
@@ -40,6 +40,11 @@ Updating ZarrIO
particular in case the links to your store also modify the storage schema for links
(e.g., if you need to store additional metadata in order to resolve links to your store).

* Note, if your store has to be closed explicitly (e.g., a SQLiteStore), then any stores
that are opened to resolve references should be added to the ``ZarrIO.__opened_stores_references``
list so that they are closed when :py:meth:`~hdmf_zarr.backend.ZarrIO.close` is
called.
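
The bookkeeping described in this note can be sketched roughly as follows. This is a standard-library-only illustration and not the hdmf-zarr implementation: `FakeStore` and `RefResolver` are hypothetical stand-ins for a closable Zarr store and for `ZarrIO`, and clearing the list at the end of `close()` is a choice made for this sketch.

```python
class FakeStore:
    """Hypothetical stand-in for a store that must be closed explicitly."""

    def __init__(self, name):
        self.name = name
        self.closed = False

    def close(self):
        # Mimic a store that raises when it is closed a second time.
        if self.closed:
            raise RuntimeError("store %s is already closed" % self.name)
        self.closed = True


class RefResolver:
    """Hypothetical stand-in for an IO object that opens stores to resolve references."""

    def __init__(self):
        # Stores opened while resolving references; closed again in close().
        self._opened_stores_references = []

    def resolve(self, name):
        store = FakeStore(name)
        self._opened_stores_references.append(store)
        return store

    def close(self):
        for store in self._opened_stores_references:
            try:
                store.close()
            except Exception:
                # The store may already have been closed by the caller.
                pass
        self._opened_stores_references = []


resolver = RefResolver()
a = resolver.resolve("a.sqldb")
b = resolver.resolve("b.sqldb")
a.close()         # the caller closes one store early
resolver.close()  # closes the rest and tolerates the repeated close of `a`
```

Tracking the stores in one place means that callers only need to remember a single `close()` call, even when several linked files were opened behind the scenes.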

Updating NWBZarrIO
==================

71 changes: 62 additions & 9 deletions src/hdmf_zarr/backend.py
@@ -5,14 +5,17 @@
import numpy as np
import tempfile
import logging
from sqlite3 import ProgrammingError as SQLiteProgrammingError

# Zarr imports
import zarr
from zarr.hierarchy import Group
from zarr.core import Array
from zarr.storage import (DirectoryStore,
TempStore,
NestedDirectoryStore)
NestedDirectoryStore,
SQLiteStore)

import numcodecs

# HDMF-ZARR imports
@@ -66,7 +69,8 @@

SUPPORTED_ZARR_STORES = (DirectoryStore,
TempStore,
NestedDirectoryStore)
NestedDirectoryStore,
SQLiteStore)
"""
Tuple listing all Zarr storage backends supported by ZarrIO
"""
@@ -113,6 +117,7 @@ def __init__(self, **kwargs):
self.__path = path
self.__file = None
self.__built = dict()
self.__opened_stores_references = [] # Zarr stores that were opened to resolve links that need to be closed
self._written_builders = WriteStatusTracker() # track which builders were written (or read) by this IO object
self.__dci_queue = None # Will be initialized on call to io.write
# Codec class to be used. Alternates, e.g., =numcodecs.JSON
@@ -152,12 +157,29 @@ def object_codec_class(self):
def open(self):
"""Open the Zarr file"""
if self.__file is None:
self.__file = zarr.open(store=self.path,
mode=self.__mode,
synchronizer=self.__synchronizer)
# Open Zarr file
if isinstance(self.path, SQLiteStore):
overwrite = 'w' in self.__mode
self.__file = zarr.group(store=self.path,
overwrite=overwrite)
else:
self.__file = zarr.open(store=self.path,
mode=self.__mode,
synchronizer=self.__synchronizer)

def close(self):
"""Close the Zarr file"""
if isinstance(self.path, SQLiteStore):
try:
self.path.close()
except SQLiteProgrammingError: # raised if close has been called previously
pass
for store in self.__opened_stores_references:
try:
store.close()
except Exception: # May be raised if close has been called previously
pass

self.__file = None
return
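
For context on why `close()` above catches `sqlite3.ProgrammingError`: Python's `sqlite3` module raises that exception when an operation is attempted on a connection that has already been closed. The sketch below shows only this standard-library behavior; `safe_close` is a hypothetical helper, and exactly how `SQLiteStore.close()` fails on a repeated call may depend on the Zarr version.

```python
import sqlite3


def safe_close(conn):
    """Commit and close a connection, ignoring a repeated close.

    Returns True if this call closed the connection and False if the
    connection had already been closed.
    """
    try:
        conn.commit()  # raises ProgrammingError on a closed connection
        conn.close()
        return True
    except sqlite3.ProgrammingError:
        return False


conn = sqlite3.connect(":memory:")
first = safe_close(conn)   # closes the connection
second = safe_close(conn)  # connection is already closed
```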

@@ -170,9 +192,14 @@ def close(self):
'doc': 'the path to the Zarr file or a supported Zarr store'},
{'name': 'namespaces', 'type': list, 'doc': 'the namespaces to load', 'default': None})
def load_namespaces(cls, namespace_catalog, path, namespaces=None):
'''
"""
Load cached namespaces from a file.
'''

.. note::

The function does NOT close the path. E.g., when using a
SQLiteStore remember to close the store.
"""
f = zarr.open(path, 'r')
if SPEC_LOC_ATTR not in f.attrs:
msg = "No cached namespaces found in %s" % path
@@ -349,6 +376,12 @@ def get_builder_exists_on_disk(self, **kwargs):
builder = getargs('builder', kwargs)
builder_path = self.get_builder_disk_path(builder=builder, filepath=None)
exists_on_disk = os.path.exists(builder_path)
if isinstance(self.path, SQLiteStore):
try:
self.file[self.__get_path(builder)]
exists_on_disk = True
except Exception:
exists_on_disk = False
return exists_on_disk
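
The reason for the extra branch above is that objects in a file-based store are rows inside a single file, so `os.path.exists` cannot see them. A minimal sketch of the difference, using only the standard library: the table name `kv`, its columns, and the helper `key_exists` are assumptions made for illustration and are not taken from Zarr's internal schema.

```python
import os
import shutil
import sqlite3
import tempfile


def key_exists(conn, prefix):
    """Return True if a key equals `prefix` or lives below `prefix/`."""
    row = conn.execute(
        "SELECT 1 FROM kv WHERE k = ? OR substr(k, 1, ?) = ? LIMIT 1",
        (prefix, len(prefix) + 1, prefix + "/"),
    ).fetchone()
    return row is not None


tmpdir = tempfile.mkdtemp()
db_path = os.path.join(tmpdir, "example.sqldb")
conn = sqlite3.connect(db_path)
conn.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v BLOB)")
conn.execute("INSERT INTO kv VALUES (?, ?)", ("group_a/.zgroup", b"{}"))
conn.commit()

# The group lives inside the database file, so no directory exists for it.
on_disk = os.path.exists(os.path.join(db_path, "group_a"))
in_store = key_exists(conn, "group_a")
missing = key_exists(conn, "group_b")

conn.close()
shutil.rmtree(tmpdir)
```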

@docval({'name': 'builder', 'type': Builder, 'doc': 'The builder of interest'},
@@ -557,6 +590,14 @@ def get_zarr_paths(zarr_object):
:type zarr_object: Zarr Group or Array
:return: Tuple of two strings with: 1) path of the Zarr file and 2) full path within the zarr file to the object
"""
filepath = zarr_object.store.path.replace("\\", "/")
objectpath = ("/" + zarr_object.path).replace("\\", "/")
return filepath, objectpath
# NOTE: Leaving this code here for now as there is (or at least used to be) a reason why we could not
# use the paths from Zarr directly. However, the code below would need to be fixed, as for
# file-based stores (e.g., SQLiteStore) we can't check for objects with os.path.exists since
# there are no directories for groups (they are in the file).
"""
# In Zarr the path is a combination of the path of the store and the path of the object. So we first need to
# merge those two paths, then remove the path of the file, add the missing leading "/" and then compute the
# directory name to get the path of the parent
@@ -572,6 +613,7 @@ def get_zarr_paths(zarr_object):
objectpath = "/" + os.path.relpath(fullpath, filepath)
# return the result
return filepath, objectpath
"""
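
The simplified path handling added at the top of `get_zarr_paths` amounts to plain string normalization. A small sketch of the same logic on bare strings, where `normalize_paths` is a hypothetical helper and the example paths are made up:

```python
def normalize_paths(store_path, object_path):
    """Return (filepath, objectpath) using forward slashes only."""
    filepath = store_path.replace("\\", "/")
    # Zarr object paths carry no leading slash, so one is added here.
    objectpath = ("/" + object_path).replace("\\", "/")
    return filepath, objectpath


result = normalize_paths("C:\\data\\test.zarr", "group\\dataset")
root = normalize_paths("/tmp/test.zarr", "")
```

With an empty object path, as used for the root group, the object path becomes `/`.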

@staticmethod
def get_zarr_parent_path(zarr_object):
@@ -636,10 +678,21 @@ def resolve_ref(self, zarr_ref):
target_name = os.path.basename(object_path)
else:
target_name = ROOT_NAME
target_zarr_obj = zarr.open(source_file, mode='r')
# Open the source_file containing the link. Here we need to determine the correct zarr.storage store to use
try:
target_zarr_file = zarr.open(source_file, mode='r')
except zarr.errors.FSPathExistNotDir:
try:
fstore = SQLiteStore(source_file)
target_zarr_file = zarr.open(fstore, mode='r')
self.__opened_stores_references.append(fstore)
except Exception:
raise ValueError("Found bad link to object %s in file %s" % (object_path, source_file))
# Get the linked object from the file
target_zarr_obj = target_zarr_file
if object_path is not None:
try:
target_zarr_obj = target_zarr_obj[object_path]
target_zarr_obj = target_zarr_file[object_path]
except Exception:
raise ValueError("Found bad link to object %s in file %s" % (object_path, source_file))
# Return the create path