TermSetWrapper and write support #950

Merged: 74 commits, Sep 28, 2023
Changes from 7 commits
Commits
205f763
working concept
mavaylon1 Aug 29, 2023
c9a89cc
minor cleaning
mavaylon1 Aug 29, 2023
7980937
foo file
mavaylon1 Aug 29, 2023
5f02860
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 29, 2023
7154ac5
checkpoint
mavaylon1 Sep 6, 2023
a419902
Merge branch 'wrapper' of https://github.com/hdmf-dev/hdmf into wrapper
mavaylon1 Sep 6, 2023
561e279
checkpoint
mavaylon1 Sep 6, 2023
f677647
Update src/hdmf/utils.py
mavaylon1 Sep 6, 2023
a63fe06
clean up
mavaylon1 Sep 6, 2023
6e7bbc6
checkpoint
mavaylon1 Sep 6, 2023
a0fdb24
tests placeholders
mavaylon1 Sep 6, 2023
e7034de
checkpoint
mavaylon1 Sep 8, 2023
afe5dd5
placeholder
mavaylon1 Sep 11, 2023
92bf180
placeholder
mavaylon1 Sep 11, 2023
2c8d6da
placeholder
mavaylon1 Sep 11, 2023
b698f5e
working write and herd
mavaylon1 Sep 11, 2023
1b7b3d5
cleanup
mavaylon1 Sep 11, 2023
c2c53a1
checkpoint on updating append
mavaylon1 Sep 11, 2023
4c513e5
integrate append
mavaylon1 Sep 11, 2023
c257b02
Merge branch 'dev' into wrapper
mavaylon1 Sep 11, 2023
d870133
test checkpoint
mavaylon1 Sep 18, 2023
104a7aa
test checkpoint
mavaylon1 Sep 19, 2023
ae6655a
test fixes
mavaylon1 Sep 19, 2023
86d5aa8
termset tests
mavaylon1 Sep 19, 2023
5bf83de
termset tests
mavaylon1 Sep 19, 2023
4f5d833
termset tests
mavaylon1 Sep 19, 2023
abbf12a
checkpoint/remove field_name
mavaylon1 Sep 26, 2023
e0864e8
cleanup
mavaylon1 Sep 26, 2023
7534c0d
make sure things pass without bad tests
mavaylon1 Sep 26, 2023
ab64a7d
cleanup
mavaylon1 Sep 26, 2023
bcc69d7
temp fix for test
mavaylon1 Sep 26, 2023
a872b59
termset tutorial
mavaylon1 Sep 26, 2023
d1c987e
tests and bug fix on write
mavaylon1 Sep 27, 2023
da1b006
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Sep 27, 2023
03a51cf
tests and bug fix on write
mavaylon1 Sep 27, 2023
8fa4a9a
Merge branch 'wrapper' of https://github.com/hdmf-dev/hdmf into wrapper
mavaylon1 Sep 27, 2023
0e5f96e
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Sep 27, 2023
0fe2fb8
ruff
mavaylon1 Sep 27, 2023
8899b77
Merge branch 'wrapper' of https://github.com/hdmf-dev/hdmf into wrapper
mavaylon1 Sep 27, 2023
b29a691
bug fix
mavaylon1 Sep 27, 2023
b3ac0a4
doc
mavaylon1 Sep 27, 2023
4556c2a
doc
mavaylon1 Sep 27, 2023
b87c323
Update test_docval.py
mavaylon1 Sep 27, 2023
c60a68b
tests
mavaylon1 Sep 27, 2023
8d11383
tests
mavaylon1 Sep 27, 2023
8718dae
tests
mavaylon1 Sep 27, 2023
9c28957
Update utils.py
mavaylon1 Sep 28, 2023
80c1b3e
Update utils.py
mavaylon1 Sep 28, 2023
f14efdf
Update utils.py
mavaylon1 Sep 28, 2023
83bf3b8
ryan feedback
mavaylon1 Sep 28, 2023
2aa51f6
Update src/hdmf/build/objectmapper.py
mavaylon1 Sep 28, 2023
622fcc1
Update docs/gallery/plot_term_set.py
mavaylon1 Sep 28, 2023
14fddb0
Update docs/gallery/plot_term_set.py
mavaylon1 Sep 28, 2023
661b958
Update docs/gallery/plot_term_set.py
mavaylon1 Sep 28, 2023
d910334
Update docs/gallery/plot_term_set.py
mavaylon1 Sep 28, 2023
4c7610f
tutorial
mavaylon1 Sep 28, 2023
6879676
Update CHANGELOG.md
mavaylon1 Sep 28, 2023
6191238
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Sep 28, 2023
33ba2a5
test next
mavaylon1 Sep 28, 2023
b3895d9
Merge branch 'wrapper' of https://github.com/hdmf-dev/hdmf into wrapper
mavaylon1 Sep 28, 2023
753468f
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Sep 28, 2023
8db5c2d
format
mavaylon1 Sep 28, 2023
9a37ecf
format
mavaylon1 Sep 28, 2023
f2504e3
validation changes
mavaylon1 Sep 28, 2023
6d38277
Update tests/unit/test_term_set.py
rly Sep 28, 2023
b783ace
clean up
mavaylon1 Sep 28, 2023
be3a17c
Update io.py
rly Sep 28, 2023
f1732ac
Update CHANGELOG.md
rly Sep 28, 2023
5d67899
tuple change
mavaylon1 Sep 28, 2023
91fef88
Merge branch 'wrapper' of https://github.com/hdmf-dev/hdmf into wrapper
mavaylon1 Sep 28, 2023
2fc236f
Update tests/unit/test_term_set.py
rly Sep 28, 2023
317865c
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Sep 28, 2023
dc1d868
Update src/hdmf/term_set.py
rly Sep 28, 2023
3536940
test feedback
mavaylon1 Sep 28, 2023
Binary file added docs/nwbfile_test.nwb
Binary file not shown.
52 changes: 52 additions & 0 deletions docs/write_foo.py
@@ -0,0 +1,52 @@
+from datetime import datetime
+from uuid import uuid4
+
+import numpy as np
+from dateutil.tz import tzlocal
+
+from pynwb import NWBHDF5IO, NWBFile
+from pynwb.ecephys import LFP, ElectricalSeries
+
+from hdmf import TermSetWrapper as tw
+from hdmf import Data
+from hdmf import TermSet
+terms = TermSet(term_schema_path='/Users/mavaylon/Research/NWB/hdmf2/hdmf/docs/gallery/example_term_set.yaml')
+
+import numpy as np
+
+from pynwb import TimeSeries
+
+data = np.arange(100, 200, 10)
+timestamps = np.arange(10)
+
+from hdmf.backends.hdf5.h5_utils import H5DataIO
+
+test_ts = TimeSeries(
+    name="test_compressed_timeseries",
+    data=H5DataIO(data=data, compression=True),
+    unit=tw(item="SIunit", termset=terms),
+    timestamps=timestamps,
+)
+
+nwbfile = NWBFile(
+    session_description="my first synthetic recording",
+    identifier=str(uuid4()),
+    session_start_time=datetime.now(tzlocal()),
+    experimenter=[
+        "Baggins, Bilbo",
+    ],
+    lab="Bag End Laboratory",
+    institution="University of Middle Earth at the Shire",
+    experiment_description="I went on an adventure to reclaim vast treasures.",
+    session_id="LONELYMTN001",
+)
+nwbfile.add_acquisition(test_ts)
+
+filename = "nwbfile_test.nwb"
+with NWBHDF5IO(filename, "w") as io:
+    io.write(nwbfile)
+
+# open the NWB file in r+ mode
+with NWBHDF5IO(filename, "r+") as io:
+    read_nwbfile = io.read()
+    breakpoint()
2 changes: 1 addition & 1 deletion src/hdmf/__init__.py
@@ -1,6 +1,6 @@
 from . import query
 from .backends.hdf5.h5_utils import H5Dataset, H5RegionSlicer
-from .container import Container, Data, DataRegion, HERDManager
+from .container import Container, Data, DataRegion, HERDManager, TermSetWrapper
 from .region import ListSlicer
 from .utils import docval, getargs
 from .term_set import TermSet
4 changes: 3 additions & 1 deletion src/hdmf/backends/hdf5/h5tools.py
@@ -16,7 +16,7 @@
 from ..warnings import BrokenLinkWarning
 from ...build import (Builder, GroupBuilder, DatasetBuilder, LinkBuilder, BuildManager, RegionBuilder,
                       ReferenceBuilder, TypeMap, ObjectMapper)
-from ...container import Container
+from ...container import Container, TermSetWrapper
 from ...data_utils import AbstractDataChunkIterator
 from ...spec import RefSpec, DtypeSpec, NamespaceCatalog
 from ...utils import docval, getargs, popargs, get_data_shape, get_docval, StrDataset
@@ -1099,6 +1099,8 @@ def write_dataset(self, **kwargs):  # noqa: C901
             dataio = data
             link_data = data.link_data
             data = data.data
+            if isinstance(data, TermSetWrapper):
+                data = data.item
         else:
             options['io_settings'] = {}
         attributes = builder.attributes
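The `write_dataset` hunk peels two layers before writing: first the `H5DataIO`, then a `TermSetWrapper` held inside it. The sketch below illustrates that order with hypothetical stand-in classes (`FakeDataIO` and `FakeWrapper` are not hdmf classes). It assumes the new check is nested inside the `H5DataIO` branch, which is the reading suggested by the `else:` that follows it; the indentation cannot be confirmed from this view.

```python
class FakeWrapper:
    """Hypothetical stand-in for TermSetWrapper: holds an already validated item."""

    def __init__(self, item):
        self.item = item


class FakeDataIO:
    """Hypothetical stand-in for H5DataIO: holds data plus I/O settings."""

    def __init__(self, data, io_settings=None):
        self.data = data
        self.io_settings = io_settings or {}


def unwrap_for_write(data):
    """Return (raw_data, io_settings), peeling layers in the order the hunk does."""
    io_settings = {}
    if isinstance(data, FakeDataIO):
        io_settings = data.io_settings
        data = data.data
        # Assumed nesting: the wrapper is only unwrapped inside the DataIO branch.
        if isinstance(data, FakeWrapper):
            data = data.item
    return data, io_settings


raw, settings = unwrap_for_write(FakeDataIO(FakeWrapper([1, 2, 3]), {"compression": True}))
print(raw, settings)
```

Under this assumed nesting, a wrapper that is not inside a DataIO object passes through the function unchanged, so a bare wrapped dataset would need its own unwrapping step elsewhere.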
7 changes: 4 additions & 3 deletions src/hdmf/build/objectmapper.py
@@ -11,7 +11,7 @@
                      ConstructError)
 from .manager import Proxy, BuildManager
 from .warnings import MissingRequiredBuildWarning, DtypeConversionWarning, IncorrectQuantityBuildWarning
-from ..container import AbstractContainer, Data, DataRegion
+from ..container import AbstractContainer, Data, DataRegion, TermSetWrapper
 from ..data_utils import DataIO, AbstractDataChunkIterator
 from ..query import ReferenceResolver
 from ..spec import Spec, AttributeSpec, DatasetSpec, GroupSpec, LinkSpec, RefSpec
@@ -564,6 +564,8 @@ def get_attr_value(self, **kwargs):
                 msg = ("%s '%s' does not have attribute '%s' for mapping to spec: %s"
                        % (container.__class__.__name__, container.name, attr_name, spec))
                 raise ContainerConfigurationError(msg)
+            if isinstance(attr_val, TermSetWrapper):
+                attr_val = attr_val.item
             if attr_val is not None:
                 attr_val = self.__convert_string(attr_val, spec)
                 spec_dt = self.__get_data_type(spec)
@@ -906,7 +908,7 @@ def __add_attributes(self, builder, attributes, container, build_manager, source
             if spec.value is not None:
                 attr_value = spec.value
             else:
-                attr_value = self.get_attr_value(spec, container, build_manager)
+                attr_value = self.get_attr_value(spec, container, build_manager)  # here
                 if attr_value is None:
                     attr_value = spec.default_value

@@ -937,7 +939,6 @@ def __add_attributes(self, builder, attributes, container, build_manager, source
             if attr_value is None:
                 self.logger.debug(" Skipping empty attribute")
                 continue
-
             builder.set_attribute(spec.name, attr_value)

     def __set_attr_to_ref(self, builder, attr_value, build_manager, spec):
77 changes: 76 additions & 1 deletion src/hdmf/container.py
@@ -10,7 +10,7 @@
 import pandas as pd

 from .data_utils import DataIO, append_data, extend_data
-from .utils import docval, get_docval, getargs, ExtenderMeta, get_data_shape, popargs, LabelledDict
+from .utils import docval, docval_macro, get_docval, getargs, ExtenderMeta, get_data_shape, popargs, LabelledDict
 from hdmf.term_set import TermSet


@@ -1485,3 +1485,78 @@ def from_dataframe(cls, **kwargs):
         if name is None:
             return cls(data=data)
         return cls(name=name, data=data)
+
+
+class TermSetWrapper:
+    """
+    This class allows any HDF5 group, dataset, or attribute to have a TermSet.
+
+    In HDMF, a group is a Container object, a dataset is a Data object,
+    an attribute can be a reference type to an HDMF object or a base type, e.g., text.
+    """
+    # @docval({'name': 'termset',
+    # 'type': TermSet,
+    # 'doc': 'The TermSet to be used.'},
+    # {'name': primitive})
+    def __init__(self, **kwargs):
+        item, termset = popargs('item', 'termset', kwargs)
+
+        self.__item = item
+        self.__termset = termset
+        self.__validate()
+
+    def __validate(self):
+        # check if list, tuple, array, or DataIO
+        if isinstance(self.__item, (list, np.ndarray, tuple, Data, DataIO, dict)):
+            values = self.__item
+        # create list if none of those
+        else:
+            values = [self.__item]
+        # iteratively validate
+        bad_values = []
+        for term in values:
+            validation = self.__termset.validate(term=term)
+            if not validation:
+                bad_values.append(term)
+        if len(bad_values)!=0:
+            msg = ('"%s" is not in the term set.' % ', '.join([str(item) for item in bad_values]))
+            raise ValueError(msg)
+
+    @property
+    def item(self):
+        return self.__item
+
+    @property
+    def termset(self):
+        return self.__termset
+
+    @property
+    def dtype(self):
+        return self.__getattr__('dtype')
+
+    def __getattr__(self, val):
+        """
+        This method is to get attributes that are not defined in init.
+        This is when dealing with data and numpy arrays.
+        """
+        if val in ('data', 'shape', 'dtype'):
+            return getattr(self.__item, val)
+
+    def __getitem__(self, val):
+        """
+        This is used when we want to index items.
+        """
+        return self.__item[val]
+
+    def __next__(self):
+        """
+        We want to make sure all iterators are still valid.
+        """
+        return self.__item.__next__()
+
+
+    def __iter__(self):
+        """
+        We want to make sure our wrapped items are still iterable.
+        """
+        return self.__item.__iter__()
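The wrapper above follows a validate-at-construction, delegate-afterwards pattern. The sketch below shows that pattern in isolation so it can run without hdmf. `SimpleTermSet` and `WrapperSketch` are hypothetical names; the only thing borrowed from the diff is the `validate(term=...)` call shape and the error message. One deliberate difference: here `__getattr__` raises `AttributeError` for unknown names, whereas the version in the diff falls through and returns `None`, which can hide typos and makes `hasattr` always succeed.

```python
class SimpleTermSet:
    """Hypothetical stand-in for hdmf's TermSet: a fixed set of allowed terms."""

    def __init__(self, terms):
        self._terms = set(terms)

    def validate(self, term):
        return term in self._terms


class WrapperSketch:
    """Validate an item against a term set at construction, then delegate to it."""

    def __init__(self, item, termset):
        self._item = item
        self._termset = termset
        self._validate()

    def _validate(self):
        # A bare string (or other scalar) is treated as one term, not a sequence.
        values = self._item if isinstance(self._item, (list, tuple)) else [self._item]
        bad = [v for v in values if not self._termset.validate(term=v)]
        if bad:
            raise ValueError('"%s" is not in the term set.' % ', '.join(str(v) for v in bad))

    @property
    def item(self):
        return self._item

    def __getattr__(self, name):
        # Only reached when normal lookup fails; forward a small allow-list
        # to the wrapped item and reject everything else explicitly.
        if name in ('data', 'shape', 'dtype'):
            return getattr(self._item, name)
        raise AttributeError(name)

    def __getitem__(self, key):
        return self._item[key]

    def __iter__(self):
        return iter(self._item)


species = SimpleTermSet(["Homo sapiens", "Mus musculus"])
wrapped = WrapperSketch(["Homo sapiens", "Mus musculus"], species)
print(list(wrapped))
```

Because validation runs in `__init__`, an out-of-set term fails before any write is attempted, which is the same point at which the class in the diff raises its `ValueError`.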
3 changes: 3 additions & 0 deletions src/hdmf/data_utils.py
@@ -1114,5 +1114,8 @@ def valid(self):
         return self.data is not None


+
+
+
 class InvalidDataIOError(Exception):
     pass
9 changes: 8 additions & 1 deletion src/hdmf/utils.py
@@ -97,6 +97,8 @@ def __type_okay(value, argtype, allow_none=False):
         elif argtype == 'bool':
             return __is_bool(value)
         return argtype in [cls.__name__ for cls in value.__class__.__mro__]
+    # elif isinstance(value, TermSetWrapper):
+    # pass
     elif isinstance(argtype, type):
         if argtype is int:
             return __is_int(value)
@@ -214,7 +216,6 @@ def __parse_args(validator, args, kwargs, enforce_type=True, enforce_shape=True,
     future_warnings = list()
     argsi = 0
     extras = dict()  # has to be initialized to empty here, to avoid spurious errors reported upon early raises
-
     try:
         # check for duplicates in docval
         names = [x['name'] for x in validator]
@@ -273,6 +274,12 @@ def __parse_args(validator, args, kwargs, enforce_type=True, enforce_shape=True,
                 type_errors.append("missing argument '%s'" % argname)
             else:
                 if enforce_type:
+                    from .container import TermSetWrapper  # circular import fix
+                    termset = False
+                    if isinstance(argval, TermSetWrapper):
+                        # kwargs is the dict that stores the object names and the values
+                        # we can use this to unwrap the dataset/attribute to use the "item" for docval to validate the type.
+                        argval = kwargs[argname].item
                     if not __type_okay(argval, arg['type']):
                         if argval is None:
                             fmt_val = (argname, __format_type(arg['type']))
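The `__parse_args` change lets the type check look through a wrapper at the wrapped item. A minimal sketch of that idea follows, using hypothetical names (`FakeWrapper`, `type_okay`, `check_arg`) rather than the real docval internals. It reads the item from `argval` directly; the diff reads it from `kwargs[argname]`, which appears to assume the argument was passed by keyword, so this is a variation and not the PR's code.

```python
class FakeWrapper:
    """Hypothetical stand-in for TermSetWrapper: holds an already validated item."""

    def __init__(self, item):
        self.item = item


def type_okay(value, argtype):
    """Simplified stand-in for docval's type check."""
    return isinstance(value, argtype)


def check_arg(argname, argval, argtype):
    """Return a list of type-error messages for one argument.

    A wrapped value is checked against the type of its item,
    so a wrapped str is accepted where a str is expected.
    """
    checked = argval.item if isinstance(argval, FakeWrapper) else argval
    if not type_okay(checked, argtype):
        return ["incorrect type for '%s' (got '%s', expected '%s')"
                % (argname, type(checked).__name__, argtype.__name__)]
    return []


print(check_arg("unit", FakeWrapper("meters"), str))
print(check_arg("unit", FakeWrapper(3), str))
```

Reading the item from the value being checked keeps the unwrapping independent of how the argument reached the function, positionally or by keyword.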