Add read for neurodata_types, e.g., Container, TimeSeries #91
…up types, e.g., Data, VectorData etc.
…nage container classes
@stephprince please take a look. This is an accompanying PR to #85 to implement data read, but this PR specifically focuses on read for typed objects (e.g., a |
@stephprince this PR is good to review.
// TODO: creating a new I/O makes the read fail.
// std::shared_ptr<BaseIO> readio = createIO("HDF5", path);
// readio->open();
@stephprince it looks like there is an issue with how HDF5IO opens an existing file. Read works fine when I use the existing HDF5IO object that was used for write, but when I close the file and then open it again, reading any data from the file fails. When I tried this in testBase.cpp, it corrupted the file (i.e., I think it may be opened in "write" mode instead of "read" or "append").
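To make the suspected failure mode concrete, here is a minimal sketch using plain `std::fstream` as a stand-in. It is not the HDF5IO code, and the helper names (`writeFile`, `readFile`, `reopenTruncating`) are hypothetical. It only shows that re-opening an existing file in a truncating "write" mode discards its content, while a read-only re-open leaves it intact.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <iterator>
#include <string>

// Write `text` to `path`, replacing any previous content.
inline void writeFile(const std::string& path, const std::string& text)
{
  std::ofstream out(path, std::ios::out | std::ios::trunc);
  out << text;
}

// Re-open the file through a read-only stream and return its content.
inline std::string readFile(const std::string& path)
{
  std::ifstream in(path, std::ios::in);
  return std::string((std::istreambuf_iterator<char>(in)),
                     std::istreambuf_iterator<char>());
}

// Re-open an existing file in truncating "write" mode without writing
// anything. This mimics the suspected bug: the open call itself wipes
// the data that was written earlier.
inline void reopenTruncating(const std::string& path)
{
  std::ofstream out(path, std::ios::out | std::ios::trunc);
}
```

If the same thing happens at the HDF5 level, any read after the second open would fail because the datasets no longer exist.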
@stephprince this may be related to the issue you encountered on Windows
thanks, I will look into that.
I am having trouble reproducing this on the main branch.
If I add these lines to the end of testBase.cpp, I am able to create a new IO object and read the data back in:
io->close();
// test reopening the file to read the data back in
std::shared_ptr<BaseIO> readio = createIO("HDF5", path);
readio->open();
double* tsBuffer2 = new double[numSamples];
std::unique_ptr<BaseRecordingData> tsetDset2 =
readio->getDataSet(dataPath + "/timestamps");
std::unique_ptr<HDF5::HDF5RecordingData> tsH5Dataset2(
dynamic_cast<HDF5::HDF5RecordingData*>(tsetDset2.release()));
readH5DataBlock(tsH5Dataset2->getDataSet(), timestampsType, tsBuffer2);
std::vector<double> tsRead2(tsBuffer2, tsBuffer2 + numSamples);
delete[] tsBuffer2;
REQUIRE(tsRead2 == timestamps);
Do you remember if there were any different steps or conditions?
Thanks for confirming. A simple test is to just uncomment the two lines here where the readio object is being created and then use the readio object instead of the original io object in the code that follows. We'd also need to call io.close first since we don't have a way yet to open as read-only with SWMR-read.
I'm not positive since there were some code changes since these comments, but I think the main issue was that readDataWrapper was being used here instead of readElectricalSeriesData obtained from the read_io object.
auto readElectricalSeriesData = readElectricalSeries->dataLazy();
DataBlock<float> readDataValues = readDataWrapper->values<float>();
This test example now works using the read_io object, but the file is being re-opened in read/write mode. I will add an option to open in read-only mode and demonstrate that here.
I think the main issue was that readDataWrapper was being used here instead of readElectricalSeriesData
Makes sense. Thanks for catching that mistake.
I will add an option to open in read-only mode and demonstrate that here.
For read-only mode we should also allow SWMR read mode.
@stephprince just FYI. I pushed a few more updates here. The main structural changes are: 1) I created a Next, I'll need to:
…s to check that an object actually exists
…ts contained in a group
@stephprince I implemented this now as follows:
@stephprince if it's easier for you to review, I'd be fine with merging this back to #85
@oruebel sounds great!
Can we keep it separate for now? I think I had read most of the docs and examples for this one and I want to keep track of where I left off.
Sounds good. Whichever works best for you is fine with me.
Overall looks good and the docs explained things very well, mainly some very small suggestions for clarity.
I still need to look into the issue with HDF5IO opening an existing file.
Also, once the small suggestions are updated, if it's easier to combine the two read PRs I'm good with that.
Co-authored-by: Steph Prince <[email protected]>
@stephprince I added the changes you suggested and synced the PR with the |
This PR builds on #85. I made it a separate PR a) to make it easier to review and b) because it addresses the specific problem of reading a typed Container (e.g., a TimeSeries), and I wanted the changes related to reading typed objects to be easy to see. Ultimately, the goal of this PR is to enable us to read any object with an assigned neurodata_type (for which we have a corresponding class) from an NWB file.
What does read look like?
To read a type (whether a whole NWBFile or a single TimeSeries), we need to know: 1) the full name of the type, 2) the path of the object in the file, and 3) the io object for reading.
Variant 1: User specifies the name of the type
Variant 2: User provides the type class as a template
or simply create the instance of the class ourselves:
Variant 3: We read the full name of the type from the file directly
In this variant we read the full name of the type from the NWB file directly, which is a combination of the attributes namespace and neurodata_type that are stored for each type.
How does this work
I added more details about the design for this on a new page in the developer docs.
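As a rough illustration of the three read variants, here is a minimal, self-contained sketch of a type registry. It is not the code from this PR: `MockIO`, `RegisteredTypeSketch`, `TimeSeriesSketch`, the `createByName` / `createAs` / `createFromFile` names, and the `"<namespace>::<neurodata_type>"` key format are all assumptions made for the example.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for an I/O backend: maps "path@attribute" to a value.
struct MockIO
{
  std::map<std::string, std::string> attributes;
  std::string readAttribute(const std::string& path,
                            const std::string& name) const
  {
    return attributes.at(path + "@" + name);
  }
};

// Common base class: every typed object stores its path and its io.
class RegisteredTypeSketch
{
public:
  using Factory = std::function<std::unique_ptr<RegisteredTypeSketch>(
      const std::string&, std::shared_ptr<MockIO>)>;

  RegisteredTypeSketch(const std::string& path, std::shared_ptr<MockIO> io)
      : m_path(path), m_io(std::move(io)) {}
  virtual ~RegisteredTypeSketch() = default;

  virtual std::string typeName() const = 0;
  const std::string& path() const { return m_path; }
  std::shared_ptr<MockIO> io() const { return m_io; }

  // Registry keyed by the full type name (assumed "<namespace>::<type>").
  static std::map<std::string, Factory>& registry()
  {
    static std::map<std::string, Factory> instance;
    return instance;
  }

  // Variant 1: the caller specifies the full name of the type.
  static std::unique_ptr<RegisteredTypeSketch> createByName(
      const std::string& fullName, const std::string& path,
      std::shared_ptr<MockIO> io)
  {
    auto it = registry().find(fullName);
    return it == registry().end() ? nullptr : it->second(path, std::move(io));
  }

  // Variant 2: the caller provides the class as a template argument.
  template<typename T>
  static std::unique_ptr<T> createAs(const std::string& path,
                                     std::shared_ptr<MockIO> io)
  {
    return std::make_unique<T>(path, std::move(io));
  }

  // Variant 3: the full name is read from the object's attributes.
  static std::unique_ptr<RegisteredTypeSketch> createFromFile(
      const std::string& path, std::shared_ptr<MockIO> io)
  {
    const std::string fullName = io->readAttribute(path, "namespace") + "::"
        + io->readAttribute(path, "neurodata_type");
    return createByName(fullName, path, std::move(io));
  }

private:
  std::string m_path;
  std::shared_ptr<MockIO> m_io;
};

class TimeSeriesSketch : public RegisteredTypeSketch
{
public:
  using RegisteredTypeSketch::RegisteredTypeSketch;
  std::string typeName() const override { return "core::TimeSeries"; }
};

// Adds the factory for TimeSeriesSketch to the registry.
inline void registerTimeSeriesSketch()
{
  RegisteredTypeSketch::registry()["core::TimeSeries"] =
      [](const std::string& path, std::shared_ptr<MockIO> io)
      -> std::unique_ptr<RegisteredTypeSketch>
  { return std::make_unique<TimeSeriesSketch>(path, std::move(io)); };
}
```

All three variants end up calling the same standard constructor (path plus io), which is why every registered class needs to provide it.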
Main changes / TODO
… RegisteredType as common base class for all neurodata_types defined in the NWB (or HDMF) schema, i.e., for both Dataset and Group types
… RegisteredType registry
… Data, VectorData, and ElementIdentifiers to provide the standard constructor required by RegisteredType. I.e., these types now also have the io and their path stored.
… RegisterType to manage all types based on the namespace and type-name
… RegisteredType class
… RegisteredClass (i.e., a new data type class)
… SpikeEventSeries, e.g., to remove aqnwb/src/nwb/ecephys/SpikeEventSeries.hpp Lines 60 to 63 in a6d86f9
ElectrodeTable is not a type in NWB yet, as such we need to write with DynamicTable as the neurodata_type instead. Updated the RegisteredType class to allow manual overwrite of the neurodata_type when the classname is not the same as the neurodata_type.
… mergePaths function in Utils.hpp as a more reliable way to merge HDF5 paths
… Utils.hpp to be static inline to avoid linker problem with multiple definitions of the functions and still be able to inline the functions (i.e. without having to move to a Utils.cpp)
… DEFINE_FIELD in RegisteredType to simplify the creation of getter functions that return ReadDataWrapper object for lazy read of a dataset or attribute
… ReadDataWrapper.exists
… BaseIO::attributeExists and HDF5IO::attributeExists (we already had BaseIO::objectExists for datasets/groups but not for attributes)
… TimeSeries from /acquisition)
… BaseIO::getGroupObjects and HDF5IO::getGroupObjects to allow us to get a list of all contents of a group
… BaseIO::findTypes to allow us to search for all objects of a given set of types. This function only uses functions defined on BaseIO so it is backend independent and does not need to be overridden in HDF5IO
… findTypes on read
… m_ naming convention:
… ReadDataWrapper objects for lazy read of datasets and attributes. These can be created with the new DEFINE_FIELD macro.
… HDF5IO in read-only mode (probably using SWMR read)
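For the mergePaths item in the list above, a minimal sketch of what merging two HDF5 paths can look like. This is a hypothetical `mergePathsSketch`, not the function in Utils.hpp; it assumes the goal is exactly one `/` between parent and child, and it treats an empty parent as the root.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Join two HDF5-style paths with exactly one '/' between them.
// Trailing slashes on the parent and leading slashes on the child are
// dropped first, so "/a/" + "/b" and "/a" + "b" give the same result.
inline std::string mergePathsSketch(const std::string& parent,
                                    const std::string& child)
{
  std::size_t parentEnd = parent.size();
  while (parentEnd > 0 && parent[parentEnd - 1] == '/') {
    --parentEnd;
  }
  std::size_t childStart = 0;
  while (childStart < child.size() && child[childStart] == '/') {
    ++childStart;
  }

  const std::string left = parent.substr(0, parentEnd);
  const std::string right = child.substr(childStart);

  if (right.empty()) {
    // Nothing to append: return the parent, or the root if it was "/" or "".
    return left.empty() ? "/" : left;
  }
  return left + "/" + right;
}
```

Plain string concatenation tends to produce `//` or a missing separator depending on how the caller wrote the inputs, which is the kind of inconsistency a dedicated helper avoids.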
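For the DEFINE_FIELD item, here is a simplified sketch of how a macro can generate getters that return a lazy read handle. The macro name `DEFINE_FIELD_SKETCH`, its two-argument signature, and the `MockStore`, `LazyReadSketch`, and `SeriesSketch` types are assumptions for illustration; the actual macro and ReadDataWrapper in this PR may take different parameters.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical backend: stores one value per path and counts reads.
struct MockStore
{
  std::map<std::string, double> values;
  int readCount = 0;
};

// Lazy handle: remembers where the data lives and only touches the
// backend when exists() or values() is called.
class LazyReadSketch
{
public:
  LazyReadSketch(std::shared_ptr<MockStore> store, std::string path)
      : m_store(std::move(store)), m_path(std::move(path)) {}

  bool exists() const { return m_store->values.count(m_path) > 0; }

  double values() const
  {
    ++m_store->readCount;  // the actual load happens here, not in the getter
    return m_store->values.at(m_path);
  }

private:
  std::shared_ptr<MockStore> m_store;
  std::string m_path;
};

// Generates a getter that returns a lazy handle for a field stored at
// <object path>/<relativePath>. The enclosing class must provide the
// members m_store and m_path.
#define DEFINE_FIELD_SKETCH(name, relativePath) \
  LazyReadSketch name() const \
  { \
    return LazyReadSketch(m_store, m_path + "/" + (relativePath)); \
  }

class SeriesSketch
{
public:
  SeriesSketch(std::shared_ptr<MockStore> store, std::string path)
      : m_store(std::move(store)), m_path(std::move(path)) {}

  DEFINE_FIELD_SKETCH(readData, "data")
  DEFINE_FIELD_SKETCH(readTimestamps, "timestamps")

private:
  std::shared_ptr<MockStore> m_store;
  std::string m_path;
};
```

Calling `series.readData()` only builds the handle. No data is read until `values()` is called, and `exists()` lets the caller check for optional fields first.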
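For the findTypes item, the following sketch shows why such a search can live in the base class: it only needs a way to list a group's contents and a way to look up an object's type. `IOSketch`, `InMemoryIOSketch`, and `getFullTypeName` are hypothetical names, and the real BaseIO::findTypes may have a different signature and traversal behavior.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Minimal hypothetical backend interface.
class IOSketch
{
public:
  virtual ~IOSketch() = default;

  // Names of the objects directly contained in the group at `path`.
  virtual std::vector<std::string> getGroupObjects(
      const std::string& path) const = 0;

  // Full type name of the object at `path`, or "" if it has none.
  virtual std::string getFullTypeName(const std::string& path) const = 0;

  // Implemented once in the base class: walks the hierarchy from `start`
  // and returns path -> type name for every object whose type is in `types`.
  std::map<std::string, std::string> findTypes(
      const std::string& start, const std::set<std::string>& types) const
  {
    std::map<std::string, std::string> found;
    std::vector<std::string> pending {start};
    while (!pending.empty()) {
      const std::string current = pending.back();
      pending.pop_back();
      const std::string typeName = getFullTypeName(current);
      if (types.count(typeName) > 0) {
        found[current] = typeName;
      }
      for (const std::string& child : getGroupObjects(current)) {
        pending.push_back(current == "/" ? "/" + child
                                         : current + "/" + child);
      }
    }
    return found;
  }
};

// In-memory backend used only to exercise the sketch.
class InMemoryIOSketch : public IOSketch
{
public:
  std::map<std::string, std::vector<std::string>> children;
  std::map<std::string, std::string> typeNames;

  std::vector<std::string> getGroupObjects(
      const std::string& path) const override
  {
    auto it = children.find(path);
    return it == children.end() ? std::vector<std::string> {} : it->second;
  }

  std::string getFullTypeName(const std::string& path) const override
  {
    auto it = typeNames.find(path);
    return it == typeNames.end() ? std::string() : it->second;
  }
};
```

A backend such as HDF5IO would then only override the two primitive functions and inherit the search unchanged.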