A post-processing solution #7
-
On second thoughts, I've removed the channel-dropping feature. The intention is for this to be a lossless and verifiable process, which is not the case if you drop a channel. As channels can be disabled at acquisition time, disabling them there seems the more faithful approach.
-
Could you create an hdf2dat command line utility? Basically just write out
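For what it's worth, here is a minimal sketch of what such a utility might look like in Python with h5py, assuming the layout described further down the thread (per-channel datasets plus raw header/footer byte-array attributes); the attribute and dataset names are illustrative, not the real ones used by the converter:

```python
import sys

import h5py
import numpy as np


def hdf2dat(hdf5_path, dat_path):
    """Dump a converted HDF5 file back into .dat bytes (illustrative only)."""
    with h5py.File(hdf5_path, "r") as f, open(dat_path, "wb") as out:
        # Raw header bytes stored verbatim at conversion time (assumed attribute name).
        out.write(bytes(f.attrs["header"]))

        # Stack the per-channel datasets; the real byte layout/interleaving
        # has to follow the Jeiss .dat spec, this is only a stand-in.
        channels = [f[name][()] for name in sorted(f) if name.startswith("channel")]
        out.write(np.stack(channels, axis=-1).tobytes())

        # Raw footer bytes appended verbatim (assumed attribute name).
        out.write(bytes(f.attrs["footer"]))


if __name__ == "__main__":
    hdf2dat(sys.argv[1], sys.argv[2])
```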
-
I added an issue gathering information about HDF5 (compression) filters here:
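For context, the built-in filters can be applied per dataset at creation time; a minimal h5py sketch, where the dataset name, chunk shape and compression level are arbitrary:

```python
import h5py
import numpy as np

data = np.random.randint(0, 2**15, size=(1024, 1024), dtype=np.int16)

with h5py.File("example.hdf5", "w") as f:
    # "gzip" (deflate, levels 0-9) and "lzf" ship with h5py; third-party
    # filters have to be registered on every machine that reads the file.
    f.create_dataset(
        "channel0",
        data=data,
        chunks=(256, 256),
        compression="gzip",
        compression_opts=4,
    )
```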
-
Overall, thank you for this. This looks very nice. @trautmane was just about to start working on this.
-
I am curious about the layout of the datasets without chunking or compression. The attributes might be scattered throughout the file in 2 KB chunks. After h5py/h5py#2106 we should be able to consolidate these by using a larger
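For illustration, a sketch of what that might look like, assuming the meta_block_size keyword that PR exposes (available in recent h5py releases); the 64 KB value is arbitrary:

```python
import h5py

# Group HDF5 metadata (attributes, object headers, heaps) into larger blocks
# instead of the default small allocations, so it stays contiguous on disk.
# Assumes the meta_block_size keyword exposed by the linked h5py PR.
with h5py.File("converted.hdf5", "w", meta_block_size=2**16) as f:
    f.attrs["example"] = "this metadata lands in the larger meta block"
```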
-
Incidentally, the HDF5 User Group Europe meeting is happening at the moment:
-
I split my post-acquisition conversion prototype into 2 repos: dat2hdf5 and dat2hdf-verify.

The first converts to HDF5, without any userblock or external-file shenanigans: it splits channels into datasets, turns all the metadata fields into HDF5 attributes, optionally adds chunking and compression, and lets you select channels. The latter generates the bytes of a .dat file from the HDF5, takes their md5sum, compares it to the md5sum of an actual .dat file, and returns a status code of 1 if they differ. This should work regardless of chunking and compression (although it obviously won't if you've dropped a channel), and it optionally deletes the .dat if the verification succeeds.

The converter stores both the raw header and the footer as byte arrays in the HDF5 attributes. When verifying, the footer is just added blindly to the end, but the header is re-serialised from the metadata split across the HDF5 attributes (no cheating!).
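To make the round trip concrete, here is a rough sketch of the verification step under those assumptions; rebuilt_dat_bytes is a placeholder for the real re-serialisation, and the attribute/dataset names are made up:

```python
import hashlib
import sys

import h5py


def rebuilt_dat_bytes(f):
    """Placeholder re-serialisation. The real tool rebuilds the header from the
    individual metadata attributes and re-interleaves the channel datasets per
    the Jeiss spec; names here are assumed, not the converter's actual ones."""
    header = bytes(f.attrs["header"])   # stand-in for header rebuilt from split attrs
    data = b"".join(
        f[name][()].tobytes() for name in sorted(f) if name.startswith("channel")
    )
    footer = bytes(f.attrs["footer"])   # footer really is appended blindly
    return header + data + footer


def verify(hdf5_path, dat_path):
    with h5py.File(hdf5_path, "r") as f:
        rebuilt = hashlib.md5(rebuilt_dat_bytes(f)).hexdigest()
    with open(dat_path, "rb") as dat:
        original = hashlib.md5(dat.read()).hexdigest()
    return rebuilt == original


if __name__ == "__main__":
    # Exit status 1 on mismatch, as described above; deleting the .dat on
    # success is omitted from this sketch.
    sys.exit(0 if verify(sys.argv[1], sys.argv[2]) else 1)
```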
New header versions just need a new TSV in jeiss-specs and nothing should need to change in jeiss-convert other than bumping the version of the submodule.
I'd plan to use this when moving data off the acquisition machine and onto the primary storage server.