# Deprecating hardcoded `OFFSET` constant #1
Writing out an HDF5 file would be enormously helpful and would basically solve all implementation problems on the read side. As I understand it, the scope fills up a memory buffer as it goes and then writes out to a file at the end, which gives quite a lot of latitude in terms of juggling the numbers around before the write (e.g. splitting channels into separate datasets, writing valid metadata as it pertains to both the group and the channels, etc.). If giving a flexible offset is the first step, I'm all for it.
Hi @clbarnes, my understanding is that "2D scan Tclk.vi" is the main image-writer component of the software. The top half puts data into a queue (a memory buffer), and the bottom half reads from that queue and writes to disk. This happens concurrently, but not synchronously: we do not have to wait for data to be written to disk before acquiring new data. The bottom component has two hard-coded values, one of which is the byte offset.

The way we currently convert the .dat file to an HDF5 file is simply to read in the .dat file with a reader and then write out an equivalent HDF5 file, perhaps with the image data chunked and compressed. An alternative would be to reopen the file and write an HDF5 header into it. For example, if we moved the DAT header somewhere else, we could overwrite the DAT header with an HDF5 header at the beginning of the file, and then add the attributes to the HDF5 file. The only advantage of this approach is that the image data does not need to be rewritten to obtain an HDF5 file. This could also be done during transmission of the file off the acquisition computer. We will likely proceed with the current method of resaving the entire file in the near term.

-Mark

[Image: Zoom-in of the bottom file-writing component of "2D scan Tclk.vi"]
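The "read the .dat, write an equivalent HDF5" path described above can be sketched in a few lines. This is a minimal illustration, not the project's actual converter: the dtype, the shape arguments, the dataset name, and the header handling are all assumptions; only the fixed 1024-byte offset comes from the discussion.

```python
# Hedged sketch of the "resave the entire file" conversion path.
# Assumptions: big-endian uint16 samples, caller-supplied shape,
# dataset name "image". Only OFFSET = 1024 comes from the thread.
import numpy as np
import h5py

OFFSET = 1024  # current hard-coded start of image data in .dat readers


def dat_to_hdf5(dat_path, h5_path, shape, dtype=">u2"):
    """Read the image array from a .dat file and resave it as HDF5,
    chunked and compressed, keeping the raw DAT header as an attribute."""
    with open(dat_path, "rb") as f:
        header = f.read(OFFSET)
        # Explicit count tolerates a trailer (e.g. CSV recipe data)
        # after the array data.
        image = np.frombuffer(f.read(), dtype=dtype, count=int(np.prod(shape)))
    with h5py.File(h5_path, "w") as h5:
        ds = h5.create_dataset(
            "image", data=image.reshape(shape), chunks=True, compression="gzip"
        )
        ds.attrs["dat_header"] = np.frombuffer(header, dtype=np.uint8)
```

The reshape and dtype would in practice come from the DAT attributes (`Xresolution`, `Yresolution`, `Number_of_channels`) rather than being passed in by hand.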
Got it, thank you! I suppose both cases, rewriting the file or just writing HDF5 metadata into it, require a reader to be co-maintained with the microscope software, which needs to be robust, scalable, relatively standalone, etc. In which case we may as well just use that tool to do whatever conversion we need: HDF5/zarr/N5/TIFFs/npy/whatever. The value of having the scope software generate a valid HDF5 to begin with is that everyone's starting point looks the same, but if that's not an option, it's not an option.
Is it really not an option, though? Is there centralized maintenance of the acquisition software? If not, then someone could go ahead and just add the HDF5-writing functionality, and the problem is solved (for that person/group).
## Proposal: Making `OFFSET`, the start of the array image data, a variable

Currently, .dat readers typically code `OFFSET` as a fixed constant, `1024`. This value encodes the start of the array data and the end of the attribute metadata. For forward compatibility, in anticipation of the need for additional metadata to support multiple microscopes, I propose making `Offset` an independent variable rather than a fixed constant. The variable could either be a directly encoded attribute or be calculated from other existing attributes.

## Proposal: An attribute for `Offset` at byte offset 992

An independent attribute for `Offset` would be a robust solution, as it can uniquely delineate the separation between metadata and array data. I propose byte offset 992 as the location for an unsigned 64-bit integer in big-endian format, for consistency.

- `0x0000000000000000` indicates that the `Offset` value should be calculated, or assumed to be `1024` for backwards compatibility.
- `0xffffffffffffffff` indicates that there is no meaningful `Offset` for contiguous array data; for example, the array data may be chunked and/or compressed.

## Proposal: Calculate `Offset` from `FileLength` and other attributes

Alternatively, `Offset` could be calculated from the `FileLength` attribute, which currently indicates the end of the array data and is stored at byte offset 1000 as a 64-bit big-endian integer. Since the length of the array can be calculated as the product of `Xresolution`, `Yresolution`, `Number_of_channels`, and the size of the datatype, the `Offset` value can be calculated from `FileLength` as follows:

`Offset = FileLength - (Xresolution * Yresolution * Number_of_channels * sizeof(datatype))`

A new special value for `FileLength` is `0xffffffffffffffff`, which indicates that `FileLength` should be interpreted as the actual end of the file and may not be a reliable value from which to calculate the array data offset. For example, this value should be used if the array data is chunked and/or compressed, or if the CSV recipe data is no longer present in the trailer of the file.

## Example application: Proposed hybrid DAT/HDF5 file
If this proposal is implemented, a hybrid DAT/HDF5 file becomes possible, in which the extra metadata space is used to contain HDF5 metadata according to the HDF5 file format.

A simple contiguous HDF5 file can accommodate a user block of 1 KB or some doubling thereof. This user block can accommodate the existing DAT file metadata. The HDF5 metadata header can be contained within a subsequent 2 KB written by the HDF5 library with an early-allocation flag. A hybrid HDF5/DAT file could therefore be made if `OFFSET` were shifted to 3 KB (3072 bytes).

A potential modification to the LabVIEW writer consists of changing a single constant from `1024` to `3072`. The details of the potential writer modification are out of scope for this proposal. To clarify, the scope of this proposal applies solely to DAT file readers.
## Reader References