Trying to implement variable-length string arrays (#16) made it evident that having numcodecs-like filters would be a huge step towards supporting general data types; this is the result of multiple discussions with @bogovicj and @axtimwalde, see also the corresponding issue.
However, the `Filter` interface in its current state contains no methods and is not used anywhere. I suggest fleshing out the `Filter` interface such that implementers of this interface

1. are de-/serializable from/to json via an annotation similar to `@CompressionType`;
2. can be daisy-chained (see the sketch after this list).
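To make the intent concrete, here is a minimal sketch of what such an interface could look like. Everything in it is an assumption for illustration: the annotation name `FilterType`, the method names `apply`/`invert`, and the type parameter `T` (the common input/output type discussed below) are placeholders, not an agreed design; only `@CompressionType` exists today as the pattern being mirrored.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

/**
 * Hypothetical fleshed-out Filter interface. T is the common input/output
 * type that makes daisy-chaining possible (buffers or DataBlocks, see below).
 */
public interface Filter<T> {

	/**
	 * Marks a filter implementation with its json type string, analogous to
	 * the existing @CompressionType annotation (name is a placeholder).
	 */
	@Retention(RetentionPolicy.RUNTIME)
	@Target(ElementType.TYPE)
	@interface FilterType {

		String value();
	}

	/** Forward application of the filter (e.g. before compression on write). */
	T apply(T data);

	/** Inverse application, undoing {@link #apply} (e.g. after decompression on read). */
	T invert(T data);
}
```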
For the second point to work, the methods for applying and inverting a filter must have the same input and output type. I see two possibilities for this type:
1. Plain buffers (this is the case in numcodecs). This would either require changing the `BlockReader` and `BlockWriter` interfaces to work with buffers instead of `DataBlock`s, which seems unnatural given their names, or manually exposing the raw data of a `DataBlock` after creation, which seems to go against the intention of the concept.
2. `DataBlock`s. This would allow filters to create a new `DataBlock` if necessary (e.g., when the size of the raw data changes) or to modify the data in place if possible; a chaining sketch follows below.
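Assuming the second option and the hypothetical `Filter<T>` sketch above with `T = DataBlock<?>`, daisy-chaining could look roughly like this; `FilterChain` and its method names are made up for illustration, only `DataBlock` itself is an existing n5 type.

```java
import java.util.List;

import org.janelia.saalfeldlab.n5.DataBlock;

/**
 * Hypothetical helper that applies a list of DataBlock-based filters in order
 * and inverts them in reverse order.
 */
public class FilterChain {

	private final List<Filter<DataBlock<?>>> filters;

	public FilterChain(final List<Filter<DataBlock<?>>> filters) {

		this.filters = filters;
	}

	/** Apply all filters in order, e.g. before compression when writing a block. */
	public DataBlock<?> apply(DataBlock<?> block) {

		for (final Filter<DataBlock<?>> filter : filters)
			// a filter may return a new DataBlock (e.g. if the raw data size
			// changes) or the same block, modified in place
			block = filter.apply(block);
		return block;
	}

	/** Undo all filters in reverse order, e.g. after decompression when reading a block. */
	public DataBlock<?> invert(DataBlock<?> block) {

		for (int i = filters.size() - 1; i >= 0; --i)
			block = filters.get(i).invert(block);
		return block;
	}
}
```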
Adding filters would also allow de-/serializing custom objects in a way that is compatible with the Python implementation of zarr.
A downside of this would be that, for general objects, a `DataBlock` cannot know the number of deserialized bytes before deserialization. This would probably necessitate some changes in the `DataBlock` interface and in the way `DataBlock`s are created in the reading process (right now, they pre-allocate an array of the right size to hold the decompressed data).
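A rough sketch of the kind of change this implies, under the assumption that the decompressed/defiltered bytes have to be fully materialized before the block's data array can be sized (plain-Java illustration, no n5 API referenced):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

public class DeferredAllocation {

	/**
	 * Reads a decompressed stream whose final length is unknown up front.
	 * Only after this returns is the size known, so a DataBlock for general
	 * objects could only be created (or its array allocated) at that point,
	 * instead of being pre-allocated before reading.
	 */
	public static byte[] readAllBytes(final InputStream decompressed) throws IOException {

		final ByteArrayOutputStream out = new ByteArrayOutputStream();
		final byte[] chunk = new byte[8192];
		int n;
		while ((n = decompressed.read(chunk)) != -1)
			out.write(chunk, 0, n);
		return out.toByteArray();
	}
}
```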