Add production rules for dataset byte streams #56
This is my proposal to start a discussion...

A few explanations:
- Because storage layout information is available elsewhere, I did not distinguish between the byte streams of a chunked dataset and the single byte stream of a contiguous dataset. The `byteStreams` key will always hold an array of byte stream information.
- For the same reason, each byte stream's information carries its location in the dataset's dataspace under the `dspace_anchor` key. For contiguous datasets, its value will always be `[0, 0, ...]`.
- Checksum information has two keys: type (MD5, SHA1, a URI, etc.) and value. The type information is repeated for every byte stream, but I wanted to allow byte stream checksums of different types.
- The spec describes the checksum value simply as an ASCII string without the slash, but we may want to be more accurate here.
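To make the discussion concrete, here is a hypothetical sketch of what a dataset's byte stream description could look like under this proposal. Only `byteStreams` and `dspace_anchor` are key names from the proposal itself; the remaining keys (`file_offset`, `size`, `checksum`, `type`, `value`) and all values are illustrative assumptions, not part of any agreed spec:

```json
{
  "byteStreams": [
    {
      "dspace_anchor": [0, 0],
      "file_offset": 2048,
      "size": 4096,
      "checksum": {
        "type": "MD5",
        "value": "9e107d9d372bb6826bd81d3542a419d6"
      }
    },
    {
      "dspace_anchor": [0, 100],
      "file_offset": 6144,
      "size": 4096,
      "checksum": {
        "type": "SHA1",
        "value": "2fd4e1c67a2d28fced849ee1bb76e7391b93eb12"
      }
    }
  ]
}
```

A contiguous dataset would look the same except that `byteStreams` holds a single element whose `dspace_anchor` is all zeros; note also how each element carries its own checksum `type`, which is what allows mixing checksum algorithms across byte streams.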