Replies: 1 comment
-
Everything in ZFS (compression, encryption, checksums, dedup, etc.) works over logical blocks (dataset recordsize, zvol volblocksize, etc.), each pointed to by a block pointer, which itself holds the checksum, compression and encryption parameters, and so on. Only space allocation works in terms of ashift, which is close (or equal) to the disk sector size.
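To make the "each block pointer carries its own checksum" point concrete, here is a simplified C sketch of the idea. The field names loosely follow the OpenZFS blkptr_t, but this is not the exact on-disk definition; it only illustrates that there is one checksum (and one set of compression/encryption properties) per logical block, not per sector:

```c
#include <stdint.h>

/* Where a copy of the block lives (vdev, offset, allocated size in ashift units). */
typedef struct dva {
	uint64_t dva_word[2];
} dva_t;

/* 256-bit checksum computed over the whole logical block. */
typedef struct zio_cksum {
	uint64_t zc_word[4];
} zio_cksum_t;

/*
 * Simplified block pointer: one per logical block (a dataset record or a
 * zvol volblock).  The checksum covers the entire block that the DVAs
 * point to; individual disk sectors are never checksummed on their own.
 */
typedef struct blkptr_sketch {
	dva_t		bp_dva[3];	/* up to three copies of the block */
	uint64_t	bp_prop;	/* packed properties: checksum algorithm,
					 * compression, logical/physical size, ... */
	uint64_t	bp_birth_txg;	/* transaction group it was written in */
	zio_cksum_t	bp_cksum;	/* one checksum for the whole logical block */
} blkptr_sketch_t;
```

On read, ZFS recomputes the checksum over the full logical block and compares it against the one stored in the parent block pointer; in a mirror, a mismatch on one copy triggers a read of, and repair from, the other copy.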
-
I'm curious about the relationship between ZFS checksums and data blocks. Specifically, what does ZFS actually checksum: the low-level disk blocks (i.e. sectors) or the high-level records (i.e. zvol blocks or dataset files)? I have one source (iXsystems) saying it's the former and another (a blog) saying it's the latter.
Why does this matter to me? Granularity of recovery. At minimum, the difference here determines whether I'd be left with something or nothing in a worst-case scenario.
Let's say I have a 1MB file, my dataset's record size is also 1MB, and my pool is a simple two-disk mirror. If a checksum is generated per record, one corrupt byte would require recovering a full megabyte of data from the mirror disk. What if, by some fluke, a sector or two of that record were also lost on the mirror? Now neither copy matches its checksum and I've lost that file completely. If ZFS calculated checksums for each disk block, it would be another story entirely: (1) it could likely recover the record completely (the chance of the exact same sector going corrupt on both disks is tiny), or (2) it could at least serve me the blocks that still match their checksums.
I'm sure some will say the chance of data being lost in two places for the same record is exponentially small, and they're probably right. But, as ZFS adds support for larger and larger record sizes, that exponent shrinks. :-)
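To put rough numbers on that worry, here is a small back-of-envelope sketch (not anything ZFS actually does; the per-sector corruption probability, the independence assumption, and the 4K sector size are all made up for illustration). It compares the chance of losing a whole record under record-granular checksums with the chance under a hypothetical sector-granular scheme, on a two-disk mirror:

```c
#include <math.h>
#include <stdio.h>

int main(void)
{
	double p = 1e-9;                   /* assumed per-sector corruption probability */
	int sector = 4096;                 /* assumed 4K sectors (ashift=12) */
	int recsizes[] = { 128 * 1024, 1024 * 1024, 16 * 1024 * 1024 };

	for (int i = 0; i < 3; i++) {
		int n = recsizes[i] / sector;  /* sectors per record */

		/* Record-granular checksum: the record is lost if ANY sector of
		 * it is bad on disk A AND ANY sector of it is bad on disk B. */
		double rec_loss = pow(1.0 - pow(1.0 - p, n), 2.0);

		/* Hypothetical sector-granular checksums: data is lost only if
		 * the SAME sector is bad on both disks (~ n * p^2 for small p). */
		double sec_loss = n * p * p;

		printf("recordsize %8d: record-granular ~ %.2e, "
		       "sector-granular ~ %.2e (ratio ~ %.0f)\n",
		       recsizes[i], rec_loss, sec_loss, rec_loss / sec_loss);
	}
	return 0;
}
```

Under these (very crude) assumptions, the record-granular loss probability grows roughly with the square of the number of sectors per record, while the hypothetical sector-granular loss grows only linearly, so the gap widens as recordsize grows.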