-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
zfs-2.1.14 patchset #15601
zfs-2.1.14 patchset #15601
Conversation
Over its history this the dirty dnode test has been changed between checking for a dnodes being on `os_dirty_dnodes` (`dn_dirty_link`) and `dn_dirty_record`. de198f2 Fix lseek(SEEK_DATA/SEEK_HOLE) mmap consistency 2531ce3 Revert "Report holes when there are only metadata changes" ec4f9b8 Report holes when there are only metadata changes 454365b Fix dirty check in dmu_offset_next() 66aca24 SEEK_HOLE should not block on txg_wait_synced() Also illumos/illumos-gate@c543ec060d illumos/illumos-gate@2bcf0248e9 It turns out both are actually required. In the case of appending data to a newly created file, the dnode proper is dirtied (at least to change the blocksize) and dirty records are added. Thus, a single logical operation is represented by separate dirty indicators, and must not be separated. The incorrect dirty check becomes a problem when the first block of a file is being appended to while another process is calling lseek to skip holes. There is a small window where the dnode part is undirtied while there are still dirty records. In this case, `lseek(fd, 0, SEEK_DATA)` would not know that the file is dirty, and would go to `dnode_next_offset()`. Since the object has no data blocks yet, it returns `ESRCH`, indicating no data found, which results in `ENXIO` being returned to `lseek()`'s caller. Since coreutils 9.2, `cp` performs sparse copies by default, that is, it uses `SEEK_DATA` and `SEEK_HOLE` against the source file and attempts to replicate the holes in the target. When it hits the bug, its initial search for data fails, and it goes on to call `fallocate()` to create a hole over the entire destination file. This has come up more recently as users upgrade their systems, getting OpenZFS 2.2 as well as a newer coreutils. However, this problem has been reproduced against 2.1, as well as on FreeBSD 13 and 14. This change simply updates the dirty check to check both types of dirty. If there's anything dirty at all, we immediately go to the "wait for sync" stage, It doesn't really matter after that; both changes are on disk, so the dirty fields should be correct. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Rich Ercolani <[email protected]> Signed-off-by: Rob Norris <[email protected]> Closes openzfs#15571 Closes openzfs#15526
Thanks @tonyhutter. |
@tonyhutter So the dkms installation mess (modules not being built, broken dkms status) is not carried on (2.1.x -> 2.1.y and 2.1.x -> 2.2.x upgrades). |
@tonyhutter One other thing: |
It would be nice to have a fix for: |
When doing a manual TRIM on a zpool, the metaslab being TRIMmed is potentially re-enabled before all queued TRIM zios for that metaslab have completed. Since TRIM zios have the lowest priority, it is possible to get into a situation where allocations occur from the just re-enabled metaslab and cut ahead of queued TRIMs to the same metaslab. If the ranges overlap, this will cause corruption. We were able to trigger this pretty consistently with a small single top-level vdev zpool (i.e. small number of metaslabs) with heavy parallel write activity while performing a manual TRIM against a somewhat 'slow' device (so TRIMs took a bit of time to complete). With the patch, we've not been able to recreate it since. It was on illumos, but inspection of the OpenZFS trim code looks like the relevant pieces are largely unchanged and so it appears it would be vulnerable to the same issue. Reviewed-by: Igor Kozhukhov <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Jason King <[email protected]> Illumos-issue: https://www.illumos.org/issues/15939 Closes openzfs#15395
ab9a6b6
to
dad4659
Compare
The order in fs/Makefile doesn't matter, the order in fs/Kconfig is preserved (ext2 is included as the first thing in the first if BUILD block, and only once), but I don't think it matters much either, realistically Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes openzfs#13316
META file and changelog updated. Signed-off-by: Tony Hutter <[email protected]>
dad4659
to
d99134b
Compare
Closing. 2.1.14 has been tagged. |
Motivation and Context
Patchset for 2.1.14. This release was put out to include the fix for dirty dbuf corruption: #15526
Description
Include fix for data corruption. Full details in: #15526.
How Has This Been Tested?
Types of changes
Checklist:
Signed-off-by
.