zfs: backport patches for block cloning tunable and default disable; zfsUnstable: 2.2.1-unstable-2023-10-21 -> 2.2.1 #269465
Signed-off-by: Jörg Thalheim <[email protected]>
Due to openzfs/zfs#15533, we do not want to update to 2.2.1, but we *do* want patches from 2.2.1 to address block cloning issues (see openzfs/zfs#15529). The first patch does not apply cleanly due to a trivial conflict in the context, so vendor the patch.
@@ -25,4 +25,12 @@ callPackage ./generic.nix args {
  version = "2.2.0";

  sha256 = "sha256-s1sdXSrLu6uSOmjprbUa4cFsE2Vj7JX5i75e4vRnlvg=";

  extraPatches = [
    ./brt-tunable.patch
Do you think we should check this in our zfs nixos test?
% zpool get all | grep feature@block_cloning
zroot feature@block_cloning disabled local
after having the flag?
The pool feature will still be enabled; instead, /sys/module/zfs/parameters/zfs_bclone_enabled will be available and set to 0.
It’s a bit tricky, since with #269452 we will have a variant where we expect it to not be present at all (so we should handle that), but vanilla 2.2.0 won’t have it either, and in that case we want to fail if it’s not present. At first glance, makeZfsTest makes it a bit tricky to switch on the version… I guess we could switch on the version in the test itself?
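The check described above could be sketched roughly as follows. This is only an illustration of the "present and 0" versus "not present at all" cases: the helper names `read_tunable` and `check_bclone` and the `expect_present` switch are hypothetical, and only the sysfs path comes from the comment. Treating an unexpectedly present tunable as a failure is also an assumption.

```python
from pathlib import Path
from typing import Optional

# Path quoted in the comment above; the helpers below are hypothetical.
BCLONE_PARAM = Path("/sys/module/zfs/parameters/zfs_bclone_enabled")


def read_tunable(path: Path) -> Optional[str]:
    """Return the tunable's value with whitespace stripped, or None if the file is absent."""
    try:
        return path.read_text().strip()
    except FileNotFoundError:
        return None


def check_bclone(path: Path, expect_present: bool) -> None:
    """Raise AssertionError unless the tunable matches the expectation.

    expect_present=True:  the file must exist and contain "0".
    expect_present=False: the file must not exist (assumed policy).
    """
    value = read_tunable(path)
    if expect_present:
        assert value is not None, f"{path} is missing"
        assert value == "0", f"{path} is {value!r}, expected '0'"
    else:
        assert value is None, f"{path} unexpectedly present ({value!r})"
```

Inside a NixOS test script the file lives in the guest, so the value would have to be fetched from the VM (for instance with `machine.succeed("cat ...")`) rather than read locally as done here.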
You can definitely encode version checking inside makeZfsTest if you want; you just have to use package and look at its version.
This release is still exposed to a different issue, openzfs/zfs#15533 (comment), although it does not appear to result in any corrupted data.
I don't think we should go that deeply into zfs. Let's rather stay on an older version than start patching things together.
@SuperSandro2000 Generally I agree, but folks who have pools with active v2.2+ pool features cannot trivially downgrade.
We may want to wait for 2.2.2? Block cloning does not appear to be the issue, and there is a partial mitigation for 2.2.0 or previous versions in
I believe there are two things that need fixing:
We now know that block cloning doesn't appear to be the root cause of the corruption issue (openzfs/zfs#15526 (comment)), so I think we should be fine with just the patch for (1). The zfsUnstable part of the PR is still affected by (2).
I think we should be able to converge on a single proposal; having the discussion split between two PRs is a bit awkward. Can we close one? I feel updating to v2.2.1 and cherry-picking the fix + revert makes the most sense. Or wait for upstream to release, though it would be nice to have a version of ZFS that doesn't eat data when NixOS 23.11 is released...
For reference, the upstream issues:
I'm hesitant to adopt openzfs/zfs#15571 even though this patch was applied in zfs_2_1. While this may be the patch that is accepted, I am not the person to make that decision. As this bug has been around for a long time, and has only become more common with recent changes, I propose we instead roll back those recent changes. My recommendation would be to proceed with 2.2.1, which disables block cloning, and follow Gentoo in patching zfs_dmu_offset_next_sync=0 back to the default. Their patch is here: https://gitweb.gentoo.org/repo/gentoo.git/tree/sys-fs/zfs-kmod/files/zfs-kmod-2.2.1-Disable-zfs_dmu_offset_next_sync-tunable-by-default.patch These two should hopefully be enough to calm some fears while upstream has a chance to fix the underlying bug.
I have been testing with https://github.com/numinit/nixpkgs/tree/zhammer (managed to repro it in NixOS tests) so we have some hard data to base our decision on. Summary here: openzfs/zfs#15526 (comment). Based on these data (10 million is still running) I would recommend 2.2.1 with @robn's patch (openzfs/zfs#15571), but I understand the hesitancy since it is not yet upstreamed. Note that it is still possible to repro with zfs_dmu_offset_next_sync=0, and I have done so in NixOS tests.
I think it would be good to get something merged today if possible, if for no other reason than to reduce the risk. If others feel comfortable applying openzfs/zfs#15571 then I'll support that. @amarshall seeing as this is your PR, what is your opinion and how would you like to proceed?
Closing, see #269097 (comment)
Description of changes
Alternative to #269097. Still bumping zfsUnstable to 2.2.1 as I feel that probably still makes sense.