Allow block cloning across encrypted datasets. #14705
Block cloning is now possible between two encrypted datasets that share the same encryption root.
Signed-off-by: Pawel Jakub Dawidek <[email protected]>
Can we add an encrypted block cloning test to the test suite?
What happens if you clone across datasets, and then use
Also, what happens when you have two encryption roots, I'm no expert on ZFS internals, but to throw out a few ideas:
And nobody knows the answer to such an important question?
I have a gut feeling about the answer (namely: "it would be bad") but it's more that these kinds of edge cases need to be studied and understood (and if there's a problem, solved) before this could go live. It's not a big deal, it just needs someone with enough skill and time to look into it.
I guess that zfs change-key should create a new copy, otherwise bad things might happen.
[UPDATE] This post contains errors. Leaving it as originally written, see corrections further down.
I'm not following how Suppose we start with two datasets A block Now, we clone When we read a file in (Aside: The docs say the salt and derived key are cached when encrypting blocks to avoid deriving the key needlessly. Is it also cached when reading blocks?) Back to our question: What happens when we We wrap We need to ensure that our earlier assumption always holds: If we clone a block from With that, cloning across encrypted datasets should "just work". But there may be additional security considerations. (Caveat: I'm an amateur!)
I want to check this. Right now I'm a little limited on time; I guess I can do it by the end of this month or the middle of next month. I'm not sure, my calendar is full. I'm also working on some better key handling: I want to improve PR #14836 and have some of the work already done. While working on that, I'm going to try to understand how ZFS handles encryption, and I want to check this PR as well, including how clones and changing keys on datasets behave.
@Majiir thanks for the analysis. When I originally asked "what if you change encryption root" I didn't know very much about OpenZFS encryption. These days I know a little more, but still not very much. I think you're right insofar as changing the encryption root only updates the wrapping keys, but I think we still can't clone between arbitrary encrypted datasets.
So as far as I understand it, each dataset has its own master key, except for snapshots and clones created from snapshots, which share the master key of their parent (that is, anything that causes Given that, it seems like it should be possible to allow block cloning within a dataset, its snapshots and their clones, because they share a master key. However, that might not be safe to do in terms of information leakage. The "CONSIDERATIONS FOR DEDUP" commentary at the top of But even then, that comment makes it clear that this still only allows dedup within "clone groups". Which makes sense if they are still reliant on their master key. So I reckon cloned blocks between a dataset, its snapshots, and clones from those snapshots will work, because the dedup precedent suggests that it's safe to do generally, and we can use the blocks as-is because we don't need the additional signal to reuse the block. But, it won't work between arbitrary datasets, because the master keys are different. And the changing encryption root is irrelevant! If so, that's pretty great, since cloning from a snapshot back to its primary dataset is the major use case for cross dataset cloning.
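To make the wrapping-key versus master-key distinction in the comment above concrete, here is a toy model in Python. It is only a sketch of the reasoning, not OpenZFS code: `ToyDataset`, `toy_cipher`, and the dataset names are hypothetical, the "cipher" is an insecure XOR keystream, and the per-block key derivation (salt and derived key) that real ZFS performs is left out entirely.

```python
import hashlib
import os
from dataclasses import dataclass


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256-based keystream (toy only, NOT secure).

    Applying it twice with the same key returns the original data.
    """
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(d ^ s for d, s in zip(data, stream))


@dataclass
class ToyDataset:
    """Hypothetical stand-in for a dataset: it stores only a wrapped master key."""
    name: str
    wrapped_master_key: bytes

    @classmethod
    def create(cls, name: str, wrapping_key: bytes) -> "ToyDataset":
        # Assumption from the thread: a new dataset gets a fresh master key.
        return cls(name, toy_cipher(wrapping_key, os.urandom(32)))

    def master_key(self, wrapping_key: bytes) -> bytes:
        return toy_cipher(wrapping_key, self.wrapped_master_key)

    def change_key(self, old_wrapping_key: bytes, new_wrapping_key: bytes) -> None:
        # Only the wrapping changes; the master key itself stays the same.
        master = self.master_key(old_wrapping_key)
        self.wrapped_master_key = toy_cipher(new_wrapping_key, master)

    def clone_from_snapshot(self, name: str) -> "ToyDataset":
        # Assumption from the thread: snapshot clones share the parent's master key.
        return ToyDataset(name, self.wrapped_master_key)


old_key, new_key = os.urandom(32), os.urandom(32)
ds1 = ToyDataset.create("pool/ds1", old_key)
ds2 = ToyDataset.create("pool/ds2", old_key)
ds1_clone = ds1.clone_from_snapshot("pool/ds1-clone")

plaintext = b"thirty-two bytes of file content"
block = toy_cipher(ds1.master_key(old_key), plaintext)  # block written by ds1

readable_by_clone = toy_cipher(ds1_clone.master_key(old_key), block) == plaintext
readable_by_ds2 = toy_cipher(ds2.master_key(old_key), block) == plaintext

ds1.change_key(old_key, new_key)
readable_after_change = toy_cipher(ds1.master_key(new_key), block) == plaintext
```

In this model the snapshot clone can read the block and the unrelated dataset cannot, while re-wrapping under a new key leaves old blocks readable, which is the sense in which the encryption root is irrelevant to whether a cloned block can be decrypted.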
I gave this patch a testing shot and it looks good.
Test results
I'm working on writing up these test cases in a NixOS VM test. I was able to reproduce @mmatuska's results, which are notable because they appear to contradict the statement that different datasets created in the same encryption root have different master keys. With my test setup, I took a stab at answering a few questions from upthread:
If the datasets originally share an encryption root, then the clone works. That would imply that the two datasets also have the same master key. After If the datasets do not share an encryption root, then the clone doesn't happen (the blocks are copied normally). This happens even if the datasets have the same master key, so this is a missed opportunity to clone blocks.
If the two encryption roots were previously part of the same encryption root, and therefore have the same master key, then inheriting the same parent encryption root again seems to allow blocks to be cloned safely. If the two datasets began life with two different encryption roots, ... it still works?!?!?!??? I thought maybe something funny was happening with ZFS ARC, so I rebooted the VM. But the two files still hashed the same. Finally, I made it fail by rebooting the VM and hashing the cloned file first, before the original blocks were read (and presumably cached, unencrypted). Logs were spammed like this for each block:
Armed with that knowledge, I repeated the simple block cloning test across two datasets under the same encryption root, this time with a reboot between the cloning and the hashing, and by hashing the cloned target before reading the original file. Sure enough, the reads fail. So, what does this all mean?
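The ordering effect described above (reads succeed while the original blocks are cached, and fail after a reboot when the clone is read first) can be illustrated with a small toy model. This is a sketch of the hypothesis in the comment, not of the real ARC: `ToyPool` and its methods are hypothetical, and it assumes a cache of decrypted data keyed only by block, which the comment presumes but does not verify.

```python
class ToyPool:
    """Hypothetical pool: on-disk blocks are tied to one key, the cache holds plaintext."""

    def __init__(self):
        self.disk = {}   # block id -> (id of the key that can decrypt it, plaintext)
        self.cache = {}  # block id -> plaintext already decrypted by some reader

    def write(self, block_id, key_id, plaintext):
        self.disk[block_id] = (key_id, plaintext)

    def read(self, block_id, key_id):
        # Assumption: a cache hit returns plaintext without checking the reader's key.
        if block_id in self.cache:
            return self.cache[block_id]
        owner_key, plaintext = self.disk[block_id]
        if owner_key != key_id:
            raise OSError("decryption failed: wrong master key")
        self.cache[block_id] = plaintext
        return plaintext

    def reboot(self):
        self.cache.clear()


pool = ToyPool()
pool.write(block_id=1, key_id="src-master-key", plaintext=b"data")

# The cloned file references block 1 but is read with the target dataset's key.
pool.read(1, "src-master-key")               # original read first: fills the cache
warm_read = pool.read(1, "dst-master-key")   # served from the cache, looks fine

pool.reboot()
try:
    pool.read(1, "dst-master-key")           # clone read first after reboot
    cold_read_failed = False
except OSError:
    cold_read_failed = True
```

Under these assumptions a test that hashes the original file before the clone would pass even though the clone is unreadable on its own, which is why the reboot and the read order matter in the procedure above.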
Thoughts on master keys
Each dataset having a different master key limits the utility of block cloning with encryption. Some use cases are still supported (like copying from a snapshot) but some attractive use cases are not, like moving or copying a file between different filesystems. On the other hand, the master keys being different allows a user to For block cloning to be most useful for encrypted datasets, I think we want two new capabilities:
I opened a PR with tests and a rebase to the current master. Don't know how to add commits to this PR. (sorry)
We cannot check against the common encryption root. Here is a test that makes this not work:
Now the datasets rpool/ENC1/ds1 and rpool/ENC1/ds2 are under the same encryption root but have different master keys. The safest way would be to compare master keys (or check whether the same master key is used); the second safest would be to allow only snapshots and clones of the same dataset, because these have to use the same key.
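The difference between the two checks can be sketched as two predicates over a simplified dataset record. This is an illustrative Python model only: `DatasetInfo`, `master_key_id`, both function names, and the `ds1-clone` dataset are hypothetical, and nothing here reflects how OpenZFS actually stores or compares keys.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetInfo:
    """Hypothetical summary of a dataset's encryption state."""
    name: str
    encryption_root: str
    master_key_id: int  # stand-in for "which master key encrypts this dataset"


def may_clone_by_encryption_root(src: DatasetInfo, dst: DatasetInfo) -> bool:
    # The check argued against above: same encryption root is treated as enough.
    return src.encryption_root == dst.encryption_root


def may_clone_by_master_key(src: DatasetInfo, dst: DatasetInfo) -> bool:
    # The stricter check: both datasets must use the same master key.
    return src.master_key_id == dst.master_key_id


# Same encryption root, different master keys (the situation the test produces).
ds1 = DatasetInfo("rpool/ENC1/ds1", encryption_root="rpool/ENC1", master_key_id=1)
ds2 = DatasetInfo("rpool/ENC1/ds2", encryption_root="rpool/ENC1", master_key_id=2)
# A clone created from a snapshot of ds1 shares ds1's master key (hypothetical name).
ds1_clone = DatasetInfo("rpool/ENC1/ds1-clone", encryption_root="rpool/ENC1", master_key_id=1)

root_check_allows = may_clone_by_encryption_root(ds1, ds2)  # allowed, but unsafe
key_check_allows = may_clone_by_master_key(ds1, ds2)        # correctly refused
key_check_allows_clone = may_clone_by_master_key(ds1, ds1_clone)
```

The encryption-root predicate accepts the ds1/ds2 pair even though the target could not decrypt the blocks, while the master-key predicate rejects it and still permits cloning between a dataset and a clone made from its snapshot.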
Replaced by #15544, which has been rebased.