Added argon2id as KDF. #14836
base: master
Conversation
Added Argon2 as a KDF, which is memory- and CPU-intensive, to prevent brute-force attacks. There are a few more options available now: `keykdf`, `argon2_t_cost`, `argon2_m_cost`, `argon2_parallelism`, `argon2_version` (hidden). The options are stored on the disks, and the load-key function uses the correct KDF based on the information stored on the disks. Signed-off-by: Kay Pedersen <[email protected]>
You don't seem to have added a pool feature flag for this, which would cause things to break pretty badly on older systems.
I don't think that's true. Nothing changes in the on-disk format; it's just a few extra properties on the dataset. Older ZFS userspace won't necessarily be able to derive the key to unlock the dataset, but can still work with the pool, and the operator could just load the raw key in instead. For old pools on new userspace, it just means userspace needs to understand that a missing `keykdf` property implies the legacy PBKDF2 behaviour.
I'm going to strongly disagree that "it just fails to unlock the datasets and you could find a way to extract and feed in the raw key" is a reasonable "we don't need a feature flag" reason.
This does not change the on-disk format. It does not prevent the pool being imported, scrubbed, or datasets being sent and received. It does not prevent datasets without encryption, or datasets using a non-Argon2 key method, from being mounted (or prevent using an alternate method to derive the key, or even a newer userspace). Stopping the pool from being imported at all seems heavy-handed, and "break pretty badly" seems exaggerated.
It turns out, writing new values where they did not exist before is a format change. It may be a compatible one, but "is not a format change" is inaccurate. It's possible the project will change its stance, but historically, "cannot mount the filesystem if in use" has been sufficient criteria for a feature flag. If you think that's too heavy-handed, a more fine-grained feature approach might be nice, but currently, I would suggest this is definitely past the criteria of needing something to indicate a change. And it's not read-only compatible, so it would, indeed, require blocking import.
I'm not trying to relitigate anything; I'm still fairly new here, and I'm happy to accept that my understanding is wrong. My understanding comes from reading docs and code and seeing that missing features prevent pool import entirely, and so should be avoided unless obviously necessary. This seems especially true for encryption, when one of the selling points is that you don't need the keys in order to receive streams and manage the pool. I'm more than comfortable accepting that may be a wrong or incomplete conclusion, though. I will note that similar observations led me to not add a feature flag in #14249, and to add #14577 to help in the future. Per-dataset feature flags are probably what we want here, and for my thing too. I've got a sketch of that on a branch, which I'll see if I can polish up soon. I wouldn't wait for me though; like everyone, I have limited time at the moment.
I'm not sure if a feature flag is important or not. I totally understand, @rincebrain, that the datasets are limited on older ZFS versions. Not all of them can be read, and I don't know whether send and recv work without any problems. I made some tests to check what is working and what isn't, but not enough.
It's not a blocker for now, but I'm just making a note of this:
Oh, yes, you are right. I will create a TODO list and add it to the first post later, so we can track the TODOs that are important or necessary to finish the pull request.
Hi @oromenahar, thanks for working on this. Any thoughts on this PR vs #15682? It looks to me (as an outsider) like this PR neatly exposes the important parameters, but then the PR seems to have lost steam. The other PR looks like it would need some more work to get to about this point. [edit: fixed link]
Hi @ongardie, I am not happy with #15682 for the same reasons I'm not happy with my own PR. That's why this lost a little bit of steam.
Hi @oromenahar, any news on this?
Added Argon2 as a KDF, which is memory- and CPU-intensive, to prevent brute-force attacks. There are a few more options available now: `keykdf`, `argon2_t_cost`, `argon2_m_cost`, `argon2_parallelism`, `argon2_version` (hidden). The options are stored on the disks, and the load-key function uses the correct KDF based on the information stored on the disks.

Important note: Please be patient, it is my first PR on a big project and I tried my best; also, I have just a little more than a basic understanding of crypto.
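To make the "load-key picks the correct KDF from on-disk information" idea concrete, here is a minimal, hypothetical sketch in Python. None of this is ZFS code: the `props` dict, the `derive_key` helper, and the fallback values are all illustrative assumptions, and the Argon2id branch is left unimplemented because it would need the external Argon2 reference library.

```python
# Hypothetical sketch of KDF selection during key load.
# All names here (props, derive_key) are illustrative, not ZFS code.
import hashlib

def derive_key(props: dict, password: bytes, salt: bytes) -> bytes:
    # A dataset created by an older ZFS has no "keykdf" property,
    # so a missing value must fall back to the legacy PBKDF2 path.
    kdf = props.get("keykdf", "PBKDF2")
    if kdf == "PBKDF2":
        iters = props.get("pbkdf2iters", 350000)  # assumed default
        return hashlib.pbkdf2_hmac("sha256", password, salt, iters, 32)
    if kdf == "Argon2id":
        # Would call into the Argon2 reference library, passing the
        # argon2_t_cost / argon2_m_cost / argon2_parallelism /
        # argon2_version values read from the on-disk properties.
        raise NotImplementedError("argon2id requires libargon2")
    if kdf == "none":
        return password  # raw key, no derivation
    raise ValueError(f"unknown keykdf: {kdf}")

old_dataset = {}  # no keykdf property -> legacy PBKDF2 behaviour
key = derive_key(old_dataset, b"hunter2", b"per-dataset-salt")
print(len(key))  # 32-byte wrapping key
```

The point of the sketch is the dispatch shape: backward compatibility falls out of treating an absent `keykdf` as PBKDF2, which is what lets old pools keep working on new userspace.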
Motivation and Context
`PBKDF2` is a KDF which can be brute-forced by ASICs or (GP)GPUs. Encryption of data is getting more important, but not all PCs or servers provide a TPM to store secrets for data encryption, or the TPM can't be used for some reason. A KDF derives a key from a password provided by the user, which ZFS uses to encrypt the data. The KDF should be safe against several different types of attacks, and Argon2 solves this problem. It is a modern memory-hard KDF which can use multiple CPUs and a configurable amount of RAM to prevent brute-force attacks, and it won the Password Hashing Competition (PHC). I think it should be possible to use a modern KDF directly in ZFS rather than an extra script which generates a raw key for ZFS. Two issues for more information: #14762, #10764.
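To illustrate the baseline being replaced: PBKDF2's only tunable cost is its iteration count, so an attacker with many cheap cores scales linearly, while Argon2id additionally forces each guess to touch a configurable amount of RAM. A small stdlib sketch (illustrative parameters; not ZFS code):

```python
# Sketch: PBKDF2's cost knob is iterations only. Parameters below are
# illustrative, not the values ZFS uses.
import hashlib

password = b"correct horse battery staple"
salt = b"per-dataset-salt"

# Each of the 350,000 iterations is a cheap HMAC-SHA256 step: trivially
# parallelizable across guesses on GPUs/ASICs with almost no memory.
key = hashlib.pbkdf2_hmac("sha256", password, salt, 350000, 32)
print(len(key))  # 32-byte derived key

# Argon2id, by contrast, would also force each guess to fill and re-read
# m_cost KiB of RAM, which is what makes hardware brute force expensive.
```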
Description
This feature adds options to store the Argon2 configuration on disk. There are four visible new options, `keykdf`, `argon2_t_cost`, `argon2_m_cost`, and `argon2_parallelism`, plus `argon2_version`, which is hidden. The `keykdf` option stores one of `none`, `PBKDF2`, or `Argon2id`. The `argon2_t_cost` option is the equivalent of `pbkdf2iters`, but for Argon2id. I think it's better to use an extra iteration option for this, because depending on the other options you configure, it can be pretty hard to derive a key. For example, 35,000 rounds with `1<<20` KiB of memory will run forever, but 10 rounds will derive a key in a 'small' amount of time, depending on the hardware. The `argon2_m_cost` option configures how much memory (in KiB) is needed to perform the key derivation. The `argon2_parallelism` option stores how many CPUs the algorithm should use. The `argon2_version` option just stores the Argon2 version and is not visible to the user, only to the programmer; this version can change, so I think it should be stored. I chose some default options for CPU and memory.

OpenSSL doesn't support Argon2 now, and won't within a short amount of time, so another library is needed. I searched for other crypto libraries which support Argon2 and found the Argon2 reference library (Argon2Lib). It looks like cryptsetup also uses a copy of this for its Argon2 support. There are some libraries in other languages, like `C++` or `Rust`, but I followed cryptsetup and used the C library Argon2Lib.

Discussion
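The PR proposes a default memory cost of `1<<20` KiB and a `t_cost` of at least 10. A back-of-envelope check of what those numbers mean per key derivation (illustrative arithmetic only; the "memory traffic" estimate is a rough model, not a benchmark):

```python
# Rough numbers for the proposed Argon2id defaults (illustrative only).
m_cost_kib = 1 << 20   # argon2_m_cost, in KiB
t_cost = 10            # argon2_t_cost, passes over the memory area

mem_bytes = m_cost_kib * 1024
print(mem_bytes // (1024 ** 3))  # 1 -> exactly 1 GiB per derivation

# Argon2 sweeps the whole memory area roughly t_cost times, so the
# memory traffic per password guess is on the order of:
traffic_gib = t_cost * mem_bytes / (1024 ** 3)
print(traffic_gib)  # 10.0 GiB moved per guess
```

This is what makes the trade-off in the discussion concrete: every open-at-once dataset using these defaults pins about 1 GiB of RAM during unlock, which may be too much for small hardware and not enough for big hardware.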
A second KDF is not a big problem right now, but if more and more KDFs are added, the `PROP` list gets longer. I think it's good to have just a few, but state-of-the-art, encryption/KDF algorithms.

I thought about the default options for Argon2, but I'm not sure: for small hardware it could be too much, but for bigger hardware it is not enough. If some datasets are already open and in use, maybe there is not much memory/CPU left to decrypt the next dataset. I think the default memory cost should be `1<<20` KiB (1 GiB). Also, I think the `T_COST` should be at least 10, but this needs to be discussed with others as well.

TODO
How Has This Been Tested?
I tested it with virtual Debian and Arch Linux machines and virtual disks. I tested upgrading the pool from an older version, and also tried using an older version to read a dataset which was used, but not upgraded, by the new version. If the dataset is encrypted with a password and PBKDF2, it is backward compatible. I ran some of the zfs-test and zpool-test scripts as well; the scripts I ran finished without any error. I will continue to run the tests. I don't know if this works under other operating systems.
Future work
Types of changes
New feature (non-breaking change which adds functionality)

Checklist:
My code follows the OpenZFS code style requirements (a style check complains about `line > 80`, but my editor doesn't think so?!)

All commit messages are properly formatted and contain `Signed-off-by`.