IAM and KMS are easily the two most important services in terms of AWS security. KMS is used extensively for cryptographic operations not only within the AWS universe itself, but also in various hybrid architectures. It is a solid service which offers a wide variety of cryptographic services and possibilities. In this article, we will compare, from a security point of view, the usage of the CMK's kms:encrypt versus data-keys.
There is a lot to like about the CMK’s kms:encrypt/decrypt functionality from a security perspective. It encrypts and decrypts data purely through the AWS KMS APIs, so the actual cryptographic keys never leave the KMS service. Access to the cryptographic APIs is managed through IAM roles/policies, the KMS CMK’s key policy, and grants, which gives us full granularity on cryptographic access control. For instance, I can allow person/System A to only encrypt data from a certain IP range or VPC-endpoint, while allowing person/System B to only decrypt it after going through 2FA.
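To make that granularity a bit more concrete, here is a minimal sketch that builds two key-policy statements as a Python structure. The account ID, role names, and CIDR range are made-up placeholders; `aws:SourceIp` and `aws:MultiFactorAuthPresent` are standard IAM global condition keys, but a real policy would need to be adapted and tested for your own setup.

```python
import json

# Hypothetical principals and network range, for illustration only.
SYSTEM_A = "arn:aws:iam::111122223333:role/system-a"
SYSTEM_B = "arn:aws:iam::111122223333:role/system-b"
ALLOWED_CIDR = "203.0.113.0/24"


def build_key_policy_statements():
    """Return two statements: encrypt-only for A, decrypt-only (with MFA) for B."""
    encrypt_only = {
        "Sid": "SystemAEncryptFromKnownRange",
        "Effect": "Allow",
        "Principal": {"AWS": SYSTEM_A},
        "Action": "kms:Encrypt",
        "Resource": "*",
        # Only requests coming from the allowed IP range are accepted.
        "Condition": {"IpAddress": {"aws:SourceIp": ALLOWED_CIDR}},
    }
    decrypt_with_mfa = {
        "Sid": "SystemBDecryptAfterMfa",
        "Effect": "Allow",
        "Principal": {"AWS": SYSTEM_B},
        "Action": "kms:Decrypt",
        "Resource": "*",
        # Only sessions that were authenticated with MFA are accepted.
        "Condition": {"Bool": {"aws:MultiFactorAuthPresent": "true"}},
    }
    return [encrypt_only, decrypt_with_mfa]


print(json.dumps(build_key_policy_statements(), indent=2))
```

For the VPC-endpoint variant, the IP condition could be swapped for a `StringEquals` condition on `aws:SourceVpce`.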
It does not stop there: no traces of cryptographic material (whether symmetric or asymmetric) are left permanently behind in logs, debug output, terminal output, or even memory. It's nice and clean.
A good real-world use-case would be encrypting/decrypting secrets stored in Parameter Store, all of which can be consumed by other AWS-services or backend systems, via assume-roles and STS tokens behind the scenes.
But this is where things quickly hit a wall: the CMK's kms:encrypt functionality only works for data up to a maximum of 4KB (I know, it is not a lot by any means). Hence, it is super useful for envelope encryption or for encrypting small amounts of data (e.g., credit card numbers). However, for anything beyond 4KB of data, we will have to use what is referred to as Data-Keys (kms:GenerateDataKey).
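A small sketch of how calling code might respect that limit: it refuses to send anything larger than 4KB to kms:encrypt. The `kms_client` argument is assumed to be a boto3 KMS client (for example `boto3.client("kms")`), and the function names are my own, not part of any AWS library.

```python
KMS_ENCRYPT_MAX_BYTES = 4096  # the 4KB ceiling discussed above


def fits_kms_encrypt(data):
    """Return True if `data` is small enough for a direct kms:encrypt call."""
    return len(data) <= KMS_ENCRYPT_MAX_BYTES


def encrypt_small_secret(kms_client, key_id, data):
    """Encrypt `data` directly under the CMK, or refuse if it is too large."""
    if not fits_kms_encrypt(data):
        raise ValueError("payload exceeds 4KB; use a data-key instead")
    # The CMK itself never leaves KMS; only the ciphertext comes back.
    response = kms_client.encrypt(KeyId=key_id, Plaintext=data)
    return response["CiphertextBlob"]
```

Something like `encrypt_small_secret(boto3.client("kms"), "alias/my-key", b"4111 1111 1111 1111")` would be a typical call, where `alias/my-key` is a hypothetical alias.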
Data-Keys work completely differently. Let’s say we have a backend system which needs to encrypt and decrypt large amounts of data. It will first call the kms:GenerateDataKey API, which will return a KMS-generated plaintext "encryption/decryption" data-key, along with a copy of that data-key encrypted under the CMK (the CiphertextBlob). That, in essence, is very similar to generating your own AES-256 symmetric key via openssl and using it to encrypt/decrypt data. The only major difference is that the data-keys are generated by the KMS service, which can always re-issue the plaintext data-key from its encrypted copy (i.e., when decryption is needed, via kms:decrypt).
Here is how the API call is made as well as the output given:
$ aws kms generate-data-key --key-id zzzzzzzz-yyyy-zzzz-yyyy-zzzzzzzzzzzz --key-spec AES_256
Anonymized output:
{
"CiphertextBlob": "MADIAY54G%Tyo4r0pl.... …. …. ",
"Plaintext": "ey54rb50obR55Yujj34#24ffsq2&#$Ty0plkmr8Ze+C=",
"KeyId": "arn:aws:kms:us-west-2:000000000000:key/n78ctr2d-16f4-E4TY-7hy5-BL0DFR"
}
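For comparison, here is roughly what the same request looks like from code rather than from the CLI. This is a sketch that assumes `kms_client` is a boto3 KMS client; note that boto3 returns `Plaintext` and `CiphertextBlob` as raw bytes, whereas the CLI output above shows them Base64-encoded. The function name is hypothetical.

```python
def request_data_key(kms_client, key_id):
    """Ask KMS for a new AES-256 data-key.

    Returns (plaintext_key, encrypted_key):
      - plaintext_key: used locally to encrypt data, then discarded
      - encrypted_key: the CMK-encrypted copy, stored next to the data
    """
    response = kms_client.generate_data_key(KeyId=key_id, KeySpec="AES_256")
    # Copy into a mutable bytearray so the caller can overwrite it later.
    plaintext_key = bytearray(response["Plaintext"])
    encrypted_key = response["CiphertextBlob"]
    return plaintext_key, encrypted_key
```

The plaintext key is never printed or written to disk here, which already avoids some of the cleanup that the CLI route requires.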
Now that we have received the data-key, let's explore what AWS recommends when dealing with them:
- "You must use and manage data keys outside of AWS KMS."
- "You can write your own code or use a client-side encryption library, such as the AWS Encryption SDK"
- "Use the plaintext data key (in the Plaintext field of the response) to encrypt your data outside of AWS KMS. Then erase the plaintext data key from memory."
- "Use the Decrypt operation to decrypt the encrypted data key. The operation returns a plaintext copy of the data key."
Say we decided to go the route of writing our own code to manage data-keys. Realistically, here is a list of the steps needed to handle them securely:
1. Clearing the terminal of any output and/or files pertaining to the following command:
$ aws kms generate-data-key
# which returns a Base64-encoded data-key
2. Once we receive the output from step 1, we need to Base64-decode the data-key itself. The output of that command gives us the decoded cleartext data-key (which is what is actually used for encryption/decryption). We therefore need to protect the output of that command, especially if it was printed directly to the terminal, such as:
$ echo 'plaintext_data_key_Base64_encoded' | base64 --decode
3. Similarly to step 2, if a file was used to save the decoded plaintext data-key, we would need to remove it (after the data-key was used to encrypt data); that would be the decoded_datakey.file below:
$ echo 'plaintext_data_key_Base64_encoded' | base64 --decode > decoded_datakey.file
4. Removing all traces of the data-key from the OS's memory, as well as from any temporary caches, log outputs, etc.
5. Clearing the terminal screen of the output of the following command:
$ aws kms decrypt --ciphertext-blob <encoded_encrypted_data_key>
# which returns the Base64-encoded plaintext data-key
6. Repeating steps 2 and 3 after kms:decrypt, for any terminal output and/or any file used to save the decoded plaintext data-key.
7. Repeating step 4.
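To give an idea of what "handling it in our own code" involves, here is a best-effort sketch of the decoding and wiping steps in Python. It keeps the decoded key in a mutable buffer, never prints it or writes it to a file, and overwrites it when done. The helper name is my own, and this is not a guarantee: `base64.b64decode` creates an intermediate immutable `bytes` object that Python gives us no way to erase, which is exactly the kind of gap that makes custom handling risky.

```python
import base64
from contextlib import contextmanager


@contextmanager
def decoded_data_key(b64_key):
    """Yield the decoded data-key as a bytearray, then overwrite it with zeros."""
    key = bytearray(base64.b64decode(b64_key))
    try:
        yield key
    finally:
        # Best effort only: other copies may still exist elsewhere in memory.
        for i in range(len(key)):
            key[i] = 0


# Usage with a made-up key (NOT a real data-key).
sample = base64.b64encode(bytes(range(32))).decode("ascii")
with decoded_data_key(sample) as key:
    key_length = len(key)  # encrypt/decrypt with `key` here
print(key_length)
```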
As we can see, this can quickly become messy if we have to encrypt and decrypt data repeatedly in a production environment which handles sensitive data. The above steps can probably be automated, but luckily we have a much better option.
The AWS Encryption SDK handles a lot of these tasks cleanly and automatically for us, including key wrapping of the data-key itself and memory wiping:
“To protect your data keys, the AWS Encryption SDK encrypts them under one or more key-encryption keys known as wrapping keys or master keys. After the AWS Encryption SDK uses your plaintext data keys to encrypt your data, it removes them from memory as soon as possible. Then it stores the encrypted data keys with the encrypted data in the encrypted message that the encrypt operations return.”
For more info regarding the AWS Encryption SDK: https://docs.aws.amazon.com/encryption-sdk/latest/developer-guide/introduction.html
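As a rough idea of how little code this takes, here is a sketch based on the master-key-provider interface of the `aws-encryption-sdk` Python package (2.x/3.x style). Treat it as an assumption to verify against the documentation for your SDK version, since newer releases also offer a keyring-based interface. It needs valid AWS credentials and a real key ARN to actually run; the function name is hypothetical.

```python
def sdk_round_trip(key_arn, plaintext):
    """Encrypt and then decrypt `plaintext` with the AWS Encryption SDK."""
    # Imported lazily so this file still loads where the SDK is not installed.
    import aws_encryption_sdk
    from aws_encryption_sdk import CommitmentPolicy

    client = aws_encryption_sdk.EncryptionSDKClient(
        commitment_policy=CommitmentPolicy.REQUIRE_ENCRYPT_REQUIRE_DECRYPT
    )
    key_provider = aws_encryption_sdk.StrictAwsKmsMasterKeyProvider(key_ids=[key_arn])

    # The SDK requests the data-key, encrypts the data, and bundles the
    # encrypted data-key into the returned message for us.
    ciphertext, _header = client.encrypt(source=plaintext, key_provider=key_provider)
    decrypted, _header = client.decrypt(source=ciphertext, key_provider=key_provider)
    return ciphertext, decrypted
```

Notice that the plaintext data-key never appears in our own code at all, which is the whole point.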
Data-Keys are necessary if we decide to use the AWS KMS service for the encryption/decryption of large amounts of data, for example within a hybrid cloud environment. Another use-case would be the encryption and decryption of data between two different entities (or companies): one entity encrypts the data via data-keys, and the other entity recovers the data-key by calling kms:decrypt on the ciphertext-blob (the data-key encrypted under the CMK which generated it), then decrypts the data with it. Both entities will use their own IAM roles and permissions independently of each other.
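The hand-off between the two entities can be sketched as follows. The JSON envelope format and function names here are hypothetical, chosen only for illustration; `kms_client` is assumed to be a boto3 KMS client running under the receiving entity's own role, which must be allowed to call kms:decrypt on the CMK. The actual AES encryption and decryption of the payload is left out.

```python
import base64
import json


def pack_envelope(encrypted_key, ciphertext):
    """Sender side: bundle the encrypted data-key with the encrypted payload."""
    return json.dumps({
        "encrypted_data_key": base64.b64encode(encrypted_key).decode("ascii"),
        "ciphertext": base64.b64encode(ciphertext).decode("ascii"),
    })


def recover_data_key(kms_client, envelope_json):
    """Receiver side: have KMS decrypt the data-key carried in the envelope."""
    envelope = json.loads(envelope_json)
    encrypted_key = base64.b64decode(envelope["encrypted_data_key"])
    # For symmetric CMKs, KMS identifies the key from the ciphertext-blob itself.
    response = kms_client.decrypt(CiphertextBlob=encrypted_key)
    plaintext_key = bytearray(response["Plaintext"])
    return plaintext_key, base64.b64decode(envelope["ciphertext"])
```

Only the encrypted data-key travels between the two parties; the plaintext data-key exists only briefly on each side.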
The most important takeaway of this article would be not to use custom code to handle CMK data-keys. Consider it the equivalent of how writing your own "custom AES" encryption algorithm would simply be a bad idea. If using KMS + the AWS Encryption SDK is not an option because of technological or other limitations, I would reconsider the usage of CMK data-keys via KMS in favor of alternative solutions, where we can ensure the security of the cryptographic keys during their entire lifecycle.
By the way, there is also a KMS API for data-keys with asymmetric encryption, namely kms:GenerateDataKeyPair, which returns "a plaintext public key, a plaintext private key, and a copy of the private key that is encrypted under the symmetric CMK you specify". Most, if not all, of the discussion and conclusions above apply to it as well. More info on that: https://docs.aws.amazon.com/kms/latest/APIReference/API_GenerateDataKeyPair.html
References
- https://www.linkedin.com/pulse/aws-kms-security-review-data-keys-versus-kmsencrypt-ziyad-almbasher/
- https://aws.amazon.com/ko/blogs/database/column-level-encryption-on-amazon-rds-for-sql-server/
- https://stackoverflow.com/questions/58200584/how-to-encrypt-data-in-aws-rds-with-aws-kms-on-the-column-level
- https://docs.aws.amazon.com/pdfs/kms/latest/cryptographic-details/kms-crypto-details.pdf