Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial version of encryption spi proposal #355

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

tsaarni
Copy link

@tsaarni tsaarni commented Sep 18, 2024

In the earlier Vault SPI design proposal, a second phase was proposed to introduce encryption and decryption capabilities. This pull request builds upon that proposal and introduces a new SPI, called Encryption SPI, designed to protect secrets during storage in the database.

@tsaarni
Copy link
Author

tsaarni commented Sep 18, 2024

I've created a discussion thread also here: keycloak/keycloak#33045

Copy link
Contributor

@stianst stianst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I read it correctly the intent is to intercept specific values such as client secret, to then not store these in the Keycloak DB, but rather store these in the Encryption SPI provider?

If that's the case it might be better to extend on Vault SPI to allow write of secrets, and not just reads, rather than introduce another SPI completely. Especially considering the fact that you would need to read the secrets, which the Vault SPI already takes care of.

This can include fields like username, email, or other fields that can be considered as privacy-sensitive but need to be searchable.


### Performance
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I very much doubt this would scale well for things like client secrets and OTP secrets.

For client secrets hashing may be a better option than storing in an external vault.

Copy link
Author

@tsaarni tsaarni Sep 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are probably right that the performance would not be sufficient.

I came across the discussion you've had regarding hashing and it seems like a good option as well. My concern is that it only focuses on client secrets, and using encryption might allow us to avoid the backward compatibility issues caused by hashing.

I’ve been studying Kubernetes' KMSv2 implementation for securing Secrets. They've managed to create a highly efficient solution by minimizing calls to external encryption services, though it adds complexity on the Kubernetes side rather than in the KMS side. At a high level:

  • Each secret is encrypted locally using a data encryption key (DEK).
  • A new DEK is generated locally with every secret write.
  • DEK generation is based on a key derivation algorithm that uses a seed.
  • The seed is encrypted by the remote encryption service.

I believe this approach could work for us as well, and I can present a similar strategy.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hashing does provide an extra layer of security in the fact that the original secret for a client can not be retrieved again through REST APIs. In general I think any secrets stored in a vault (or hashed or whatever) should not be readable through REST APIs.


> **TODO**: Q: Should Keycloak include default provider for Encryption SPI?
>
> Some possible alternatives:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having an SPI without an implementation is not very elegant; it would more or less make it useless to most folks. Additionally it would be impossible to test fully, as well as do any sort of benchmarking, which really would be required for something like this.

HashiCorp is a defacto choice, but less so in the open source now that it has changed its license. So, not really sure what a default implementation (or implementations) would be.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on your other comment we could in theory have two implementations; one that stores in DB directly as encrypted values (that should of course be able to receive its own encryption keys from a secret, or from HashiCorp, etc.); and another that stores in an external vault like HashiCorp.

* **Status**: Draft #1
* **Github**: TODO

## Motivation
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to see this proposal come with some details on best practices, and compare with DB encryption

@tsaarni
Copy link
Author

tsaarni commented Sep 20, 2024

Thanks for your response @stianst!

If I read it correctly the intent is to intercept specific values such as client secret, to then not store these in the Keycloak DB, but rather store these in the Encryption SPI provider?

No, what I meant is that the data is stored in a database, so the storage itself remains unchanged. The encryption SPI's role would be limited to encrypting the data before storage and decrypting it after retrieval, but before use.

Ideally, this should also simplify the administrator's task by eliminating the need to manage references or deal with configuring the remote vault (e.g., defining the schema for how individual Keycloak parameters are stored in key-value secret store of HashiCorp Vault).

@stianst
Copy link
Contributor

stianst commented Sep 23, 2024

Thanks for your response @stianst!

If I read it correctly the intent is to intercept specific values such as client secret, to then not store these in the Keycloak DB, but rather store these in the Encryption SPI provider?

No, what I meant is that the data is stored in a database, so the storage itself remains unchanged. The encryption SPI's role would be limited to encrypting the data before storage and decrypting it after retrieval, but before use.

Ideally, this should also simplify the administrator's task by eliminating the need to manage references or deal with configuring the remote vault (e.g., defining the schema for how individual Keycloak parameters are stored in key-value secret store of HashiCorp Vault).

Got it. That brings an interesting problem though when rotating the encryption keys. If the encrypted data is stored in the rows/columns where the secret would otherwise be, it would be more or less impossible to find all the encrypted values in order to re-encrypt them with new keys. An alternative to that may be a provider that uses a separate table to store encrypted values, with the encrypted value and a kid. That would allow supporting a passive key and an active key like we do for token signing for instance. Then the values can be re-encrypted with the new key either lazily or in a background task, and eventually the old key can be removed when all values are updated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants