Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tls-crypt IV collision estimate is too optimistic #9

Open
mfil opened this issue Dec 9, 2024 · 13 comments
Open

tls-crypt IV collision estimate is too optimistic #9

mfil opened this issue Dec 9, 2024 · 13 comments

Comments

@mfil
Copy link

mfil commented Dec 9, 2024

The IVs used in encryption might collide. When two IVs collide, an attacker can learn the xor of the two plaintexts by xor-ing the ciphertexts. This is a serious loss of confidentiality. The IVs are 128-bit, so when HMAC-SHA256 is a secure PRF (an assumption that must also hold for TLS), and we use the same uniqueness requirement from [nistgcm], this limits the total amount of control channel messages for all peers in the setup to 2^48. Assuming a large setup of 2^16 (65536) clients, and a (conservative) number of 2^16 control channel packets per connection on average, this means that clients may set up 2^16 connections on average.

It's correct that if you select 2^48 128-bit IVs at random, the risk of a collision is only 2^-32. But in CTR mode, IVs that are close to each other can also leak information. Let's say we have an AES key k, IV and IV' = IV + 1. Then the key stream from IV is

K0 = AES(k, IV), K1 = AES(k, IV+1), K2 = AES(k, IV+2), ...

and from IV'

K'0 = AES(k, IV+1), K'1 = AES(k, IV+2), K'2 = AES(k, IV+3), ...

If we're encrypting messages m = m0 m1 m2 ... and m' = m'0 m'1 m'2 ... with these two IVs, then we're leaking m1 XOR m'0, m2 XOR m'1, ...

To fix the analysis, we need some upper bound on the length of control channel messages.

(Note that this kind of consideration is not relevant for AES-GCM because the IV is 96 bits, leaving the lower 32 bits for the counter.)

@schwabe
Copy link
Collaborator

schwabe commented Dec 12, 2024

Yeah, I think that is something that @syzzer overlooked (and others like me that reviewed the initial tls-crypt). We have an absolute maximum of

#define TLS_CHANNEL_BUF_SIZE 2048 for control channel messages. Real messages are shorter but I don't think using 2^10 vs 2^12 for the message length makes a big difference for the analysis.

But yeah, this collision is not great. Without the "overlapping" IV we "only" leak that two messages are identical. But with this we actually leak xor of two plaintexts. It feels like we should have also only used like 96-120 bits from the HMAC for the IV and use the last few bits for the a counter of the packets, so we could have avoided the possiblity of close enough IVs that the cipherstream overlaps.

@syzzer
Copy link
Member

syzzer commented Dec 12, 2024

Very good point, I did overlook that. Of course, the worst that can happen is that we leak some information about the TLS handshake (which would be plaintext w/o tls-crypt anyway), but still.

Since senders can freely pick their IV format without breaking compatibility, the simplest fix would probably be to force the last 8 IV bits to zero, effectively reducing the IV length to 120 bits. Or maybe even 16 bits (reducing the IV to 112 bits), to make it a multiple of 8 and leave some room for possible future increases in control channel msg size.

@schwabe
Copy link
Collaborator

schwabe commented Dec 13, 2024

I don't think the senders can freely pick their IV format. Since the IV doubles as the checksum, a receiver will throw away the packet if it doesn't match.

@mfil
Copy link
Author

mfil commented Dec 13, 2024

Can't we just redo the analysis? If a given IV is picked, it burns the interval [IV - 2^12, IV + 2^12], so we have a "collision" if the next IV lands in that interval. I don't know how annoying that calculation would get.

@mfil
Copy link
Author

mfil commented Dec 13, 2024

So we pick IV0. Then we pick IV1. The chance of a collision is 2^13 / 2^128 = 2^-115. We pick IV2. Being pessimistic and assuming that the intervals around IV0 and IV1 don't overlap, the collision chance is 2^14 / 2^128.

If we pick n IVs, the chance of not having a collision is

(1 - 2^-115) * (1 - 2 * 2^-115) * (1 - 3 * 2^-115) * ... * (1 - (n-1) * 2^-115)

So effectively we're losing 13 bits of collision resistance.

For small x, (1 - x) is approximately exp(x). So our chance of not having a collision is

exp(2^-115 * (1 + 2 + ... + n-1)) = exp(n * (n-1) * 2^-116) = 1 - n^2 / 2^116 (approximately)

If n = 2^k, the collision risk is 2^(2k - 116). If we accept a risk of at most 2^-32, we need to keep k <= 42. Of course 42 is the answer!

@syzzer
Copy link
Member

syzzer commented Dec 14, 2024

I don't think the senders can freely pick their IV format. Since the IV doubles as the checksum, a receiver will throw away the packet if it doesn't match.

Ah, you're right of course, this is the SIV construction. (Though technically part of the MAC doubles as IV.)

The tls-crypt format does not seem to leave room to signal a new format. (This seems to be from before I learned to always include some reserved bits or version field...) Though if we want to fix this, we might be able use the key file or some bits of the packet ID to signal a format where we force the last n bits of the IV to be zero. I didn't fully think this through yet.

@mfil
Copy link
Author

mfil commented Dec 14, 2024

Not sure if this is worth it to change. The 2^48 messages we're currently claiming aren't possible with a risk of at most 2^-32. Setting the last 12 bits to 0, or just leaving everything as it is both amounts to about 2^42 messages with a collision risk of 2^-32.

@syzzer
Copy link
Member

syzzer commented Dec 15, 2024

It might very well not be worth the change. Thanks for doing the analysis.

tls-crypt has been a best-effort measure from the start, with the primary objective to "do something as easily usable as tls-auth, that reasonably hides the certificate and key exchange". For larger setups, tls-crypt-v2 is available, which has much better bounds because the key is not shared amongst clients.

Even for tls-crypt I'm not too worried about the collision risk itself, I'm just disappointed that we can't claim that "the worst that can happen is that an attacker learns that two messages were identical", but instead can leak the XOR of the plaintext. Which is why I found it useful to at least explore the options to resolve that. If a simple fix was available, I'd prefer to fix it. Though it seems you are right that this is not worth the change.

@mfil
Copy link
Author

mfil commented Dec 15, 2024

I'm just disappointed that we can't claim that "the worst that can happen is that an attacker learns that two messages were identical", but instead can leak the XOR of the plaintext.

That's unavoidable if you use random IVs.

@mfil
Copy link
Author

mfil commented Dec 17, 2024

I made a few stupid mistakes that made my analysis excessively pessimistic. If I understand correctly, each message is at most 2048 bytes long. (Usually less, but it shouldn't matter much.) I've been acting as if the messages are up to 2^12 = 4096 blocks long, but that's nonsense.

2048 bytes are 128 AES-blocks. So really we have a collision if IV' is in the interval [IV - 2^7, IV + 2^7].

Doing the analysis in the same style as before, if we send n messages, we have a collision chance of n^2 / 2^120. For n = 2^k, it's 2^(2k - 120). To keep it below 2^-32, we need k <= 44.

@schwabe
Copy link
Collaborator

schwabe commented Dec 17, 2024

The question is. Do you want to make a PR to update the analysis in the RFC or should I make one based on your writing?

@mfil
Copy link
Author

mfil commented Dec 17, 2024

I was planning to make one. I would like it if someone could double-check my calculation, it's based on the Birthday problem wikipedia article.

@mfil
Copy link
Author

mfil commented Dec 18, 2024

Another thing I talked about with @syzzer yesterday, we're not gonna worry about key wear-out like we do for GCM in the data channel, right? The bounds for CTR mode are about the same.

If we do want to worry about this sort of thing, I suggest moving to chacha-poly if we ever make tls-crypt-v3.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants