Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tweak counters for Yao garbled gates could improve. #41

Open
samee opened this issue Sep 12, 2017 · 8 comments
Open

Tweak counters for Yao garbled gates could improve. #41

samee opened this issue Sep 12, 2017 · 8 comments

Comments

@samee
Copy link
Owner

samee commented Sep 12, 2017

Jack Doerner brought up this issue in a related commit about OT counters:
https://github.com/jackdoerner/obliv-c/commit/2f6e99467a08e3d34c7f80a7e0873b768f347702#commitcomment-24208666

TL;DR: currently each thread uses a counter sequence from a random start offset. While this is usually enough, there are pathological cases where counter values (64-bit) could collide, causing a security problem. We should find something better.

Code reference: the variable k in yaoHalfMask2: https://github.com/samee/obliv-c/blob/obliv-c/src/ext/oblivc/obliv_bits.c#L658

@samee samee self-assigned this Sep 12, 2017
@samee
Copy link
Owner Author

samee commented Sep 12, 2017

@jackdoerner

@samee samee removed their assignment Sep 12, 2017
@jackdoerner
Copy link
Contributor

RE: Previous comment chain, for the record:

when you consider code that uses many threads over its lifetime, but each only for a short duration

That's just bad code, at least from a performance standpoint :) The standard solution to that (not just in secure programs) is to have a "thread pool" that is reused from time to time.

Yes, I certainly hope nothing I write spawns 2^32 threads. But as you observed there are also some bad security implications and I feel like we should do our best to handle those gracefully, just in case. It's worth noting that if you're not very careful, it's possible to write OpenMP code that spawns a huge number of threads over its lifetime. The correct programmer response to this is probably not to split the protocol afresh each time it happens though.

  • Have each thread use its own ypd->fixedKeyCipher, each with its own random key that both parties know about. This is probably the simplest solution.
  • Have a central thread distributing counter ranges, with larger ranges for the thread that keeps on using it up. This sounds needlessly complicated to me.

I agree with your assessment here. I'll start working.

@jackdoerner
Copy link
Contributor

My understanding of line 676 in obliv_bits.c:

for(i=0;i<sizeof(k);++i) for(j=0;j<2;++j) buf[i+j*blen]^=((k>>8*i)&0xff);

Is that you're filling a 128 bit buffer with two copies of the 64 bit gate count. The simple way would seem to me to fill the first 64 bits with a protocol count. This should be trivial to implement. Thoughts? Note that protocol counts will have to be allocated using atomics in order to avoid collisions.

@samee
Copy link
Owner Author

samee commented Sep 13, 2017

Ah, no actually. This is an example of a micro-optimization that helped, but made the code harder to read.

Yes, it is filling up the buffer with two copies of a 64-bit counter. But the buffer is not 128 bits ... it's 2*FIXED_KEY_BLOCKLEN = 256 bits. The idea was to use a single invocation of gcry_cipher_encrypt for computing two hashes, because for some reason that was providing a sharp speedup. The trick did not carry over to larger number of hashes, for reasons that was never clear to me.

In this fixedKeyCipher has been configured in ECB mode, so there is no internal counter in AES here, we have eliminated it, and are maintaining the counter outside.

@jackdoerner
Copy link
Contributor

Ahh, thanks for the clarification. My logic should still hold, yes? I assume that the higher half of the counter is always 0 as currently coded? If so, we could put the protocol ID there.

Re: gcry_cipher_encrypt and its speedup for two blocks but not more: maybe it pipelines two blocks at a time, when possible? Intel recommends pipelining calls to the AESNI instructions, but when I read the specs, it was suggestion 4 to 8 blocks simultaneously.

@samee
Copy link
Owner Author

samee commented Sep 13, 2017

It should work. How do you plan to do this? A new thread counter in ypd? And how are we allocating that counter, atomic globals? And please don't use up all the free bits for this. I am okay with capping thread counts at something reasonable, like a #define set to 256. We can always raise it higher if we ever find a use for it.

@jackdoerner
Copy link
Contributor

jackdoerner commented Sep 13, 2017

It should work. How do you plan to do this?

Pretty much just as you described, I think. A cap sounds reasonable, as long as we have a way to eject with an error if someone hits it. We could use atomic globals, but would it be better just to have the first instance of YPD initialize a thread counter, and increment it with each instance split from that one or its children? I guess the real question is, are there security implications to ending a yao protocol and starting another (for the sake of argument, we'll say you do this 2^32 times)? If yes, I guess globals make sense.

@jackdoerner
Copy link
Contributor

Here is a preliminary version. Let me know if you have thoughts.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants