KKRT PSI Implementation (#41)
* better benchmark

* added generic sender/receiver testing

* fixed the generator

* fixed n benchmark

* add NPSI testing of sender & receiver

* adjust godoc to say \n for the FromReader versions

* change load factor

* debugged parallel npsi

* documentation fixes

* new version of util.Exhaust

* use value struct

* loadfactor 2 seems to be fine

* fix typo

* improve proto naming schemes

* add data flow chart

* fix flow chart

* add ot readme

* fix typo in OT readme,
initial attempt in defining baseOT struct and interfaces

* init files

* embed BaseOt struct in NaorPinkas struct and Simplest struct

* broken Send and Receive function, will use crypto/elliptic package instead of curve25519

* implemented simplest OT, channel shouldn't work yet, and need test

* added tests for ot methods.

* use io.ReadWriter instead of channel for Send and Receive

* debug Read Write

* first version of simplest completed, passed tests

* close msgBus when sender encounters an error

* refactor

* configure test to test on baseCount and on longer messages

* rm random file

* add ristretto implementation of OT, Simplest OT

* improve tests

* Add points.go as a thin API layer wrapping the points struct and the elliptic API
implement NaorPinkas baseOT with both elliptic and ristretto
huge refactor of ot.go and ot_ristretto.go
huge refactor of tests
all tests passing.

* refactor block cipher to its own file
add XOR cipher: H(key, index) ^ plaintext
add cipherMode in OT structs
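The `H(key, index) ^ plaintext` scheme above can be sketched as follows. This is an illustrative stand-in, not the library's implementation: `xorCipher` is a hypothetical name, and SHA-256 substitutes for whatever keyed hash the code actually uses (later commits settle on Blake3). Each keystream block hashes the key together with a running block index, so variable-length messages reuse the key but never the same hash input.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// xorCipher derives keystream block i as H(key || i) and XORs it with the
// plaintext. Because XOR is self-inverse, the same function decrypts.
func xorCipher(key, text []byte) []byte {
	out := make([]byte, len(text))
	var idx uint64
	for off := 0; off < len(text); {
		var ctr [8]byte
		binary.LittleEndian.PutUint64(ctr[:], idx)
		block := sha256.Sum256(append(append([]byte{}, key...), ctr[:]...))
		for i := 0; i < len(block) && off < len(text); i++ {
			out[off] = text[off] ^ block[i]
			off++
		}
		idx++
	}
	return out
}

func main() {
	key := []byte("secret")
	ct := xorCipher(key, []byte("hello kkrt"))
	fmt.Printf("%s\n", xorCipher(key, ct)) // → hello kkrt
}
```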

* xorcipher supports variable length messages, but it recycles the hashed keys

* add benchmark for AES cipher and XOR cipher

* fix typo in err messages

* add time log in testing

* mv ot, cuckoo to internal

* IKNP OT extension (#20)

* N choose 1 KKRT OT extension (#21)

* fix Transpose3D test

* KKRT OPRF (#22)

* Define OPRF interface
* Implement KKRT OPRF
* Improve README and comments

* KKRT PSI (#24)

* parallelize kkrt ot receiver

* new improved kkrt OPRF algorithm done by Justin

* lower back the number of tests

* improved OPRF for kkrtPSI

* clean up

* remove comments

* improve cuckoo
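The cuckoo hashing mentioned throughout these commits can be sketched roughly as below. This is a deliberately simplified illustration, not the `internal/cuckoo` package: real cuckoo hashing tracks which hash index an evicted item was stored under, bounds relocations with a stash, and uses faster hash functions; here `fnv` stands in and the evicted item simply tries its next hash index.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

const nHash = 3 // each item has three candidate buckets

type cuckoo struct {
	buckets [][]byte
}

func newCuckoo(size int) *cuckoo {
	return &cuckoo{buckets: make([][]byte, size)}
}

// bucketIdx seeds the hash with the hash-function index so the three
// candidate buckets for an item are (usually) distinct.
func (c *cuckoo) bucketIdx(item []byte, hIdx byte) int {
	h := fnv.New64a()
	h.Write([]byte{hIdx})
	h.Write(item)
	return int(h.Sum64() % uint64(len(c.buckets)))
}

// insert places the item in an empty candidate bucket; on collision it
// evicts the occupant and reinserts it under another hash index.
func (c *cuckoo) insert(item []byte) bool {
	cur, hIdx := item, byte(0)
	for tries := 0; tries < 100; tries++ {
		i := c.bucketIdx(cur, hIdx)
		if c.buckets[i] == nil {
			c.buckets[i] = cur
			return true
		}
		c.buckets[i], cur = cur, c.buckets[i]
		hIdx = (hIdx + 1) % nHash
	}
	return false // a full implementation would fall back to a stash
}

func main() {
	c := newCuckoo(64)
	fmt.Println(c.insert([]byte("alice"))) // → true
}
```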

* lower back oprf test number

* fix cuckoo test

* improve oprf

* cleanup

* change back to use value instead of pointers for cuckoo (#27)

* change back to use value instead of pointers for cuckoo

* lower test numbers

* buffer sender hashIds channel

* use buffered channels

* remove unnecessary waitgroup in stage 3

* convert cuckoo outputs to channels

* static array of channels

* use empty struct for map values

* inplace bits operations

* bump receiver size in tests

* improve oprf kkrt by removing 3D transpose of matrices

* fix bug in util.ExtractBytesToBits which ignores the last byte
use kkrt oprf instead of the more complicated improvedKKRT

* instantiate aes block and reuse

* reset timer for inplaceXor benchmark

* use gob to send precomputed hash maps

* add dummy cuckoo to avoid allocation for sender

* sender sends hashed encodings right away,
receiver hashes and indexes local encodings with the corresponding ID
and intersects each received hashed encoding

* added mechanism to process a batch of identifiers per goroutine
and launch only runtime.GOMAXPROCS(0) goroutines
instead of a single goroutine processing a single identifier

* inplace operation for PseudorandomCode

* reduce PseudorandomCode allocation by 1

* remove outdated crypto encryption methods
update base OT tests to reflect a real use case scenario,
meaning the number of messages is fixed to 512,
and vary the number of bytes per message (this should be the same as 1.4 * number of messages)
use Naor-Pinkas base OT instead of Simplest
Update elliptic curve point deriveKey function, call point.x.Bytes() instead of points.Marshal()
Add timing information in KKRT OPRF for better performance analysis

* inplace xor id's last byte with hash idx, remove go routines from cuckoo hashtable and stage1

* add PrgWithSeed
ImprovedKKRT seems to fail due to crypto/rand

* reuse oprfInputs in stage3 to avoid reiterate on cuckooHashtable buckets
remove expose cuckooHashTable api

* remove blake2 encrypt decrypt

* remove constant XORBlake2

* use AES-CTR Drbg

* undo a previous commit that broke ImprovedKKRT
with the incoming BitVect implementation, we no longer need to extract 1 byte into 8 bytes, each byte containing 1 bit of information

* implement proper pseudorandom generators
update all ot/oprf/ot-extension that uses prg
remove the hacky PseudorandomGeneratorWithBlake3 which
just uses Blake3.Read()
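A proper seeded pseudorandom generator of the kind these commits describe (the AES-CTR DRBG) can be sketched like this. The name `prgWithSeed` follows the `PrgWithSeed` commit message, but the signature and details are assumptions: the seed keys AES, and reading the CTR keystream over a zero buffer expands it into n deterministic pseudorandom bytes.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"fmt"
)

// prgWithSeed expands a fixed seed into n pseudorandom bytes using AES in
// CTR mode. The same seed always yields the same output, which is exactly
// what the OT extension needs.
func prgWithSeed(seed []byte, n int) ([]byte, error) {
	block, err := aes.NewCipher(seed) // seed must be 16, 24 or 32 bytes
	if err != nil {
		return nil, err
	}
	out := make([]byte, n)
	iv := make([]byte, aes.BlockSize)
	cipher.NewCTR(block, iv).XORKeyStream(out, out)
	return out, nil
}

func main() {
	p, err := prgWithSeed(make([]byte, 16), 8)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(p)) // → 8
}
```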

* update new cuckoo benchmark in readme, which shows a 2x speedup in insert

* cleanup

* rename crypto to cipher

* buffer reader/writer in stage3 for send and receive hashes

* buffer map

* KKRT BitVect (#36)

* Untested blockwise transpose

* Start testing of 512x512 transpose

* Fix 512x512 transpose

* Add ability to pad blocks and combine functions for tall and wide matrices

* Unravel a 2D matrix into 512x512 padded bit blocks

* Raveling and unraveling methods rough but not fully working

* Refactor Unraveling methods

* Fix raveling and unraveling methods.

* Benchmark transpose

* Crude first working concurrent transpose implementation

* Don't export unnecessary functions

* Move random matrix generation to bits.go

* Refactor and improve performance on transpose

* Add variables for testing

* Slice and matrix conversions for Byte to Uint64 along with XOR and AND

* Update naming of conversion functions

* Try to apply BitVect-based transpose to KKRT

* Revert "Try to apply BitVect-based transpose to KKRT"

This reverts commit 37bdd60.

* lower test case for transpose to pass pipeline

* Remove padding support to BitVect

* SBitVect little endian implementation start

* Wrapper function around convert and transpose and convert tests

* Debugging baseOt

* use the correct choice bit

* kkrt runs but fails

* read bits in little endian way

* happy path working, but we really need to figure out what is going on with the secret choice bits

* test with sending only T = 0, d = 1 matrix

* Convert transpose to work Little Endian

* Begin partially unrolling of uint64 portion of transpose

* KKRT with BitVect transpose

* Improved KKRT uses BitVect transpose

* Fix improvedKKRT oprf
fix tests
bitvec version of KKRT working

* Address comments on PR #36

* Address comments from PR #36

* Set k constant to 512 bits in OPRF

Co-authored-by: Justin <[email protected]>

* Modify concurrent transpose to use a number of goroutines equal to number of cores

* remove unnecessary OTs
remove ExtractByteToBits
Convert all remaining OT/OT-extension to use densely packed bytes

* move points.go from internal/ot to internal/crypto

* use uncompressed marshal and unmarshal for points since it's faster
fix all OT/Ot-extension to use only padTil512
fix all tests

* remove findK since no longer needed

* remove old proto

* Apply suggestions from code review

Co-authored-by: Xavier Capaldi <[email protected]>

* Apply suggestions from code review

Co-authored-by: Xavier Capaldi <[email protected]>

* preallocate aesBlock

* simplify allocation in bpsi

* KKRT concurrent bit operations (#39)

* Cleanup XOR operation and write concurrent version

* Add concurrent in place AND operations and clean up

* avoid append after cuckoo hashing

* Clean up KKRT code (#40)

* clean up cuckoo

* add clean util/bit

* remove pad function since it's not needed elsewhere

* Cleanup docstring

* add comments to prg
remove unused functions in util/bits

* changed PseudorandomCode to use all bytes of input, instead of always the first 15 bytes

* add comments for points

* combine baseOT test and OT test

* hide New point interface

* unexpose points in baseOT

* cleanup ot

* Cleanup bit and bitvect

* change how oprf keys are stored and used

* change K to pointK

* remove err in ristretto

* move deriveKeyRistretto to crypto/points

* move ristretto point related functions to crypto/points

* refactor points and ristretto points

* minor fix up

Co-authored-by: Xavier Capaldi <[email protected]>

* copy to a preallocated slice instead of appending

* Revert "avoid append after cuckoo hashing"

This reverts commit 17749f6.

* hide oprfinput details

* change batch size to be correlated to number of cores
each batch contains 42 * 768 bits of inputs

* revert back to constant

* update README

Correct typo in README and update/simplify diagram

* update diagram in readme

* Add printing of memory allocations in KKRT test

* Improve memory and time reporting

* Output memory information to stderr

* Move memory logging

* Print memory info in MiB rather than MB

* Remove append while creating OPRF input

* Remove tmp slice for PseudorandomCode

* Cuckoo hash index now stored as first element in front of item

* Remove padding and encryption of dummy pseudorandom code

* Reuse temporary slice in BitVect transpose

* Concurrent unsafe casting between byte matrix and uint64 matrix

* Clean up casting so default is byte to uint64 cast
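The unsafe byte-to-uint64 cast referred to in these commits can be sketched as below. This is an illustrative version (`bytesToUint64s` is a hypothetical name, and `unsafe.Slice` requires Go 1.17+): the byte slice is reinterpreted in place, with no copy, so bitwise operations can run 64 bits at a time. As the README notes, this layout trick assumes a little-endian x86-64 target.

```go
package main

import (
	"fmt"
	"unsafe"
)

// bytesToUint64s reinterprets b (len must be a multiple of 8) as a []uint64
// sharing the same underlying memory. Writes through either slice are
// visible in the other.
func bytesToUint64s(b []byte) []uint64 {
	if len(b) == 0 {
		return nil
	}
	if len(b)%8 != 0 {
		panic("length must be a multiple of 8")
	}
	return unsafe.Slice((*uint64)(unsafe.Pointer(&b[0])), len(b)/8)
}

func main() {
	b := make([]byte, 16)
	b[0] = 1
	u := bytesToUint64s(b)
	u[1] = 42 // writes through to the same memory as b
	fmt.Println(len(u), u[0], b[8]) // → 2 1 42 on little-endian (x86-64)
}
```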

* Revert densely encoded cuckoo which is still buggy

* Revert densely encoded cuckoo which is still buggy

* Cleanup unnecessary utility functions for conversion between bytes and uint64s

* Fix PseudorandomCode so it works with 16 byte input

* add timing of cuckoo insert

* Reduce memory usage and improve performance, focusing on stage 2 (#43)

* Stop appending hash index to value

* Test hashing functions (SHA256, FNV1a)

* Store identifiers and bucket lookup indices in cuckoo struct

* Change signature of OPRF to receive Cuckoo directly

* Cleanup

* Pass maps instead of arrays

* Fix PseudorandomCode so it works with 16 byte input

* Append hash index again right before input to PseudorandomCode

* Includes the hash indices in the Cuckoo struct

* Various optimizations to improve performance and reduce memory in stage 2

* Start cleanup

* minor clean up in oprf.KKRT and oprf.ImprovedKKRT
fix oprf tests

* fix PSI tests

* remove print statements and close err channel

* Use unsafe casting to improve performance of bitwise operations

* Clean up bit operations and remove AndByte

* Add error checking

* Apply suggestions from code review

Co-authored-by: Justin Li <[email protected]>

* Corrections from PR review

* Small fixes for PR

Co-authored-by: Justin <[email protected]>

* Address suggestions from review in cipher.go

* Add CipherMode type as suggested in review (#44)

* Add CipherMode type as suggested in review

* Add undefined default value for CipherMode type

* KKRT Cleanup 1 (#45)

* Incremental cleanup 1

* Cleanup bit utilities

* Kkrt cleanup 2 (#46)

* Incremental cleanup 1

* Cleanup bit utilities

* Delete unused function

* KKRT updated benchmarks (#47)

* Update heatmaps for previous benchmarks

* Resize plots

* Update plots

* Fix plot

* Update plots

* Add memory and GC plots

* Update plots again

* Update plots x3

* Add last data

* Split detailed KKRT benchmark into its own file

* Grammar and reduce height of scatter plots

* Add benchmark for varying system threads

* Add Bosko's description of thread results

* remove unused OT, and clarify readme (#48)

* rm ot, and clarify readme

* remove KKRT oprf as well

* reflect new changes to the OPRF readme

* fix golangci-lint errors

* Apply suggestions from code review

Co-authored-by: Xavier Capaldi <[email protected]>

Co-authored-by: Xavier Capaldi <[email protected]>

* Add comment that this is only tested on AMD64, as well as improved concurrent bit op functions

* Fix collision issues for items shorter than 64 bytes and issue with bit operation

* Remove unnecessary copy after unsafe cast

* test PSI with different length inputs (#49)

* test different size

* test the right protocols

* test 8 bytes input as well

* test prints the actual number of bytes being matched (including the 2 bytes from prefix)

* KKRT Proper Pseudorandom Encode and Precompute Hash in Stage 1 (#50)

* No longer send number of OPRFs since it can easily be calculated locally by sender

* Remove allocation for number of OPRFs

* Precompute pseudorandom ids in stage 1 of sender

* Merge upstream changes

* Cleanup and write docstrings

* Fix typo

* Make more concise

* Update docstring

* Update docstring again

* Fix benchmark

* Use GOMAXPROCS(0) rather than NumCPU()

* Split transpose into different functions for wide and tall

* Change pseudorandom code to handle 8 byte output from hash function

* Fully convert to use xxhash

* Bug fix

* Use OneOfOne implementation of xxhash

* Update docstrings

* benchmark other hash functions

* add xxh3 as a new hasher, but it's not as fast as highwayhash for hashing to uint64

* add second murmur3 golang package

* Perform OPRF encode in-place

* Test unsafe casting for output of Murmur

* Use TWMB Murmur3 for hashing in PseudorandomCode

* OPRF encode in-place

* Fix typo

* Use hash index as both seeds and update docstring

* Remove duplicate

* Cleanup

* Remove useless bit tests

* Remove all unused hash functions

* Removed unused functions in PRG

* Cleanup

* Tidy go mod

* Cleanup and add tests and benchmarks for bit utils

* Panic on error in encode and hash

Co-authored-by: Justin <[email protected]>

* Move unused testing functions into bitvect test file

* Update benchmarks

* KKRT Purge (#51)

* Naor-Pinkas only base OT

* XOR cipher with Blake3 is only cipher mode

* Remove unnecessary function

* Use P256 as only elliptic curve

* Remove other curve (P256)

* Remove ristretto points

* Address suggestions in code review #1

* Address review comments #2

* Address review comments #3

* Correct endianness of PseudorandomCode

* Update Readme to specify that it is only compatible with x86-64

* Address review comments #5

* Adjust tests

* Remove Encrypt and Decrypt functions and instead just use XorCipherWithBlake3

* refactor internal/crypto/point.go and its tests

* define Equal for points

* Add static tests for PseudorandomCode and Encryption/Decryption

* Add encryption followed by decryption test

* add kkrt psi description in main README

* Replace all usage of math/rand with crypto/rand

* Clean up tests

* use logger in kkrt (#53)

* Add log to PSI stages (#42)

* add logging

* /s/Finish/Finished/g

* remove testing logs

Co-authored-by: Xavier Capaldi <[email protected]>

* add log to indicate verbosity will default to 0 with values outside of [0, 2]

* add exitOnErr function

* remove logr package, embed logger in sender/receiver struct

* do not embed logger in sender and receiver
update README

* fix newline

* reflect changes to README

Co-authored-by: Xavier Capaldi <[email protected]>

* add logging to kkrt
report memory stats with log.V(2)

* format memory

* fetch logger with FromContextOrDiscard

* rm comments

* Update README.md

* Update examples/receiver/main.go

* Update pkg/bpsi/sender.go

Co-authored-by: Xavier Capaldi <[email protected]>

* KKRT Cuckoo Update (#54)

* Check errors and use a CuckooHasher instead of a DummyCuckoo

* Track item index in Cuckoo and avoid Cuckoo channels

* Apply suggestions from code review

Co-authored-by: Justin Li <[email protected]>

* Address comments from review

Co-authored-by: Justin Li <[email protected]>

* fix pipeline

* Fix cuckoo benchmarks

* Remove colon from log

* Make logging of memory and gc calls consistent for benchmark parsing

* KKRT Metro Hash (#56)

* Test Metro Hash

* Test another Metro Hash implementation

* Test City Hash from Google

* Swap Highway Hash for Metro Hash

* Update docstring

* Specify that 'm' stands for 'million' in the benchmarks

* address stylistic comments

* New metro hash behavior

* use go-metro hasher
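The role these hashers (xxhash, murmur3, highwayhash, metro) play is mapping an identifier to a fixed 64-bit value for fast comparison and bucketing. A dependency-free sketch using the standard library's FNV-1a as a stand-in for go-metro — the function name and seeding scheme are illustrative, not the library's:

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/fnv"
)

// hashToUint64 maps an identifier to a 64-bit digest, mixing in a seed so
// different hash tables see independent mappings. fnv stands in for the
// much faster non-cryptographic hashes benchmarked above.
func hashToUint64(seed uint64, id []byte) uint64 {
	h := fnv.New64a()
	var s [8]byte
	binary.LittleEndian.PutUint64(s[:], seed)
	h.Write(s[:])
	h.Write(id)
	return h.Sum64()
}

func main() {
	same := hashToUint64(42, []byte("alice")) == hashToUint64(42, []byte("alice"))
	fmt.Println(same) // → true
}
```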

* remove oprf interface, and rename improvedKKRT to simply OPRF

* address OT comments

* move declaration of overwritable variables inside loops in NaorPinkas OT

* annotate all NaorPinkas errors

* add IsBitSet and BitExtract helper methods

* share format function calls between examples/sender and examples/receiver

* remove error from getBlake3Hash since blake3 does not return errors

* Remove references to improved KKRT

* Rename struct to indicate it holds the encoded input

* Remove testing of unsafeslice conversions

* Transpose determines number of workers internally

* Number workers internal to wide transpose function

* Divide blocks among workers in transpose

* Pass BitVect blocks via pointers to reuse

The unraveling and reraveling functions should now operate on BitVect
pointers. BitVects can be instantiated once per worker in the transpose
method and then reused for all blocks. This improves performance.

* address comments on oprf

* remove error handling in go routine, and panic instead

* move SampleRandomOTMessages to oprf

* simplify testing with net.Pipe()
fix ot bugs
remove error from instantiating a new OT or a new OPRF

* Pass OPRF encoded inputs with hasher via struct

Rather than passing each encoded input into a buffered channel and
the hasher via its own channel, we combine them into a new struct.
The struct contains the full slice of OPRF encoded inputs along
with the hasher. Rather than using an error channel, we panic, as
the only possible error is due to programmer error in generating
the seed for hashing or AES encoding.

* Simplify concurrency in encodeAndHash

Remove complex job structure and replicate goroutine model used
in the rest of the library for transpose and bit operations.

* KKRT Property-based testing for bit operations (#58)

* Use property-based tests on bit operations

For all bit operations, test that the fast or concurrent versions
return identical results to the naive implementation.

For the fundamental operations (AND and XOR), test their properties
as well.

XOR:
- Commutative   A ^ B = B ^ A
- Associative   A ^ (B ^ C) = (A ^ B) ^ C
- Identity      A ^ 0 = A
- Self-inverse  A ^ A = 0

AND:
- Annulment     A & 0 = 0
- Commutative   A & B = B & A
- Associative   A & (B & C) = (A & B) & C
- Identity      A & 1 = A
- Idempotent    A & A = A

* Remove single-threaded transpose benchmark

* KKRT panic on bit errors (#59)

* Panic for non-equal length input to bit ops

In the context of our use, the input to bit operations should
always have equal length. Instead of returning an error, we will
panic.

* NewCuckooHasher panics instead of returning error

* Avoid panic by initializing AES outside goroutine

* Panic when trying to get item at index greater than number of items

* Simplify EncodesRead so it can be used in loop

* Use const in tests

* Update comments in OPRF

* sequential key generation and encoding on the sender side, and key generation and decoding on the receiver side

* Consolidate Pad and padBitMap into a single function

* Range over inputs rather than channel

* give more descriptive names to oprf sender

* Begin amortization in stage 3

* Remove unnecessary error checking in KKRTPSI

* Use ErrGroup to handle err in ParallelEncodeAndHash

* Remove comment

* Amortize by batching encode and hash

* Update comments and small details

* /s/Keys/Key/g

* /s/msgLen/msgLens/g
/s/baseMsgLen/baseMsgLens/g
/s/sk/secretKey in oprf

* panic on wrong input for cuckoo.GetBucket
/s/oprfEncoding/oprfEncodings/g

* /s/EncodesRead/EncodingsRead/g
/s/EncodesWrite/EncodingsWrite/g

* add PadBitMap

* cosmetic changes in kkrt sender

* remove polymorphic New in hasher

* return Hasher

* instantiate AESBlock outside of goroutine to avoid panic

* receiver deduplicate intersected items

* reworked sender batch encodeAndHash utility

* Clean up encode and hashing in stage 3 sender

* Pass inputToOprfEncode by pointer

* Small grammar

* move for loop in goroutine

* Address comments on PR

* fix merge artifacts

* Remove comment and add two benchmark plots

* Clean up comments

* Updated benchmarks

* update comments

* Fix axis labels

* Golint

Co-authored-by: Dominic Gregoire <[email protected]>
Co-authored-by: Xavier Capaldi <[email protected]>
Co-authored-by: Xavier Capaldi <[email protected]>
4 people authored Dec 7, 2021
1 parent 545c3de commit b236d33
Showing 73 changed files with 4,141 additions and 448 deletions.
19 changes: 12 additions & 7 deletions README.md
@@ -1,30 +1,35 @@
# match
[![CircleCI](https://circleci.com/gh/Optable/match/tree/main.svg?style=svg)](https://circleci.com/gh/Optable/match/tree/main)
[![Go Report Card](https://goreportcard.com/badge/github.com/optable/match)](https://goreportcard.com/report/github.com/optable/match)
[![GoDoc](https://godoc.org/github.com/optable/match?status.svg)](https://godoc.org/github.com/optable/match)

An open-source set intersection protocols library written in golang.
An open-source set intersection protocols library written in golang. Currently only compatible with **x86-64**.

The goal of the match library is to provide production level implementations of various set intersection protocols. Protocols will typically tradeoff security for performance. For example, a private set intersection (PSI) protocol provides cryptographic guarantees to participants concerning their private and non-intersecting data records, and is suitable for scenarios where participants trust each other to be honest in adhering to the protocol, but still want to protect their private data while performing the intersection operation.

The standard match operation under consideration involves a *sender* and a *receiver*. The sender performs an intersection match with a receiver, such that the receiver learns the result of the intersection, and the sender learns nothing. Protocols such as PSI allow the sender and the receiver to protect, to varying degrees of security guarantees and without a trusted third-party, the private data records that are used as inputs in performing the intersection match.
The standard match operation under consideration involves a *sender* and a *receiver*. The sender performs an intersection match with a receiver, such that the receiver learns the result of the intersection, and the sender learns nothing. Protocols such as PSI allow the sender and receiver to protect, to varying degrees of security guarantees and without a trusted third-party, the private data records that are used as inputs in performing the intersection match.

The protocols that are currently provided by the match library are listed below, along with an overview of their characteristics.

## dhpsi

Diffie-Hellman based PSI (DH-based PSI) is an implementation of private set intersection. It provides strong protections to participants regarding their non-intersecting data records. See documentation [here](pkg/dhpsi/README.md).
Diffie-Hellman based PSI (DH-based PSI) is an implementation of private set intersection. It provides strong protections to participants regarding their non-intersecting data records. Documentation located [here](pkg/dhpsi/README.md).

## npsi

The naive, [highway hash](https://github.com/google/highwayhash) based PSI: an *insecure* but fast solution for PSI. Documentation located [here](pkg/npsi/README.md).
The naive, [MetroHash](http://www.jandrewrogers.com/2015/05/27/metrohash/) based PSI: an *insecure* but fast solution for PSI. Documentation located [here](pkg/npsi/README.md).

## bpsi

The [bloomfilter](https://en.wikipedia.org/wiki/Bloom_filter) based PSI: an *insecure* but fast with lower communication overhead than [npsi](pkg/npsi/README.md) solution for PSI. Take a look [here](pkg/bpsi/README.md) to consult the documentation.
The [bloomfilter](https://en.wikipedia.org/wiki/Bloom_filter) based PSI: an *insecure* but fast solution for PSI with lower communication overhead than [npsi](pkg/npsi/README.md). Documentation located [here](pkg/bpsi/README.md).

## kkrtpsi

Similar to the dhpsi protocol, the KKRT PSI, also known as the Batched-OPRF PSI, is a semi-honest secure PSI protocol that has significantly less computation cost, but requires more network communication. An extensive description of the protocol is available [here](pkg/kkrtpsi/README.md).

## logging

[logr](https://github.com/go-logr/logr) is used internally for logging, which accepts a `logr.Logger` object. See the [documentation](https://github.com/go-logr/logr#implementations-non-exhaustive) on `logr` for various concrete implementation of logging api. Example implementation of match sender and receiver uses [stdr](https://github.com/go-logr/stdr) which logs to `os.Stderr`.
[logr](https://github.com/go-logr/logr) is used internally for logging, which accepts a `logr.Logger` object. See the [documentation](https://github.com/go-logr/logr#implementations-non-exhaustive) on `logr` for various concrete implementations of the logging API. The example implementations of the match sender and receiver use [stdr](https://github.com/go-logr/stdr), which logs to `os.Stderr`.

### pass logger to sender or receiver
To pass a logger to a sender or a receiver, create a new context with the parent context and `logr.Logger` object as follows
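A minimal sketch of that wiring, using the real `logr`/`stdr` APIs the README names (`logr.NewContext` to attach the logger, `logr.FromContextOrDiscard` to recover it, as the commit messages describe); the log message and verbosity level are illustrative:

```go
package main

import (
	"context"
	"log"
	"os"

	"github.com/go-logr/logr"
	"github.com/go-logr/stdr"
)

func main() {
	stdr.SetVerbosity(1)
	logger := stdr.New(log.New(os.Stderr, "", log.LstdFlags))

	// Attach the logger to the context passed to the sender/receiver.
	ctx := logr.NewContext(context.Background(), logger)

	// Internally, the library recovers it (or a no-op logger) with:
	l := logr.FromContextOrDiscard(ctx)
	l.V(1).Info("starting PSI stage", "stage", 1)
}
```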
@@ -71,7 +76,7 @@ $go run examples/sender/main.go -proto dhpsi -v 1

# testing

A complete test suite for all PSIs is present [here](test/psi). Don't hesitate to take a look and help us improve the quality of the testing by reporting problems and observations!
A complete test suite for all PSIs is present [here](test/psi). Don't hesitate to take a look and help us improve the quality of the testing by reporting problems and observations! The PSIs have only been tested on **x86-64**.

# benchmarks

32 changes: 32 additions & 0 deletions benchmark/KKRT.md
@@ -0,0 +1,32 @@
# KKRT Benchmarks

## Runtime with varying system threads
This heatmap compares runtimes when the sender and receiver have been limited to a set number of system threads (on an n2-standard-64 VM). Both sender and receiver have 100m (million) records with an intersection size of 50m. The receiver's datasets are represented row-wise while the sender's datasets are represented column-wise.

<p align="center">
<img src="heatmap_kkrt_procs.png"/>
</p>

As shown in the above performance results where the number of system threads is increased, there is up to a 15% improvement in performance from the sender's perspective, but very little effect on the receiver. Additionally, as the number of system threads is increased beyond approximately 8, there is a slight *degradation* in performance. Since KKRT does not benefit much from multi-thread parallelism, we recommend sizing your hardware primarily according to the memory requirements (see below).

## Memory
These heatmaps compare memory usage when sender and receiver use the same type of VM (n2-standard-64) but have differing number of records (50m, 100m, 200m, 300m, 400m and 500m). The receiver's datasets are represented row-wise while the sender's datasets are represented column-wise. All match attempts performed have an intersection size of 50m.

<p align="center">
<img src="heatmap_kkrt_sen_mem.png"/>
</p>

<p align="center">
<img src="heatmap_kkrt_rec_mem.png"/>
</p>

## GC calls
These heatmaps compare number of garbage collector calls when sender and receiver use the same type of VM (n2-standard-64) but have differing number of records (50m, 100m, 200m, 300m, 400m and 500m). The receiver's datasets are represented row-wise while the sender's datasets are represented column-wise. All match attempts performed have an intersection size of 50m.

<p align="center">
<img src="heatmap_kkrt_sen_gc.png"/>
</p>

<p align="center">
<img src="heatmap_kkrt_rec_gc.png"/>
</p>
16 changes: 8 additions & 8 deletions benchmark/README.md
@@ -1,16 +1,12 @@
# Benchmarks

The following scatter plots show the results of benchmarking match attempts using different PSI algorithms on Google Cloud n2-standard-64 [general-purpose virtual machines (VMs)](https://cloud.google.com/compute/docs/general-purpose-machines#n2-standard). For each benchmark, the sender and the receiver use same type of VMs. In the first plot, the receiver has 100m records while the sender has varying datasets of 50m, 100m, 200m, 300m, 400m and 500m records where as in the second plot, the sender's dataset is of 100m records while the receiver has varying datasets of 50m, 100m, 200m, 300m, 400m and 500m records. The BPSI used for these experiments has a false positive rate fixed at 1e-6. Also, all the match attempts performed have an intersection size of 50m.
The following scatter plot shows the results of benchmarking match attempts using different PSI algorithms on Google Cloud n2-standard-64 [general-purpose virtual machines (VMs)](https://cloud.google.com/compute/docs/general-purpose-machines#n2_machines). For each benchmark, the sender and the receiver use the same type of VM. The plot shows runtime for various PSI algorithms when the sender and receiver have an equal number of records. The BPSI used for these experiments has a false positive rate fixed at 1e-6. All the match attempts performed have an intersection size of 50m (million). [Detailed benchmarks of the KKRT protocol can be found here](KKRT.md).

<p align="center">
<img src="scatter_fixed_receiver.png"/>
<img src="scatter_equal_sets.png"/>
</p>

<p align="center">
<img src="scatter_plot_sender_fixed.png"/>
</p>

The results for match attempts using different PSI algorithms are provided below. Both sender and receiver used n2-standard-64 VMs with datasets containing 50m, 100m, 200m, 300m, 400m and 500m records. The receiver's datasets are represented row-wise while the sender's datasets are represented column-wise.
The runtimes for match attempts using different PSI algorithms are provided below. Both sender and receiver used n2-standard-64 VMs with datasets containing 50m, 100m, 200m, 300m, 400m and 500m records. The receiver's datasets are represented row-wise while the sender's datasets are represented column-wise.

<p align="center">
<img src="heatmap_bpsi.png"/>
@@ -21,5 +17,9 @@ The results for match attempts using different PSI algorithms are provided below
</p>

<p align="center">
<img src="heatmap_dhpsi.png"/>
<img src="heatmap_kkrt.png"/>
</p>

<p align="center">
<img src="heatmap_dhpsi.png"/>
</p>
Binary file modified benchmark/heatmap_bpsi.png
Binary file modified benchmark/heatmap_dhpsi.png
Binary file added benchmark/heatmap_kkrt.png
Binary file added benchmark/heatmap_kkrt_procs.png
Binary file added benchmark/heatmap_kkrt_rec_gc.png
Binary file added benchmark/heatmap_kkrt_rec_mem.png
Binary file added benchmark/heatmap_kkrt_sen_gc.png
Binary file added benchmark/heatmap_kkrt_sen_mem.png
Binary file modified benchmark/heatmap_npsi.png
Binary file added benchmark/scatter_equal_sets.png
Binary file removed benchmark/scatter_fixed_receiver.png
Binary file removed benchmark/scatter_plot_sender_fixed.png
4 changes: 2 additions & 2 deletions examples/README.md

The standard match operation involves a *sender* and a *receiver*. The sender performs an intersection match with a receiver, such that the receiver learns the result of the intersection and the sender learns nothing. Protocols such as PSI allow the sender and receiver, with varying degrees of security guarantees and without a trusted third party, to protect the private data records used as inputs to the intersection match.

The examples support kkrt, dhpsi, npsi and bpsi: the protocol can be selected with the *-proto* argument. Note that *npsi* is the default.

## 1. generate some data
`go run generate.go`

This will create two files, `sender-ids.txt` and `receiver-ids.txt` with 100 *IDs* in common between them. You can confirm the commonality by running:

`comm -12 <(sort sender-ids.txt) <(sort receiver-ids.txt) | wc -l`

49 changes: 49 additions & 0 deletions examples/format/format.go
package format

import (
"math"
"os"
"runtime"

"github.com/go-logr/logr"
"github.com/go-logr/stdr"
)

// GetLogger returns a stdr.Logger that implements the logr.Logger interface
// and sets the verbosity of the returned logger.
// Set v to 0 for info level messages,
// 1 for debug messages and 2 for trace level messages.
// Any other verbosity level defaults to 0.
func GetLogger(v int) logr.Logger {
logger := stdr.New(nil)
// bound check
if v > 2 || v < 0 {
v = 0
logger.Info("Invalid verbosity, setting logger to display info level messages only.")
}
stdr.SetVerbosity(v)

return logger
}

// ShowUsageAndExit displays the usage message to stdout and exits
func ShowUsageAndExit(usage func(), exitcode int) {
usage()
os.Exit(exitcode)
}

// MemUsageToStdErr logs the total PSI memory usage and the number of garbage collector calls
func MemUsageToStdErr(logger logr.Logger) {
var m runtime.MemStats
runtime.ReadMemStats(&m) // https://cs.opensource.google/go/go/+/go1.17.1:src/runtime/mstats.go;l=107
logger.V(1).Info("Final stats", "total memory (GiB)", math.Round(float64(m.Sys)*100/(1024*1024*1024))/100)
logger.V(1).Info("Final stats", "garbage collector calls", m.NumGC)
}

// ExitOnErr logs the error and exits if err is not nil
func ExitOnErr(logger logr.Logger, err error, msg string) {
if err != nil {
logger.Error(err, msg)
os.Exit(1)
}
}
4 changes: 2 additions & 2 deletions examples/generate.go
func main() {
var ws sync.WaitGroup
fmt.Printf("generating %d sender(s) and %d receiver(s) IDs with %d in common\r\n", senderCardinality, receiverCardinality, commonCardinality)
// make the common part
common := emails.Common(commonCardinality, emails.HashLen)
// do advertisers & publishers in parallel
ws.Add(2)
go output(senderFileName, common, senderCardinality-commonCardinality, &ws)
func output(filename string, common []byte, n int, ws *sync.WaitGroup) {
if f, err := os.Create(filename); err == nil {
defer f.Close()
// exhaust out
for matchable := range emails.Mix(common, n, emails.HashLen) {
// add \n
out := append(matchable, "\n"...)
// and write it
64 changes: 19 additions & 45 deletions examples/receiver/main.go
import (
"time"

"github.com/go-logr/logr"
"github.com/optable/match/examples/format"
"github.com/optable/match/internal/util"
"github.com/optable/match/pkg/psi"
)
func usage() {
flag.PrintDefaults()
}


var out *string

func main() {
var wg sync.WaitGroup
var protocol = flag.String("proto", defaultProtocol, "the psi protocol (bpsi,npsi,dhpsi,kkrt)")
var port = flag.String("p", defaultPort, "The receiver port")
var file = flag.String("in", defaultSenderFileName, "A list of IDs terminated with a newline")
out = flag.String("out", defaultCommonFileName, "A list of IDs that intersect between the receiver and the sender")
flag.Parse()

if *showHelp {
format.ShowUsageAndExit(usage, 0)
}

// validate protocol
psiType = psi.ProtocolNPSI
case "dhpsi":
psiType = psi.ProtocolDHPSI
case "kkrt":
psiType = psi.ProtocolKKRTPSI
default:
psiType = psi.ProtocolUnsupported
}

log.Printf("operating with protocol %s", psiType)
// fetch stdr logger
mlog := format.GetLogger(*verbose)

// open file
f, err := os.Open(*file)
format.ExitOnErr(mlog, err, "failed to open file")
defer f.Close()

// count lines
log.Printf("counting lines in %s", *file)
t := time.Now()
n, err := util.Count(f)
format.ExitOnErr(mlog, err, "failed to count")
log.Printf("that took %v", time.Since(t))
log.Printf("operating on %s with %d IDs", *file, n)

// get a listener
l, err := net.Listen("tcp", *port)
format.ExitOnErr(mlog, err, "failed to listen on tcp port")
log.Printf("receiver listening on %s", *port)
for {
if c, err := l.Accept(); err != nil {
format.ExitOnErr(mlog, err, "failed to accept incoming connection")
} else {
log.Printf("handling sender %s", c.RemoteAddr())
f, err := os.Open(*file)
format.ExitOnErr(mlog, err, "failed to open file")
switch v := c.(type) {
// enable nagle
case *net.TCPConn:
v.SetNoDelay(false)
}

// make the receiver
receiver, err := psi.NewReceiver(psiType, c)
format.ExitOnErr(mlog, err, "failed to create receiver")
// and hand it off
wg.Add(1)
go func() {
func handle(r psi.Receiver, n int64, f io.ReadCloser, ctx context.Context) {
defer f.Close()
ids := util.Exhaust(n, f)
logger := logr.FromContextOrDiscard(ctx)
if i, err := r.Intersect(ctx, n, ids); err != nil {
format.ExitOnErr(logger, err, "intersect failed")
} else {
// write memory usage to stderr
format.MemUsageToStdErr(logger)
// write out to common-ids.txt
log.Printf("intersected %d IDs, writing out to %s", len(i), *out)
if f, err := os.Create(*out); err == nil {
defer f.Close()
for _, id := range i {
// and write it
if _, err := f.Write(append(id, "\n"...)); err != nil {
format.ExitOnErr(logger, err, "failed to write intersected ID to file")
}
}
} else {
format.ExitOnErr(logger, err, "failed to perform PSI")
}
}
}
