
Constant-time EC point multiplication (Montgomery ladder) implementation #325

Open · wants to merge 22 commits into main

Conversation

@ChinoCribioli (Contributor) commented Sep 11, 2024

Description

The mulPointScalar method is implemented with the regular square-and-multiply algorithm, which is prone to timing attacks because the number of EC point additions depends on the number of 1's in the binary representation of the scalar.

This implementation (the Montgomery ladder) avoids that by making the number of additions depend only on the length of the binary representation of the scalar.

As documented in my implementation of the method, the algorithm works because of the following invariant: at each step, R0 will be r_0*base, where r_0 is the prefix of e written in binary, and R1 will be (r_0+1)*base. In other words, at iteration i of the loop, r_0's binary representation will be the first i+1 most significant bits of e. If the upcoming bit is a 0, we just have to add R0 to R1 and double R0 to maintain the invariant. If it is a 1, we have to set R0 to 2*R0 + base (equivalently, R0 + R1, since R1 = (r_0+1)*base) and double R1 to maintain the invariant.
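For illustration, here is a minimal TypeScript sketch of the ladder loop described above. It assumes a fixed 254-bit scalar width and an addPoint helper, and it starts from the identity point as in the later revision of this PR; it is a sketch, not the exact merged code.

```typescript
type Point = [bigint, bigint]
declare function addPoint(p1: Point, p2: Point): Point

function mulPointScalarSketch(base: Point, e: bigint): Point {
  let R0: Point = [0n, 1n] // identity point of the curve
  let R1: Point = base

  // Always visit all 254 bit positions, most significant first, so the
  // number of point additions is independent of the scalar's value.
  for (let i = 253; i >= 0; i -= 1) {
    if (((e >> BigInt(i)) & 1n) === 1n) {
      // Upcoming bit is 1: R0 <- R0 + R1, R1 <- 2*R1.
      R0 = addPoint(R0, R1)
      R1 = addPoint(R1, R1)
    } else {
      // Upcoming bit is 0: R1 <- R0 + R1, R0 <- 2*R0.
      R1 = addPoint(R0, R1)
      R0 = addPoint(R0, R0)
    }
    // Invariant: R0 = r_0*base and R1 = (r_0 + 1)*base, where r_0 is the
    // prefix of e processed so far.
  }

  return R0
}
```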

Related Issue(s)

Fixes #324

Other information

Aside from the new implementation of the mulPointScalar method, I added some tests to automatically check basic behaviors such as EC point additions, EC point multiplications, and the orders of the generating/base points used in the protocol.

These tests are independent of the new implementation of the multiply method and are intended to make the test suite more robust.

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • My changes generate no new warnings
  • I have run yarn style without getting any errors
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

ChinoCribioli and others added 17 commits August 28, 2024 17:08
@artwyman (Collaborator) left a comment

This is cool, @ChinoCribioli. Thanks for taking it on.

I feel underqualified to review the correctness of the cryptographic algorithm, and how well it meets the goal of resistance to timing attacks. Perhaps @cedoor is willing to do so, or can suggest someone to double-check here. I mostly trust the unit tests elsewhere (particularly the compatibility tests in eddsa-poseidon comparing results to circomlibjs) to ensure the results of this calculation are correct.

I do feel competent to comment on some aspects of the code, and potential micro-optimizations, so I've done that here. One thing I'm concerned about is the relative performance of this algorithm compared to the old one. Can you do some testing of the relative speed of this raw operation, and/or higher-level operations which depend on it (like eddsa-poseidon sign/verify)? Just timing operations in a loop and reporting results in the PR would be sufficient here. I don't think this needs an automated performance test in the test suite.
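A rough sketch of the kind of timing loop meant here (illustrative only; it assumes the package's mulPointScalar and Base8 exports):

```typescript
import { Base8, mulPointScalar } from "@zk-kit/baby-jubjub"

// Time repeated scalar multiplications to compare the two implementations.
const iterations = 1000
const scalar = 123456789012345678901234567890n // arbitrary test value

const start = performance.now()
for (let i = 0; i < iterations; i += 1) {
  mulPointScalar(Base8, scalar)
}
const elapsedMs = performance.now() - start

console.log(`${iterations} multiplications took ${elapsedMs.toFixed(1)} ms`)
```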

@@ -65,25 +65,43 @@ export function addPoint(p1: Point<bigint>, p2: Point<bigint>): Point<bigint> {
/**
* Performs a scalar multiplication by starting from the 'base' point and 'adding'
* it to itself 'e' times.
* This works given the following invariant: At each step, R0 will be r_0*base where r_0 is the prefix of e
Collaborator:

I suggest additionally referring to the named algorithm being used here (Montgomery ladder), perhaps with a link to a description of the algorithm (Wikipedia or some other source).

Contributor Author:

Sure! Will do.

return [BigInt(0), BigInt(1)]
}
const eBits: Array<boolean> = []
while (!scalar.isZero(e)) {
Collaborator:

Is the goal constant time? This loop doesn't seem constant time to me, since it'll exit early if high bits of the input are zero. As a result the length of eBits is also variable. If you want constant time, it seems like there has to be a hard-coded max bit length (254 I think, but someone should double-check me) somewhere.

If you wish, I think this loop can probably be skipped entirely with some bigint bitwise operations below. Something like this:

  for (let bitMask = 1n; bitMask < (1n << 254n); bitMask <<= 1n) {
    if ((e & bitMask) !== 0n) {
      // ...

I'm not sure of the relative efficiency, but I'd guess that the bitmasks are cheaper than allocating memory for an array.

Collaborator:

Yes, this sounds correct to me. If e is 7, for example, this will loop 3 times, whereas for something like 4182498124124891489144 it will loop significantly more times.

Hardcoding an iteration count is a good approach, but we'll still have timing vulnerabilities due to the use of the JavaScript BigInt type, which is not safe against timing attacks.

It's also unclear if the scalar.isZero function is constant time.

Contributor Author:

I agree with this point. I replaced the loop with a hardcoded for loop.

while (!scalar.isZero(rem)) {
if (scalar.isOdd(rem)) {
res = addPoint(res, exp)
if (scalar.isZero(e)) {
Collaborator:

This isn't constant time if e is zero. Is that okay?

Collaborator:

I think although the exponent is marked as secret, it may be considered public if it's set to 0. The rationale in my mind is "no private key would ever be selected as the 0 value". That said, I don't think it's a good idea to short circuit here because it breaks the comment above of e being secret in all cases.

Contributor Author:

Yes, I also thought initially that we shouldn't worry too much about the zero case since it is irrelevant cryptographically. But I agree to change this to make the method more consistent.

@cedoor (Member) commented Sep 12, 2024

Thanks @ChinoCribioli and @artwyman!

Perhaps @cedoor is willing to do so, or can suggest someone to double-check here.

I'm also not competent enough to check the logic here. I'll ask other people to review it.

@ChinoCribioli Is this a breaking change?

@chancehudson (Collaborator) left a comment

I'm happy to see discussion about timing vulnerabilities, and a constant-time EC multiplication implementation.

Unfortunately this function is still unsafe because of the use of the JavaScript BigInt type. Internally a BigInt is (likely) represented as the minimum number of bytes for the value. This means any operation we do here will be variable time based on the value of the BigInt.

I've tried to think of how to convert from BigInt to an alternate representation in constant time, but haven't found an approach that seems like it would be safe. Approaches I've considered:

  • use e.toString(2): this will return a binary string, but it will be of variable length based on the size of the value
  • use repeated division to extract little-endian bytes: division is generally not a constant-time operation with respect to the numerator, and it would also be operating on a BigInt, which is not constant time
  • do e + (1n << 254n) to get a constant-size value, then operate on that and discard bits above 254 (sketched below): this could work, and may be the safest approach, but it's unclear whether the addition is constant time (the implementation may have to backfill the bits between the value and 254). It's also unclear whether BigInts are variable time based only on the number of bits in the value, or on something more specific like the number of 1's in the value plus the total number of bits

I think it's a good idea to include this implementation though (after the requested changes) to mitigate timing attacks as much as possible.
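For illustration, the third approach might look like the following sketch (this only fixes the extracted bit count; as noted above, it does not make the underlying BigInt arithmetic constant time):

```typescript
const e = 123456789n // example scalar

// Pad e so every input has the same bit length before extracting bits.
const padded = e + (1n << 254n) // bit 254 is now always set

const eBits: boolean[] = []
for (let i = 0; i < 254; i += 1) {
  // Extract exactly 254 bits; the sentinel bit above position 253 is discarded.
  eBits.push(((padded >> BigInt(i)) & 1n) === 1n)
}
```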


}
const eBits: Array<boolean> = []
while (!scalar.isZero(e)) {
if (scalar.isOdd(e)) {
Collaborator:

Unclear if isOdd is constant time.

}
e = scalar.shiftRight(e, BigInt(1))
Collaborator:

A right shift on a BigInt is almost certainly not constant time.

let R0: Point<bigint> = base
let R1: Point<bigint> = addPoint(base, base)

for (const bit of eBits.slice(0, -1).reverse()) {
@chancehudson (Collaborator) commented Sep 12, 2024:

This slice is a no-op, I think. It may be necessary to invoke the reverse function, though.

Never mind, it returns all elements but the last one.
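For reference, a quick illustration of the slice behavior (assuming eBits stores bits least significant first, as the extraction loop above suggests):

```typescript
const eBits = [true, false, false, true] // bits of 0b1001, LSB first

eBits.slice(0, -1) // [true, false, false]: every element except the last
eBits.slice(0, -1).reverse() // [false, false, true]: remaining bits, MSB first
```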

@artwyman (Collaborator) left a comment

Comments on the latest changes. I'm still curious about relative performance of this implementation vs. the previous one.

packages/baby-jubjub/src/baby-jubjub.ts (resolved thread)

let R0: Point<bigint> = base
let R1: Point<bigint> = addPoint(base, base)
let R0: Point<bigint> = [0n, 1n]
Collaborator:

I see there's a unit test covering multiply-by-zero, which is good to confirm this still behaves the same way as before without the special case.
As a side suggestion, it might be clearer to make this a public const id like the one used in the unit tests, to make it clear that [0n, 1n] is the representation of the zero/identity point on this curve.

Contributor Author:

I like that. It is done.
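For illustration, the suggested naming might look like this hypothetical sketch (Point is the package's point type):

```typescript
// Name the identity element so [0n, 1n] is self-documenting at its uses.
export const id: Point<bigint> = [0n, 1n]

// ...then in mulPointScalar:
let R0: Point<bigint> = id
```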

@ChinoCribioli (Contributor Author):

@chancehudson I just pushed most of the changes you requested. My only remaining doubt about possible timing vulnerabilities lies in the lines e %= order and if (e & mask). Mostly the latter, since I don't know whether evaluating an if condition on an integer is constant time.

@ChinoCribioli (Contributor Author):

Comments on the latest changes. I'm still curious about relative performance of this implementation vs. the previous one.

I've run the tests of the baby-jubjub and eddsa-poseidon packages (which are the main packages affected by this PR) with both implementations and these are the results:

[Screenshots of test timings: babyjubjub_before, babyjubjub_withMyChange, eddsa_before, eddsa_withMyChange]

@chancehudson (Collaborator):

on the lines e %= order and if (e & mask). Mostly the latter, since I don't know whether evaluating an if condition on an integer is constant time.

The if statement should be fine as long as e is always the same size. This branch is specifically designed so that each code path takes equal time (this is the Montgomery ladder strategy). We just need to make sure the condition evaluation is constant time as well, which I suggested a change to try to ensure.

Again though, this is best effort. Regardless, we're going to have some timing leakage from the BigInt implementation.
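For context, the standard branch-free alternative in Montgomery ladder implementations is a conditional swap driven by arithmetic rather than control flow. A sketch (not the change proposed in this PR, and the BigInt arithmetic may still leak timing):

```typescript
type Point = [bigint, bigint]
declare function addPoint(p1: Point, p2: Point): Point

// Conditional swap: bit = 1n exchanges the points, bit = 0n keeps them;
// the same arithmetic executes either way. Since (1n - bit) + bit = 1n and
// one term is always zero, each coordinate is selected exactly.
function cswap(bit: bigint, a: Point, b: Point): [Point, Point] {
  const keep = 1n - bit
  return [
    [keep * a[0] + bit * b[0], keep * a[1] + bit * b[1]],
    [keep * b[0] + bit * a[0], keep * b[1] + bit * a[1]],
  ]
}

// One ladder step with no data-dependent branch.
function ladderStep(R0: Point, R1: Point, bit: bigint): [Point, Point] {
  let [a, b] = cswap(bit, R0, R1)
  b = addPoint(a, b)
  a = addPoint(a, a)
  return cswap(bit, a, b)
}
```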

@artwyman (Collaborator):

I've run the tests of the baby-jubjub and eddsa-poseidon packages (which are the main packages affected by this PR) with both implementations and these are the results:

I think this needs a more direct test of the affected operations (repeated in a loop to avoid fixed overhead and caching effects) to make the differences more clear. It does look to me like the test times are uniformly slower, though, which isn't very pleasing. I submitted an issue requesting to make it faster (which looks like it may be met by a WASM implementation written in Rust). The unit test which targets small values in particular is expected to get slower for a constant-time algorithm, but I assume the eddsa-poseidon tests are operating on more typical values.

I'm not entirely sure of the right trade-off here. Speaking for the Zupass use case, I think we probably care more about performance than timing attacks right now, but that's not a universal choice I expect to apply to everyone.

while (!scalar.isZero(rem)) {
if (scalar.isOdd(rem)) {
res = addPoint(res, exp)
e %= order
Collaborator:

Looking at this again, the original implementation doesn't make any assertions or modifications to this variable. This is an exponent, which is not necessarily a field element, so the reduction should be unnecessary.

I think if we remove this we have a pretty safe function. We just need to look more deeply at the performance hit and decide if it's worth it.

@artwyman (Collaborator):

An exponent doesn't have to be a field element, but the curve group is cyclic with its order, so a value greater than the order is redundant compared to the modular reduction of the same value (either leads to the same output). And the constant-ish-time algorithm below works only if you can fix the number of bits in the input. If we remove the modular reduction, then I think this function becomes incorrect for any value greater than the order. Modular reduction is not the same as cutting off high bits (because the order is not a power of 2).
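A tiny illustration of that last point, with a toy order:

```typescript
// With order = 13 (not a power of 2), dropping high bits and reducing
// modulo the order give different scalars.
const order = 13n
const e = 21n // 0b10101

const reduced = e % order // 8n: the equivalent scalar on the curve
const truncated = e & 0b1111n // 5n: just the low 4 bits

console.log(reduced === truncated) // false
```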

Contributor Author:

Indeed. As Andrew says, some tests break when you remove the reduction, because the case where the exponent passed as input is longer than 254 bits is then handled incorrectly.

Collaborator:

Ah right, because the loop count is hardcoded. I see.

@ChinoCribioli (Contributor Author):

I've run the tests of the baby-jubjub and eddsa-poseidon packages (which are the main packages affected by this PR) with both implementations and these are the results:

I think this needs a more direct test of the affected operations (repeated in a loop to avoid fixed overhead and caching effects) to make the differences more clear. It does look to me like the test times are uniformly slower, though, which isn't very pleasing. I submitted an issue requesting to make it faster (which looks like it may be met by a WASM implementation written in Rust). The unit test which targets small values in particular is expected to get slower for a constant-time algorithm, but I assume the eddsa-poseidon tests are operating on more typical values.

I'm not entirely sure of the right trade-off here. Speaking for the Zupass use case, I think we probably care more about performance than timing attacks right now, but that's not a universal choice I expect to apply to everyone.


I get that for some use cases performance is more important than safety; I honestly don't know which are the main projects that use this library. However, I think that any implementation of a rather complex cryptographic protocol must have a strong focus on security and, as you said, the performant variant may not be the best option for a lot of uses this library might have.

@cedoor (Member) commented Sep 16, 2024

However, I think that any implementation of a rather complex cryptographic protocol must have a strong focus on security and, as you said, the performant variant may not be the best option for a lot of uses this library might have.

I think security should have priority over performance, but I would wait for a full review by the audit team.

One solution might be to release a new major if the security benefits are substantial and continue to maintain version 1 anyway.

@cedoor (Member) commented Oct 3, 2024

One solution might be to release a new major if the security benefits are substantial and continue to maintain version 1 anyway.

@chancehudson @artwyman any thoughts?

@artwyman (Collaborator) commented Oct 3, 2024

One solution might be to release a new major if the security benefits are substantial and continue to maintain version 1 anyway.
@chancehudson @artwyman any thoughts?

I have thoughts in two directions. First, general thoughts on maintaining two versions:

It's workable, but definitely has a cost in maintenance burden and confusion. Either both versions need to be kept up to date with future patches, or else v1 will slowly become less useful, and potentially even insecure. I historically have a preference for "develop on main" workflows, since the code everyone looks at and uses is the code that will remain up-to-date and workable. I realize this is a preference, though, not a clear-cut definition of "best".

If you're going to support multiple algorithms for a significant time period, it may be better to keep them both in the code separately. They could be separate packages, separate functions within the package, or configurable via an "options" parameter. That keeps them both "on main" so they should benefit from ongoing work, unit testing, etc.
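For illustration, the options-parameter route might look like this hypothetical sketch (all names invented; Point is the package's point type):

```typescript
type MulOptions = { constantTime?: boolean }

// One public entry point dispatching to either algorithm, so both stay
// on main and share tests and maintenance.
export function mulPointScalar(
  base: Point<bigint>,
  e: bigint,
  options: MulOptions = {}
): Point<bigint> {
  return options.constantTime
    ? mulPointScalarLadder(base, e) // Montgomery ladder (this PR)
    : mulPointScalarSquareMultiply(base, e) // original square-and-multiply
}
```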

Second, thoughts on the specifics of the security vs. performance tradeoff here:

Given bigint arithmetic isn't constant-time anyway, this algorithm isn't a complete solution to the security problem. That makes me personally de-prioritize it vs. performance, but that's a call reasonable people can make differently. I would defer to someone with more security expertise, and who has specific threat models in mind to make the call on whether a constant-time algorithm built on variable-time bigint math reduces timing attack risk by enough of a margin to be worth prioritizing.

Calling the current implementation the "performant variant" is admittedly a bit of a stretch, since any pure-JS implementation of cryptography isn't going to be very performant. A preferable alternative would be to solve this problem fully as part of a more optimized implementation (probably WASM) which is fast enough to be an improvement to performance even with a constant-time algorithm. Such an implementation could improve both performance and security at once. I'm not sure how soon that's likely to happen, though, so we may need to accept trade-offs or provide multiple options in the near term.
