[EC] Unify point addition for P-256/384/521 #1602

Merged
dkostic merged 3 commits into aws:main from the ec-nistp-refactor-v2 branch on Jun 12, 2024

@dkostic dkostic commented May 20, 2024

Issues:

CryptoAlg-2409

Description of changes:

Implement and use a single version of point addition
for the implementations of the NIST curves P-384 and P-521
and for the Fiat-crypto based implementation of P-256. The change
does not affect performance.
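
The review snippet in this PR calls field operations as `ctx->sub(...)` and reads `ctx->felem_num_limbs`, which suggests that the shared routine dispatches through a per-curve table of methods. A minimal sketch of that pattern follows, assuming single-limb toy fields modulo 13 and 17. The names `toy_methods`, `toy_sub_twice`, `f13_sub`, and `f17_sub` are hypothetical and are not the AWS-LC types; the `%` operator used here is also not constant time, so this only illustrates the dispatch, not the arithmetic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-field method table: one entry per field operation. */
typedef struct {
  size_t felem_num_limbs; /* limbs per field element (1 in this toy) */
  void (*sub)(uint64_t *out, const uint64_t *a, const uint64_t *b);
} toy_methods;

/* Toy subtraction modulo 13; inputs are assumed to be already reduced. */
static void f13_sub(uint64_t *out, const uint64_t *a, const uint64_t *b) {
  out[0] = (a[0] + 13 - b[0]) % 13;
}

/* Toy subtraction modulo 17; inputs are assumed to be already reduced. */
static void f17_sub(uint64_t *out, const uint64_t *a, const uint64_t *b) {
  out[0] = (a[0] + 17 - b[0]) % 17;
}

static const toy_methods toy_f13 = {1, f13_sub};
static const toy_methods toy_f17 = {1, f17_sub};

/* One shared routine serves every field: it only talks to the table.
 * It mirrors the two consecutive ctx->sub calls in the reviewed snippet. */
static void toy_sub_twice(const toy_methods *ctx, uint64_t *y_out,
                          const uint64_t *s1j) {
  assert(ctx != NULL && ctx->sub != NULL);
  ctx->sub(y_out, y_out, s1j);
  ctx->sub(y_out, y_out, s1j);
}
```

Calling `toy_sub_twice(&toy_f13, &y, &s)` and `toy_sub_twice(&toy_f17, &y, &s)` runs the same code path with different field arithmetic, which is the property that lets one point-addition routine replace several per-curve copies.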

Call-outs:

I verified the performance was not affected on Graviton 3, Intel, and M1 CPUs. Example for M1:

Before
Did 2885000 EC POINT P-384 dbl operations in 1000089us (2884743.3 ops/sec)
Did 1593000 EC POINT P-384 add operations in 1000248us (1592605.0 ops/sec)
Did 6816 EC POINT P-384 mul operations in 1056843us (6449.4 ops/sec)
Did 28000 EC POINT P-384 mul base operations in 1002634us (27926.4 ops/sec)
Did 5652 EC POINT P-384 mul public operations in 1080149us (5232.6 ops/sec)
Did 2774000 EC POINT P-521 dbl operations in 1000014us (2773961.2 ops/sec)
Did 1429750 EC POINT P-521 add operations in 1000171us (1429505.6 ops/sec)
Did 4752 EC POINT P-521 mul operations in 1028076us (4622.2 ops/sec)
Did 19000 EC POINT P-521 mul base operations in 1022441us (18583.0 ops/sec)
Did 3912 EC POINT P-521 mul public operations in 1059858us (3691.1 ops/sec)

After
Did 2888000 EC POINT P-384 dbl operations in 1000266us (2887232.0 ops/sec)
Did 1550000 EC POINT P-384 add operations in 1000070us (1549891.5 ops/sec)
Did 6768 EC POINT P-384 mul operations in 1053884us (6422.0 ops/sec)
Did 27000 EC POINT P-384 mul base operations in 1005181us (26860.8 ops/sec)
Did 5232 EC POINT P-384 mul public operations in 1039859us (5031.5 ops/sec)
Did 2749500 EC POINT P-521 dbl operations in 1000046us (2749373.5 ops/sec)
Did 1407000 EC POINT P-521 add operations in 1000272us (1406617.4 ops/sec)
Did 4725 EC POINT P-521 mul operations in 1034737us (4566.4 ops/sec)
Did 19000 EC POINT P-521 mul base operations in 1051390us (18071.3 ops/sec)
Did 3685 EC POINT P-521 mul public operations in 1016140us (3626.5 ops/sec)
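
To compare the two runs line by line, the relative change in ops/sec can be computed as (after - before) / before. The helper below is a small sketch with a hypothetical name, `relative_change_percent`; it is not part of the benchmark tool. For the P-384 add line above (1592605.0 before, 1549891.5 after) it gives roughly -2.7%.

```c
#include <assert.h>

/* Relative change from `before` to `after`, in percent.
 * Negative values mean fewer operations per second after the change. */
static double relative_change_percent(double before, double after) {
  assert(before > 0.0); /* ops/sec figures are positive */
  return (after - before) / before * 100.0;
}
```

Single runs of this kind vary from one invocation to the next, so small differences are best judged across repeated measurements.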

Testing:

How is this change tested (unit tests, fuzz tests, etc.)? Are there any testing steps to be verified by the reviewer?

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and the ISC license.

@dkostic dkostic requested a review from a team as a code owner May 20, 2024 21:50
codecov-commenter commented May 20, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 78.07%. Comparing base (e44fc2c) to head (c6053ae).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1602      +/-   ##
==========================================
- Coverage   78.10%   78.07%   -0.03%     
==========================================
  Files         562      562              
  Lines       94654    94575      -79     
  Branches    13574    13570       -4     
==========================================
- Hits        73927    73838      -89     
- Misses      20133    20145      +12     
+ Partials      594      592       -2     


ctx->sub(y_out, y_out, s1j);
ctx->sub(y_out, y_out, s1j);

cmovznz(x_out, ctx->felem_num_limbs, z1nz, x2, x_out);
Contributor

It may be slightly more efficient to keep the original order as, for example, x_out would remain in registers/cache before sending it or x1 to x_3.

Contributor Author

I'm not sure it can make any difference in performance because: 1) the data is so small that everything is in cache anyway; 2) cmovznz is a function call so if not inlined it can't reuse values in registers. I'd like to keep this order for clarity if you don't mind?
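
For context on the call being discussed, here is a minimal sketch of a limb-wise constant-time select with the same argument order as `cmovznz(x_out, ctx->felem_num_limbs, z1nz, x2, x_out)`. The semantics are an assumption inferred from the name and the call site: the output takes `z` when `t` is zero and `nz` otherwise. The name `toy_cmovznz` and the 64-bit limb type are hypothetical; the actual helper in the PR may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed semantics: out = (t == 0) ? z : nz, computed without branching
 * on t. Each limb is read before it is written, so out may alias z or nz. */
static void toy_cmovznz(uint64_t *out, size_t num_limbs, uint64_t t,
                        const uint64_t *z, const uint64_t *nz) {
  assert(out != NULL && z != NULL && nz != NULL);
  /* (t | -t) has its top bit set exactly when t != 0. */
  uint64_t nonzero = (t | ((uint64_t)0 - t)) >> 63;
  /* mask is all ones when t != 0 and all zeros when t == 0. */
  uint64_t mask = (uint64_t)0 - nonzero;
  for (size_t i = 0; i < num_limbs; i++) {
    out[i] = (z[i] & ~mask) | (nz[i] & mask);
  }
}
```

Under these assumptions the loop touches every limb of both inputs regardless of `t`, so the order of two such calls changes only which buffers are read and written, not the amount of work done.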

@dkostic dkostic merged commit 37ba0e2 into aws:main Jun 12, 2024
93 checks passed
@dkostic dkostic deleted the ec-nistp-refactor-v2 branch June 12, 2024 20:06