Merge pull request privacy-scaling-explorations#139 from input-output-hk/dev-docs/add-endoscaling-docs

Expand endoscaling documentation
b13decker authored Mar 14, 2024
2 parents 10623d5 + 46e6e47 commit 7a13a02
Showing 3 changed files with 54 additions and 33 deletions.
76 changes: 47 additions & 29 deletions book/src/design/gadgets/endoscaling.md
@@ -3,13 +3,18 @@
Often in proof systems, it is necessary to multiply a group element by a scalar that depends
on a challenge. Since the challenge is random, what matters is only that the scalar retains
that randomness; that is, it is acceptable to apply a 1-1 mapping to the scalar if that allows
the multiplication to be done more efficiently (especially when it happens in a circuit).

The Pasta curves (as well as Pluto-Eris) we use for Halo 2 are equipped with an endomorphism that allows such
efficient multiplication. By allowing a 1-1 mapping as described above, we can avoid having
to "decompose" the input challenge using an algorithm such as
[[Pornin2020]](https://eprint.iacr.org/2020/454) that requires lattice basis reduction.

There are two primary settings where endoscaling is typically used. Both use the same basic algorithms, but with inputs drawn from different sources. The two settings are:
1. When multiplying a challenge by a group element (see [Bowe et al.](https://eprint.iacr.org/2019/1021.pdf), Section 6.2).
2. To commit to public inputs in an instance column, using fixed Lagrange polynomials as the group elements.


## Definitions

- The Lagrange basis polynomial $\ell_i(X)$ is such that $\ell_i(\omega^i) = 1$ and
@@ -23,6 +28,7 @@ to "decompose" the input challenge using an algorithm such as
$\zeta_p \in \mathbb{F}_p$. This is equivalent to $\phi(P) = [\zeta_q]P$ for some
$\zeta_q \in \mathbb{F}_q$ of multiplicative order $3$.

- In pseudocode, the notation $a..b$ means the range that is _inclusive_ of $a$ and _exclusive_ of $b$. This is equivalent to the mathematical notation $[a, b)$.

### Proof that defining $\phi((x, y)) \triangleq (\zeta_p \cdot x, y)$ implies $\phi(P) = [\zeta_q]P$ for some $\zeta_q$

@@ -37,44 +43,55 @@
$$y_r((\zeta_p \cdot x_p, y_p), (\zeta_p \cdot x_q, y_q)) = y_r((x_p, y_p), (x_q, y_q))$$
in both cases, where $x_r((x_p, y_p), (x_q, y_q))$ and $y_r((x_p, y_p), (x_q, y_q))$ refer to the expressions for the output point coordinates $R = (x_r, y_r)$ in the addition formulas, as functions of the input point coordinates $P = (x_p, y_p)$ and $Q = (x_q, y_q)$. $\square$
</details>
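
As a concrete (toy) sanity check of this claim, the following standalone snippet works over the curve $y^2 = x^3 + 5$ over $\mathbb{F}_{13}$ with $\zeta_p = 3$ (a primitive cube root of unity modulo 13) rather than the Pasta parameters, and confirms that $(x, y) \mapsto (\zeta_p \cdot x, y)$ sends every affine curve point to another curve point:

```rust
/// Toy sanity check (not the Pasta parameters): over F_13 with zeta_p = 3
/// (a primitive cube root of unity, since 3^3 = 27 ≡ 1 mod 13), the map
/// (x, y) -> (zeta_p * x, y) sends points of y^2 = x^3 + 5 to points of the
/// same curve, because (zeta_p * x)^3 = zeta_p^3 * x^3 = x^3.
fn main() {
    const P: u64 = 13; // toy field modulus
    const B: u64 = 5; // curve constant in y^2 = x^3 + b
    const ZETA: u64 = 3; // primitive cube root of unity mod 13

    assert_eq!(ZETA.pow(3) % P, 1);

    let on_curve = |x: u64, y: u64| (y * y) % P == (x * x % P * x + B) % P;

    // Enumerate all affine points of the toy curve and check that the
    // endomorphism phi keeps each of them on the curve.
    for x in 0..P {
        for y in 0..P {
            if on_curve(x, y) {
                let (phi_x, phi_y) = ((ZETA * x) % P, y);
                assert!(on_curve(phi_x, phi_y), "phi left the curve at ({x}, {y})");
            }
        }
    }
    println!("phi maps every affine point of the toy curve back onto the curve");
}
```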

## Computing an endoscaling commitment

There are two basic algorithms used in endoscaling computations:
- Algorithm 1 executes "endoscaling";
- Algorithm 2 computes an "endoscalar".

The endoscaling optimization takes a random bitstring $\mathbf{r}$ and a group element $G$ and produces a group element $P$. We could write $P$ as the equivalent of doing scalar multiplication, where $n(\mathbf{r})$ is a function over bitstrings that produces a scalar value: $$P = [n(\mathbf{r})]G.$$ Algorithm 1 computes the mapping to $P$ _without_ using traditional scalar multiplication (this is what makes it an optimization). Algorithm 2[^alg2] takes only $\mathbf r$ as input and computes the equivalent scalar value $n(\mathbf{r})$. Note that $n(\mathbf{r})$ does not assume that $\mathbf r$ is a canonical representation of the bits of a scalar; it's a completely different map.
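
As an interface sketch only (the names and signatures below are illustrative and are not the halo2_gadgets API), the relationship between the two algorithms can be summarised as:

```rust
use std::ops::Mul;

/// Minimal stand-in for a group equipped with an efficient endomorphism phi.
trait EndoGroup {
    type Scalar;
    /// phi(P) = [zeta_q] P, the efficiently computable endomorphism.
    fn endo(self) -> Self;
}

/// Algorithm 1: map a bitstring r and a point G to [n(r)] G without doing a
/// full scalar multiplication. Body omitted in this sketch.
fn endoscale<G: EndoGroup>(_r: &[bool], _g: G) -> G {
    unimplemented!("Algorithm 1 in the Halo paper")
}

/// Algorithm 2: map the same bitstring r to the scalar n(r) itself.
fn compute_endoscalar<G: EndoGroup>(_r: &[bool]) -> G::Scalar {
    unimplemented!("Algorithm 2 in the Halo paper")
}

/// The two algorithms are consistent: endoscaling by r is the same as
/// multiplying by the endoscalar n(r).
fn consistent<G>(r: &[bool], g: G) -> bool
where
    G: EndoGroup + Mul<G::Scalar, Output = G> + PartialEq + Copy,
{
    endoscale(r, g) == g * compute_endoscalar::<G>(r)
}
```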

The introduction mentions two settings for using endoscaling. The first takes a challenge as the scalar value and some intermediate circuit value as the group element. The second takes a public instance column as the scalar value and uses Lagrange basis polynomials to determine the group elements; the rest of this section explains how.

Let $\text{Endoscale}$ be Algorithm 1 in the [Halo paper](https://eprint.iacr.org/2019/1021.pdf):

$$
\text{Endoscale}: (\mathbf{r}, G) \mapsto [n(\mathbf{r})] G,
$$

and let $N$ be the limit on the number of bits[^curves] that can be input to endoscaling at once while
avoiding collisions (that is, $|\mathbf{r}| \le N$); assume $N$ is even.

Given a commitment to a Lagrange basis polynomial $G_i = \text{Comm}(\ell_i(X))$, we can commit to the bits in an instance column $\mathbf{c}$ by
breaking it into length-$N$ chunks such that $\mathbf{c} = \mathbf{c}_0 || \mathbf{c}_1 || \dots || \mathbf{c}_{m-1}$, and then
calculating the sum
$$P = \sum_{i \in [0, m)} \text{Endoscale}(\mathbf{c}_i, G_i),$$
where $mN$ is an upper bound on the size of the instance column.
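
A sketch of this chunk-and-sum step, with hypothetical helper names (the group operations and Algorithm 1 itself are passed in as closures, since they are not specified here):

```rust
/// Sketch of committing to the bits of an instance column via endoscaling.
/// Hypothetical helper: `endoscale` is assumed to implement Algorithm 1 and
/// `add` the group law; both are passed as closures so the sketch stays
/// independent of any particular curve type.
fn commit_instance_column<Pt: Copy>(
    column_bits: &[bool],        // the instance column c, with |c| <= m * N
    lagrange_commitments: &[Pt], // G_i = Comm(l_i(X)), one per chunk
    n: usize,                    // the chunk size N (assumed even)
    identity: Pt,                // the group identity, used to start the sum
    endoscale: impl Fn(&[bool], Pt) -> Pt,
    add: impl Fn(Pt, Pt) -> Pt,
) -> Pt {
    assert!(n % 2 == 0, "N is assumed to be even");
    // Break c into length-N chunks c_0 || c_1 || ... || c_{m-1} and sum
    // Endoscale(c_i, G_i) over all of them.
    column_bits
        .chunks(n)
        .zip(lagrange_commitments.iter().copied())
        .fold(identity, |acc, (chunk, g_i)| add(acc, endoscale(chunk, g_i)))
}
```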

The following sections give more detail about using endoscaling in the second setting. In the first setting, the random value is always a challenge value of length $\le N$, so the explanations and costs are essentially the same; just assume that $m = 1$.

[^alg2]: In the original paper, Algorithm 2 is actually an alternative to Algorithm 1: it takes both the challenge and group element as input and outputs the endoscaled point. The naming throughout this book and implementation is due to the fact that Algorithm 2 makes it easier to see how to compute the endoscalar value.

[^curves]: For complex multiplication curves that have a cubic endomorphism $\phi$, such as the Pasta and Pluto-Eris curves, this limit can be computed using the script
[checksumsets.py in zcash/pasta](https://github.com/zcash/pasta/blob/master/checksumsets.py).
For Pasta, $N = 248$; for Pluto-Eris, $N = 442$.

### Algorithm 1 (optimized)

The input bits to endoscaling are $\mathbf{r}$. Split $\mathbf{r}$ into $m$ chunks
$\mathbf{r}_0, \mathbf{r}_1, ..., \mathbf{r}_{m - 1} \in \{0, 1\}^N$. For now assume that all
the $\mathbf{r}_i$ are the same length $N$, and that the length is even.
For $i \in [0, m)$ and $j \in [0, N/2)$:
$$ S(i, j) = \begin{cases}
[2\mathbf{r}_{i,2j} - 1] G_i, &\text{ if } \mathbf{r}_{i,2j+1} = 0, \\
\phi([2\mathbf{r}_{i,2j} - 1] G_i), &\text{ otherwise}.
\end{cases}
$$

Using $G_i$ as defined above, we initialize the accumulator $P$ as:
$$ P := [2] \sum_{i=0}^{m-1} (G_i + \phi(G_i)) $$

Then, for $j$ in $[0, N/2)$, we incorporate each chunk of the input bits:

$$
\begin{array}{l}
@@ -84,7 +101,8 @@ $$
P := (P \;⸭\; \mathrm{Inner}) \;⸭\; P \\
\end{array}
$$

These two steps are equivalent to (using complete addition):

$$
\begin{array}{l}
@@ -97,7 +115,7 @@ P := \mathcal{O} \\
$$
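
The following standalone sketch mirrors the optimized loop above for the simplest case $m = 1$ (so the $\mathrm{Inner}$ term elided in the diff is just $S(0, j)$), using a toy stand-in group, the integers modulo $7$ under addition with $\phi(P) = 2P$ ($2$ has multiplicative order $3$ modulo $7$), instead of real curve arithmetic. Its only purpose is to show the control flow and the fact that the output is $[n(\mathbf{r})]G$ for a scalar $n(\mathbf{r})$ that does not depend on $G$:

```rust
/// Toy model of Algorithm 1 (optimized), with m = 1 and a stand-in group:
/// "points" are elements of Z_7 under addition, phi(P) = ZETA * P, and
/// negation is the additive inverse. This is NOT curve arithmetic.
const Q: u64 = 7; // order of the toy group
const ZETA: u64 = 2; // multiplicative order 3 mod 7, so phi^3 = identity

fn add(a: u64, b: u64) -> u64 { (a + b) % Q }
fn neg(a: u64) -> u64 { (Q - a % Q) % Q }
fn phi(a: u64) -> u64 { (ZETA * a) % Q }

/// Endoscale a single even-length chunk of bits against the "point" g.
fn endoscale(bits: &[bool], g: u64) -> u64 {
    assert!(bits.len() % 2 == 0, "the chunk length is assumed to be even");
    // P := [2](G + phi(G))
    let mut p = add(add(g, phi(g)), add(g, phi(g)));
    for pair in bits.chunks(2) {
        // S = [2 * r_{2j} - 1] G, i.e. +G or -G ...
        let mut s = if pair[0] { g } else { neg(g) };
        // ... with the endomorphism applied when r_{2j+1} = 1.
        if pair[1] { s = phi(s); }
        // P := (P + S) + P  (the double-and-add step from the loop above)
        p = add(add(p, s), p);
    }
    p
}

fn main() {
    let r = [true, false, false, true, true, true, false, false];
    // Because every step is linear in g, the result is n(r) * g for a scalar
    // n(r) that depends only on the bits; with g = 1 the output is n(r) itself.
    let n_of_r = endoscale(&r, 1);
    for g in 1..Q {
        assert_eq!(endoscale(&r, g), (n_of_r * g) % Q);
    }
    println!("endoscaling in the toy group is multiplication by n(r) = {n_of_r}");
}
```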

#### Circuit cost
We decompose each $\mathbf{r}_i$ chunk into two-bit chunks $c_j$:

$$
\mathbf{r} = c_0 + 4 \cdot c_1 + ... + 4^{N/2 - 1} \cdot c_{N/2 -1}
@@ -107,19 +125,19 @@
with a running sum $z_j, j \in [0..(N/2)).$ $z_0$ is initialized as
$z_0 = \mathbf{r}$. Each subsequent $z_j$ is calculated as:
$z_j = (z_{j-1} - c_{j-1}) \cdot 2^{-2}$. The final $z_{N/2} = 0$.
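
As a quick illustration of the running sum (with plain integers standing in for field elements, so multiplying by $2^{-2}$ becomes an exact division by 4):

```rust
/// Decompose r into two-bit chunks c_j with the running sum
/// z_0 = r, z_j = (z_{j-1} - c_{j-1}) / 4, checking that the final value is
/// zero and that r = c_0 + 4*c_1 + ... + 4^{N/2 - 1}*c_{N/2 - 1}.
fn main() {
    let r: u64 = 0b10_01_11_00_01; // example value with N/2 = 5 two-bit chunks
    let half_n = 5;

    let mut z = r;
    let mut chunks = Vec::new();
    for _ in 0..half_n {
        let c = z & 0b11; // c_j = low two bits of z_j
        chunks.push(c);
        z = (z - c) >> 2; // z_{j+1} = (z_j - c_j) * 2^{-2}
    }
    assert_eq!(z, 0, "the final running-sum value must be zero");

    // Reconstruct r from the chunks: r = sum_j 4^j * c_j.
    let reconstructed: u64 = chunks
        .iter()
        .enumerate()
        .map(|(j, c)| c << (2 * j))
        .sum();
    assert_eq!(reconstructed, r);
    println!("chunks (least significant first): {chunks:?}");
}
```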

Each $c_j$ is further broken down into two individual bits as $c_j = b_{j,0} + 2 \cdot b_{j,1}$.
Based on the equation for $S(i,j)$ defined above and the definition of the cubic endomorphism $\phi$, we can map each tuple $(b_0, b_1)$ to a scaled version of $G$:

$$
\begin{array}{rl}
(0, 0) &\rightarrow (G_x, -G_y) \\
(0, 1) &\rightarrow (\zeta \cdot G_x, -G_y) \\
(1, 0) &\rightarrow (G_x, G_y) \\
(1, 1) &\rightarrow (\zeta \cdot G_x, G_y)
\end{array}.
$$

These are accumulated using the [double-and-add](./double-and-add.md) algorithm (as in the innermost loop of the complete addition version of Algorithm 1).
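
This table translates directly into a small lookup. A sketch with a hypothetical helper (not the halo2_gadgets API), keeping the coordinates abstract over any field-like type:

```rust
use std::ops::{Mul, Neg};

/// Map a two-bit chunk (b0, b1) to the endoscaled copy of G = (gx, gy),
/// following the table above. `zeta` is the cube root of unity zeta_p.
/// Hypothetical helper, not the halo2_gadgets API.
fn endoscale_point<F>(b0: bool, b1: bool, gx: F, gy: F, zeta: F) -> (F, F)
where
    F: Copy + Mul<Output = F> + Neg<Output = F>,
{
    match (b0, b1) {
        (false, false) => (gx, -gy),       // (0, 0) -> (     G_x, -G_y)
        (false, true) => (zeta * gx, -gy), // (0, 1) -> (zeta*G_x, -G_y)
        (true, false) => (gx, gy),         // (1, 0) -> (     G_x,  G_y)
        (true, true) => (zeta * gx, gy),   // (1, 1) -> (zeta*G_x,  G_y)
    }
}
```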

Let $r$ be the number of incomplete additions we're doing per row. For $r = 1$:

1 change: 0 additions & 1 deletion halo2_gadgets/src/endoscale.rs
@@ -7,7 +7,6 @@ use halo2_proofs::{
use halo2curves::CurveAffine;
use std::fmt::Debug;

pub mod chip;

/// Instructions to map bitstrings to and from endoscalars.
10 changes: 7 additions & 3 deletions halo2_gadgets/src/endoscale/chip.rs
@@ -1,4 +1,4 @@
//! Instantiates a chip that implements the [`EndoscaleInstructions`].
use crate::{ecc::chip::NonIdentityEccPoint, utilities::decompose_running_sum::RunningSumConfig};

use super::EndoscaleInstructions;
@@ -147,9 +147,13 @@ where
}
}

/// Parameters for curves that support endoscaling.
pub trait CurveEndoscale {
/// Upper bound on the length of the input bits for endoscaling.
///
/// Algorithm 2 defines a mapping from some input bits to an endoscalar.
/// In order to ensure the mapping does not have collisions (where different inputs map to the
/// same endoscalar), there is an upper bound on the length of the input.
const MAX_BITSTRING_LENGTH: usize;
}
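
// Illustrative only (not part of this diff): an implementation for a concrete
// curve just pins down the constant. `ToyPallasAffine` is a placeholder type
// standing in for whichever affine point type the chip is instantiated with,
// and 248 is the Pasta bound quoted in the book page above.
//
// struct ToyPallasAffine;
//
// impl CurveEndoscale for ToyPallasAffine {
//     const MAX_BITSTRING_LENGTH: usize = 248;
// }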

