scroll-tech · lispc · Aug 4, 2023 · Aug 1, 2023 · Aug 1, 2023 · Aug 1, 2023
diff --git a/aggregator/Cargo.toml b/aggregator/Cargo.toml
@@ -25,7 +25,7 @@ snark-verifier-sdk = { git = "https://github.com/scroll-tech/snark-verifier", br
 
 
 [features]
-default = [  ]
+default = [ ]
 print-trace = [ "ark-std/print-trace" ]
 # This feature is useful for unit tests where we check the SAT of pi aggregation circuit
 disable_proof_aggregation = []
diff --git a/aggregator/README.md b/aggregator/README.md
@@ -2,64 +2,6 @@ Proof Aggregation
 -----
 
 ![Architecture](./figures/architecture.jpg)
-<!-- 
-This repo does proof aggregations for zkEVM proofs.
-
-## zkEVM circuit
-A zkEVM circuits generates a ZK proof for a chunk of blocks. It takes 64 field elements as its public input, consist of 
-- chunk's data hash digest: each byte is encoded in an Fr element
-- chunk's public input hash digest: each byte is encoded in an Fr element
-The total size for a public input is 64 bytes, encoded in 64 Fr element
-
-For the ease of testing, this repo implements a `MockCircuit` which hash same public input APIs as a zkEVM circuit. 
-
-## First compression circuit
-The first compression circuit takes in a fresh snark proof and generates a new (potentially small) snark proof. 
-The public inputs to the new snark proof consists of 
-- 12 elements from the accumulators
-    - an accumulator consists of 2 G1 elements, which are the left and right inputs to the pairing
-    - this is treated as 4 Fq elements, each decomposed into 3 limbs and encoded in Fr  
-- 64 elements from previous snark
-    - re-expose the same public inputs as the original snark
-
-The first compression circuit is configured [wide config file](./configs/compression_wide.config).
-
-## Second compression circuit
-
-The second compression circuit takes in a compressed snark proof and generates a new (potentially small) snark proof. 
-The public inputs to the new snark proof consists of 
-- 12 elements from the accumulators
-    - an accumulator consists of 2 G1 elements, which are the left and right inputs to the pairing
-    - this is treated as 4 Fq elements, each decomposed into 3 limbs and encoded in Fr  
-    - accumulator from the previous snark is accumulated into the current accumulator
-- 64 elements from previous snark
-    - skipping the first 12 elements which are previous accumulator, as they are already accumulated
-    - re-expose the rest 64 field elements as the public inputs 
-
-The second compression circuit is configured [thin config file](./configs/compression_thin.config).
-
-## Aggregation circuit
-An aggregation circuit takes in a batch of `k` proofs, each for a chunk of blocks. 
-It generates a single proof asserting the validity of all the proofs. 
-
-It also performs public input aggregation, i.e., reducing the `64k` public elements  into a fixed number of `144` elements:
-- 12 elements from accumulators, which accumulates all the previous `k` accumulators from each snark
-- 132 elements from the hashes
-    - first_chunk_prev_state_root: 32 Field elements
-    - last_chunk_post_state_root: 32 Field elements
-    - last_chunk_withdraw_root: 32 Field elements
-    - batch_public_input_hash: 32 Field elements
-    - chain_id: 8 Field elements
-
-In addition, it attests that, for chunks indexed from `0` to `k-1`,
-- batch_data_hash := keccak(chunk_0.data_hash || ... || chunk_k-1.data_hash) where chunk_i.data_hash is a public input to the i-th batch snark circuit
-- chunk_pi_hash := keccak(chain_id || prev_state_root || post_state_root || withdraw_root || chunk_data_hash) where chunk_data_hash is a public input to the i-th batch snark circuit
-- and the related field matches public input
-
-See [public input aggregation](./src/proof_aggregation/public_input_aggregation.rs) for the details of public input aggregation. -->
-
-<!-- # Spec for Dynamic aggregator -->
-
 # Params
 |param|meaning |
 |:---:|:---|
@@ -93,17 +35,17 @@ c_i.post_state_root == c_{i+1}.prev_state_root
 ```
 for $i \in [1, k-1]$.
 
-## Empty chunk
-An __empty chunk__ is a chunk that does not contain any transactions. It is used for padding. 
-If $k< n$, $(n-k)$ empty chunks are padded to the list. An empty chunk has the same data fields as a real chunk, and the parameters are set as
-- state root before this chunk: `c_k.post_state_root`
-- state root after this chunk: `c_k.post_state_root`
-- the withdraw root of this chunk: `c_k.withdraw_root`
-- the data hash of this chunk: `keccak("")`
+## Padded chunk
+A __padded chunk__ is a chunk that repeats the last valid chunk. It is used for padding. 
+If $k < n$, $(n-k)$ padded chunks are appended to the list. A padded chunk has the same data fields as the last real chunk, and the parameters are set as
+- state root before this chunk: `c_{k}.prev_state_root`
+- state root after this chunk: `c_{k}.post_state_root`
+- the withdraw root of this chunk: `c_{k}.withdraw_root`
+- the data hash of this chunk: `c_{k}.data_hash`
 
 ## Batch
 
-A __batch__ consists of continuous chunks of size `n`. If the input chunks' size `k` is less than `n`, we pad the input with `(n-k)` empty chunks using the above logic.
+A __batch__ consists of `k` continuous chunks. If `k` is less than `n`, we pad the input with `(n-k)` chunks identical to `chunk[k]`.
 
 # Circuits
 
@@ -114,20 +56,16 @@ Circuit proving the relationship for a chunk is indeed the zkEVM circuit. It wil
     - 12 from accumulators
     - 32 from public input hash
 
-## Empty chunk circuit
-An empty chunk circuit also takes 44 elements as public inputs. 
-In our design it is curial that __a same circuit__ is used for both real chunk circuit and empty chunk circuit. In other words, an empty chunk circuit will also  go through the same compressions before it is aggregated. 
-
 
 ![Architecture](./figures/hashes.jpg)
 
 ## Aggregation Circuit
 
-We want to aggregate `k` snarks, each from a valid chunk. We generate `(n-k)` empty chunks, and obtain a total of `n` snarks. 
+We want to aggregate `k` snarks, each from a valid chunk. We generate `(n-k)` padded chunks, and obtain a total of `n` snarks. 
 
-In the above example, we have `k = 2` valid chunks, and `2` empty chunks.
+In the above example, we have `k = 2` valid chunks, and `2` padded chunks.
 
-> Interlude: we just need to generate 1 empty snark, and the rest `n-k-1` will be identical for the same batch. We cannot pre-compute it though, as the witness `c_k.post_state_root` and `c_k.withdraw_root` are batch dependent.
+The padded snarks are identical to the last valid snark, so the aggregator does not need to generate snarks for padded chunks.
 
 ### Configuration
 
@@ -140,7 +78,6 @@ There will be three configurations for Aggregation circuit.
 The public input of the aggregation circuit consists of
 - 12 elements from accumulator
 - 32 elements of `batch_pi_hash`
-- 1 element of `k`
 
 ### Statements
 For snarks $s_1,\dots,s_k,\dots, s_n$ the aggregation circuit argues the following statements.
@@ -162,9 +99,9 @@ for i in 1 ... __n__
 
 This is done by compute the RLCs of chunk[i]'s data_hash for `i=0..k`, and then check the RLC matches the one from the keccak table.
 
-4. chunks are continuous: they are linked via the state roots. __Static__.
+4. chunks are continuous when they are not padded: they are linked via the state roots.
 
-for i in 1 ... __n-1__
+for i in 1 ... __k-1__
 ```
 c_i.post_state_root == c_{i+1}.prev_state_root
 ```
@@ -175,17 +112,18 @@ for i in 1 ... __n__
     batch.chain_id == chunk[i].chain_id
 ```
 
-6. The last `(n-k)` chunk[i]'s prev_state_root == post_state_root when chunk[i] is padded
+6. The last `(n-k)` chunks are padding
 ```
 for i in 1 ... n:
-    is_padding = (i > k) // k is a public input
     if is_padding:
-        chunk_i.prev_state_root == chunk_i.post_state_root 
-        chunk_i.withdraw_root == chunk_{i-1}.withdraw_root
-        chunk_i.data_hash == [0u8; 32]
+        chunk[i].chunk_pi_hash_rlc_cells == chunk[i-1].chunk_pi_hash_rlc_cells
 ```
-7. chunk[i]'s data_hash len is `0` when chunk[i] is padded
-
+This is done by comparing the `data_rlc` of `chunk_{i-1}` and `chunk_{i}`.
+7. the hash input lengths are correct
+- the first MAX_AGG_SNARKS + 1 hashes all have 136-byte inputs
+- batch's data_hash length is 32 * number_of_valid_snarks
+8. batch data hash is correct w.r.t. its RLCs
+9. is_final_cells are set correctly
 
 ### Handling dynamic inputs
 
@@ -201,11 +139,13 @@ Suppose we target for `MAX_AGG_SNARK = 10`. Then, the last hash function will ta
 We also know in the circuit if a chunk is an empty one or not. This is given by a flag `is_padding`. 
 
 For the input of the final data hash
-- we extract `32 * MAX_AGG_SNARK` number of cells (__static__ here) from the last hash. We then compute the RLC of those `32 * MAX_AGG_SNARK` when the corresponding `is_padding` is not set. We constraint this RLC matches the `data_rlc` from the keccak table.
-
+- we extract `32 * MAX_AGG_SNARK` number of cells (__static__ here) from the last hash. We then compute the RLC of those `32 * MAX_AGG_SNARK` when the corresponding `is_padding` is not set. We constrain this RLC to match the `data_rlc` from the keccak table.
 
 For the output of the final data hash
 - we extract all three hash digest cells from last 3 rounds. We then constraint that the actual data hash matches one of the three hash digest cells with proper flags defined as follows.
+    - if `num_of_valid_snarks <= 4`, only 1 keccak-f round is needed, so the batch's data hash (input, len, data_rlc, output_rlc) is in the first 300 keccak rows;
+    - else if `num_of_valid_snarks <= 8`, 2 keccak-f rounds are needed, so the batch's data hash (input, len, data_rlc, output_rlc) is in the 2nd 300 keccak rows;
+    - else `num_of_valid_snarks <= 12`, 3 keccak-f rounds are needed, so the batch's data hash (input, len, data_rlc, output_rlc) is in the 3rd 300 keccak rows;
 
 |#valid snarks | offset of data hash | flags|
 |---| ---| ---|
@@ -214,33 +154,4 @@ For the output of the final data hash
 |9,10          | 64                  | 0, 0, 1|
 
 Additional checks for dummy chunk
-- if `is_padding` for `i`-th chunk, we constrain `chunk[i].prev_state_root = chunk[i].post_state_root`
-- if `is_padding` for `i`-th chunk, we constrain `chunk[i-1].withdraw_root = chunk[i].withdraw_root`
-- if `is_padding` for `i`-th chunk, we constrain `chunk[i-1].data_hash.len() == 0`
-
-<!-- 
-1. Extact the final `data_rlc` cell from each round. There are maximum $t$ of this, denoted by $r_1,\dots r_t$
-    - __caveat__: will need to make sure the circuit is padded as if there are $t$ rounds, if the actual number of rounds is less than $t$. This is done by keccak table already: 
-    all columns of keccak table are padded to `1<<LOG_DEGREE` by construction (__need to double check this is circuit dependent__)
-2. Extract a challenge and then compute `rlc:= RLC(chunk_1.data_hash || ... || chunk_k.data_hash)` using a __phase 2__ column
-3. assert `rlc` is valid via a lookup argument
-    - constrain `rlc` cell is within the "data_rlc" column of keccak table via standard lookup API
-    - potential optimization: avoid using lookup API. There is only $t$ elements as $rlc \in \{r_1,\dots r_t\}$ and we may check equality one by one.
- -->
-
-<!-- 
-Circuit witnesses:
-- a list of k __real__ CHUNKs, each with 44 elements of public inputs (12 from accumulators and
-32 from public input hash)
-    - 
-    - Those 4 hashes are obtained from the caller.
-    - It's public input hash is 
-        - chunk_pi_hash   := keccak(chain_id || prev_state_root || post_state_root || withdraw_root ||
-            chunk_data_hash)
-Circuit public inputs:
-- an accumulator of 12 elements
-- a batch public input hash of 32 elements
-- the value k, 1 element
-
-The aggregation circuit aggregates MAX_AGG_NUM snarks.
-If k < MAX_AGG_NUM, dummy snarks will be padded -->
+- if `is_padding` for `i`-th chunk, we constrain `chunk[i].chunk_pi_hash_rlc_cells == chunk[i-1].chunk_pi_hash_rlc_cells`
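Note on the round thresholds in the README hunk above: they follow from the Keccak-256 rate of 136 bytes. The batch data hash absorbs `32 * num_valid_snarks` bytes, and Keccak padding always adds at least one byte, so the round count is `input_len / 136 + 1`. The following is a minimal sketch of that arithmetic only. `keccak_rounds` and `round_flags` are hypothetical helper names, not functions from this repository, and the constant values are assumptions taken from the README text.

```rust
/// Keccak-256 absorbs 136 bytes (the rate) per keccak-f permutation.
const KECCAK_RATE_BYTES: usize = 136;
/// Assumed: each chunk contributes a 32-byte data hash to the batch preimage.
const DIGEST_LEN: usize = 32;

/// Number of keccak-f rounds needed to hash `num_valid_snarks` data hashes.
/// Padding appends at least one byte, so an input that exactly fills the
/// rate would still need one extra round; hence the `+ 1`.
fn keccak_rounds(num_valid_snarks: usize) -> usize {
    (num_valid_snarks * DIGEST_LEN) / KECCAK_RATE_BYTES + 1
}

/// One-hot flags selecting which of the three candidate digests holds the
/// batch data hash (hypothetical helper; assumes at most 3 rounds).
fn round_flags(num_valid_snarks: usize) -> [bool; 3] {
    let rounds = keccak_rounds(num_valid_snarks);
    assert!((1..=3).contains(&rounds), "more than 3 rounds is not supported");
    let mut flags = [false; 3];
    flags[rounds - 1] = true;
    flags
}
```

Under these assumptions, 1 to 4 valid snarks give one round, 5 to 8 give two, and 9 to 12 give three, which agrees with the flags column of the table.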
diff --git a/aggregator/figures/architecture.jpg b/aggregator/figures/architecture.jpg
diff --git a/aggregator/figures/hashes.jpg b/aggregator/figures/hashes.jpg
diff --git a/aggregator/src/aggregation/circuit.rs b/aggregator/src/aggregation/circuit.rs
@@ -249,7 +249,7 @@ impl Circuit<Fr> for AggregationCircuit {
 
         let timer = start_timer!(|| "load aux table");
 
-        let (hash_digest_cells, num_valid_snarks) = {
+        let hash_digest_cells = {
             config
                 .keccak_circuit_config
                 .load_aux_tables(&mut layouter)?;
@@ -269,16 +269,22 @@ impl Circuit<Fr> for AggregationCircuit {
             end_timer!(timer);
 
             let timer = start_timer!(|| ("assign hash cells").to_string());
-            let (hash_digest_cells, num_valid_snarks) = assign_batch_hashes(
+            let chunks_are_valid = self
+                .batch_hash
+                .chunks_with_padding
+                .iter()
+                .map(|chunk| !chunk.is_padding)
+                .collect::<Vec<_>>();
+            let hash_digest_cells = assign_batch_hashes(
                 &config,
                 &mut layouter,
                 challenges,
+                &chunks_are_valid,
                 &preimages,
-                self.batch_hash.number_of_valid_chunks,
             )
             .map_err(|_e| Error::ConstraintSystemFailure)?;
             end_timer!(timer);
-            (hash_digest_cells, num_valid_snarks)
+            hash_digest_cells
         };
         // digests
         let (batch_pi_hash_digest, chunk_pi_hash_digests, _potential_batch_data_hash_digest) =
@@ -331,6 +337,8 @@ impl Circuit<Fr> for AggregationCircuit {
                             );
 
                             region.constrain_equal(
+                                // in the keccak table, the input and output data have different
+                                // endianness
                                 chunk_pi_hash_digests[i][j * 8 + k].cell(),
                                 snark_inputs[i * DIGEST_LEN + (3 - j) * 8 + k].cell(),
                             )?;
@@ -371,8 +379,6 @@ impl Circuit<Fr> for AggregationCircuit {
             }
         }
 
-        log::trace!("number of valid snarks: {:?}", num_valid_snarks.value());
-
         end_timer!(witness_time);
         Ok(())
     }
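Note on `chunks_are_valid` in the hunks above: the vector is derived from the per-chunk `is_padding` flag instead of a separate `number_of_valid_chunks` count. The sketch below illustrates the padding rule from the README (repeat the last valid chunk) together with that derivation. `Chunk` is a simplified stand-in with assumed fields, and `pad_chunks` and `valid_chunks_are_continuous` are hypothetical helpers, not the repository's API.

```rust
/// Simplified stand-in for the repository's chunk type (assumed fields only).
#[derive(Clone, Debug, PartialEq)]
struct Chunk {
    prev_state_root: [u8; 32],
    post_state_root: [u8; 32],
    withdraw_root: [u8; 32],
    data_hash: [u8; 32],
    is_padding: bool,
}

/// Pads `valid` up to `n` chunks by repeating the last valid chunk,
/// marking every repeated copy with `is_padding = true`.
fn pad_chunks(valid: &[Chunk], n: usize) -> Vec<Chunk> {
    assert!(!valid.is_empty() && valid.len() <= n, "need 1..=n valid chunks");
    let mut padded = valid.to_vec();
    let mut last = valid[valid.len() - 1].clone();
    last.is_padding = true;
    padded.resize(n, last);
    padded
}

/// Same derivation as in the diff: `true` for real chunks, `false` for padding.
fn chunks_are_valid(chunks: &[Chunk]) -> Vec<bool> {
    chunks.iter().map(|chunk| !chunk.is_padding).collect()
}

/// Statement 4 of the README: consecutive chunks are linked by state roots,
/// checked only where the later chunk is not padding.
fn valid_chunks_are_continuous(chunks: &[Chunk]) -> bool {
    chunks
        .windows(2)
        .filter(|pair| !pair[1].is_padding)
        .all(|pair| pair[0].post_state_root == pair[1].prev_state_root)
}
```

Because a padded chunk copies every field of the last valid chunk, its public input hash is identical too, which is why the README can reuse the last valid snark instead of proving padded chunks separately.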

diff --git a/aggregator/src/aggregation/config.rs b/aggregator/src/aggregation/config.rs
@@ -101,6 +101,8 @@ impl AggregationConfig {
         meta.enable_equality(keccak_circuit_config.keccak_table.input_rlc);
         // enable equality for the input data len column
         meta.enable_equality(keccak_circuit_config.keccak_table.input_len);
+        // enable equality for the is_final column
+        meta.enable_equality(keccak_circuit_config.keccak_table.is_final);
 
         // Instance column stores public input column
         // - the accumulator
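Note on the `rlc/config.rs` hunk below, which returns `cs1` and `cs2` as two gate expressions instead of their sum: a gate holds when every returned expression evaluates to zero, whereas a single expression `cs1 + cs2` only forces the sum to vanish. When both selectors are active on a row, a witness with `cs1 = -cs2 != 0` would pass the summed form. The definition of `cs1` is not visible in the hunk, so the following is a toy illustration over a small prime field (modulus 97, chosen for readability and not the field the circuit uses), with hypothetical function names.

```rust
/// Toy prime modulus; the real circuit works over a much larger field.
const P: u64 = 97;

/// Additive inverse modulo P.
fn neg(a: u64) -> u64 {
    (P - a % P) % P
}

/// Summed form: accepts whenever cs1 + cs2 == 0 (mod P).
fn summed_gate_ok(cs1: u64, cs2: u64) -> bool {
    (cs1 % P + cs2 % P) % P == 0
}

/// Split form: each expression must be zero on its own.
fn split_gates_ok(cs1: u64, cs2: u64) -> bool {
    cs1 % P == 0 && cs2 % P == 0
}
```

For example, the pair `(5, neg(5))` passes `summed_gate_ok` but fails `split_gates_ok`, while `(0, 0)` passes both.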

diff --git a/aggregator/src/aggregation/rlc/config.rs b/aggregator/src/aggregation/rlc/config.rs
@@ -64,7 +64,7 @@ impl RlcConfig {
             let q2 = meta.query_selector(enable_challenge);
             let cs2 = q2 * (a - challenge_expr.keccak_input());
 
-            vec![cs1 + cs2]
+            vec![cs1, cs2]
         });
         Self {
             #[cfg(test)]