#711 + #713 remove num chunks from public input; relax the constraints for the dummy chunk proofs to avoid dummy chunk proofgen time #712

Merged: 37 commits, Aug 4, 2023

Commits
f3cc71f
partiall address review comments of #670
zhenfeizhang Aug 1, 2023
1f173b8
disable disable_proof_agg by default
zhenfeizhang Aug 1, 2023
2aeb3e0
fix soundness issues with data hash
zhenfeizhang Aug 1, 2023
ac9c4ee
add fixed cells for chunk_is_valid
zhenfeizhang Jul 28, 2023
38d14c1
fix typo
zhenfeizhang Jul 28, 2023
08f21d0
well... need morning coffee
zhenfeizhang Jul 28, 2023
9c84798
typo in readme
zhenfeizhang Jul 28, 2023
845ee0c
fix soundness issue in is_smaller_than
zhenfeizhang Jul 28, 2023
311b6dc
[feat] unify u8 u16 lookup table (#694)
lightsing Aug 2, 2023
d6eb833
opt min_num_rows_block in poseidon_circuit (#700)
noel2004 Aug 2, 2023
5658bc5
speedup ci using verify_at_rows (#703)
lispc Aug 2, 2023
3097221
update super circuit row estimation API (#695)
lispc Aug 2, 2023
8319ade
remove num_of_chunks in aggregation circuit's instance column (#704)
lispc Aug 2, 2023
29eb9c3
fix: lost tx_type when doing type conversion (#710)
kunxian-xia Aug 2, 2023
818dc78
fix condition (#708)
lightsing Aug 2, 2023
8a0a908
gates for zero checks
zhenfeizhang Aug 2, 2023
cb0f622
statement 1 is correct
zhenfeizhang Aug 2, 2023
539d63c
reenable statements 3,6,7
zhenfeizhang Aug 2, 2023
8182446
reenable statement 4
zhenfeizhang Aug 2, 2023
6e69d1e
everything seems to work again
zhenfeizhang Aug 2, 2023
dd9d01f
update aggregation test accordingly
zhenfeizhang Aug 2, 2023
ec95160
update spec
zhenfeizhang Aug 2, 2023
4154a6b
minor clean up
zhenfeizhang Aug 2, 2023
407be9b
Merge remote-tracking branch 'origin/develop' into 711-remove-num_chu…
silathdiir Aug 3, 2023
70dd136
Fix fmt.
silathdiir Aug 3, 2023
60f05f2
Make `ChunkHash` fields public.
silathdiir Aug 3, 2023
00ba5ff
fix decompose function
zhenfeizhang Aug 4, 2023
5325f45
update figures
zhenfeizhang Aug 4, 2023
4f6eac8
clean up
zhenfeizhang Aug 4, 2023
f58aa57
Merge branch 'develop' into 711-remove-num_chunks-from-public-input
silathdiir Aug 4, 2023
0446a28
Merge branch 'develop' into 711-remove-num_chunks-from-public-input
kunxian-xia Aug 4, 2023
246b471
address comments
zhenfeizhang Aug 4, 2023
e491a92
Merge branch '711-remove-num_chunks-from-public-input' of github.com:…
zhenfeizhang Aug 4, 2023
2401449
add is_final checks
zhenfeizhang Aug 4, 2023
9da7fc3
update readme
zhenfeizhang Aug 4, 2023
80861c1
constraint hash input length
zhenfeizhang Aug 4, 2023
aa79cc8
fix clippy
zhenfeizhang Aug 4, 2023
2 changes: 1 addition & 1 deletion aggregator/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -25,7 +25,7 @@ snark-verifier-sdk = { git = "https://github.com/scroll-tech/snark-verifier", br


[features]
default = [ ]
default = [ ]
print-trace = [ "ark-std/print-trace" ]
# This feature is useful for unit tests where we check the SAT of pi aggregation circuit
disable_proof_aggregation = []
141 changes: 26 additions & 115 deletions aggregator/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,64 +2,6 @@ Proof Aggregation
-----

![Architecture](./figures/architecture.jpg)
<!--
This repo does proof aggregations for zkEVM proofs.

## zkEVM circuit
A zkEVM circuit generates a ZK proof for a chunk of blocks. It takes 64 field elements as its public input, consisting of
- chunk's data hash digest: each byte is encoded in an Fr element
- chunk's public input hash digest: each byte is encoded in an Fr element
The total size for a public input is 64 bytes, encoded in 64 Fr elements

For the ease of testing, this repo implements a `MockCircuit` which has the same public input APIs as a zkEVM circuit.

## First compression circuit
The first compression circuit takes in a fresh snark proof and generates a new (potentially small) snark proof.
The public inputs to the new snark proof consists of
- 12 elements from the accumulators
- an accumulator consists of 2 G1 elements, which are the left and right inputs to the pairing
- this is treated as 4 Fq elements, each decomposed into 3 limbs and encoded in Fr
- 64 elements from previous snark
- re-expose the same public inputs as the original snark

The first compression circuit is configured [wide config file](./configs/compression_wide.config).

## Second compression circuit

The second compression circuit takes in a compressed snark proof and generates a new (potentially small) snark proof.
The public inputs to the new snark proof consists of
- 12 elements from the accumulators
- an accumulator consists of 2 G1 elements, which are the left and right inputs to the pairing
- this is treated as 4 Fq elements, each decomposed into 3 limbs and encoded in Fr
- accumulator from the previous snark is accumulated into the current accumulator
- 64 elements from previous snark
- skipping the first 12 elements which are previous accumulator, as they are already accumulated
- re-expose the rest 64 field elements as the public inputs

The second compression circuit is configured [thin config file](./configs/compression_thin.config).

## Aggregation circuit
An aggregation circuit takes in a batch of `k` proofs, each for a chunk of blocks.
It generates a single proof asserting the validity of all the proofs.

It also performs public input aggregation, i.e., reducing the `64k` public elements into a fixed number of `144` elements:
- 12 elements from accumulators, which accumulates all the previous `k` accumulators from each snark
- 132 elements from the hashes
- first_chunk_prev_state_root: 32 Field elements
- last_chunk_post_state_root: 32 Field elements
- last_chunk_withdraw_root: 32 Field elements
- batch_public_input_hash: 32 Field elements
- chain_id: 8 Field elements

In addition, it attests that, for chunks indexed from `0` to `k-1`,
- batch_data_hash := keccak(chunk_0.data_hash || ... || chunk_k-1.data_hash) where chunk_i.data_hash is a public input to the i-th batch snark circuit
- chunk_pi_hash := keccak(chain_id || prev_state_root || post_state_root || withdraw_root || chunk_data_hash) where chunk_data_hash is a public input to the i-th batch snark circuit
- and the related field matches public input

See [public input aggregation](./src/proof_aggregation/public_input_aggregation.rs) for the details of public input aggregation. -->

<!-- # Spec for Dynamic aggregator -->

# Params
|param|meaning |
|:---:|:---|
Expand Down Expand Up @@ -93,17 +35,17 @@ c_i.post_state_root == c_{i+1}.prev_state_root
```
for $i \in [1, k-1]$.

## Empty chunk
An __empty chunk__ is a chunk that does not contain any transactions. It is used for padding.
If $k< n$, $(n-k)$ empty chunks are padded to the list. An empty chunk has the same data fields as a real chunk, and the parameters are set as
- state root before this chunk: `c_k.post_state_root`
- state root after this chunk: `c_k.post_state_root`
- the withdraw root of this chunk: `c_k.withdraw_root`
- the data hash of this chunk: `keccak("")`
## Padded chunk
A __padded chunk__ is a chunk that repeats the last valid chunk. It is used for padding.
If $k< n$, $(n-k)$ padded chunks are appended to the list. A padded chunk has the same data fields as the last real chunk, and the parameters are set as
- state root before this chunk: `c_{k}.prev_state_root`
- state root after this chunk: `c_{k}.post_state_root`
- the withdraw root of this chunk: `c_{k}.withdraw_root`
- the data hash of this chunk: `c_{k}.data_hash`
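
The padding rule above can be sketched as follows. This is a minimal illustration rather than the aggregator's actual code: `ChunkInfo` is a hypothetical, simplified stand-in for the crate's `ChunkHash`, and `pad_chunks` is a made-up helper name. Only the `is_padding` flag is taken from the diff in this PR.

```rust
/// Hypothetical, simplified stand-in for the aggregator's `ChunkHash`.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct ChunkInfo {
    pub chain_id: u64,
    pub prev_state_root: [u8; 32],
    pub post_state_root: [u8; 32],
    pub withdraw_root: [u8; 32],
    pub data_hash: [u8; 32],
    pub is_padding: bool,
}

/// Pads `chunks` (the `k` valid chunks) up to `n` entries by repeating the
/// last valid chunk with `is_padding` set.
/// Returns `None` if there is no valid chunk or if `k > n`.
pub fn pad_chunks(chunks: &[ChunkInfo], n: usize) -> Option<Vec<ChunkInfo>> {
    let last = chunks.last()?;
    if chunks.len() > n {
        return None;
    }
    // A padded chunk copies every data field of the last valid chunk.
    let mut filler = last.clone();
    filler.is_padding = true;

    let mut padded = chunks.to_vec();
    padded.resize(n, filler);
    Some(padded)
}
```

Because a padded chunk copies every field of `c_k`, its public input hash preimage is the same as that of the last valid chunk, which is what lets the aggregator reuse the last valid snark for the padded slots.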

## Batch

A __batch__ consists of continuous chunks of size `n`. If the input chunks' size `k` is less than `n`, we pad the input with `(n-k)` empty chunks using the above logic.
A __batch__ consists of `k` continuous chunks. If `k` is less than `n`, we pad the input with `(n-k)` chunks identical to `chunk[k]`.

# Circuits

Expand All @@ -114,20 +56,16 @@ Circuit proving the relationship for a chunk is indeed the zkEVM circuit. It wil
- 12 from accumulators
- 32 from public input hash

## Empty chunk circuit
An empty chunk circuit also takes 44 elements as public inputs.
In our design it is crucial that __the same circuit__ is used for both real chunk circuit and empty chunk circuit. In other words, an empty chunk circuit will also go through the same compressions before it is aggregated.


![Architecture](./figures/hashes.jpg)

## Aggregation Circuit

We want to aggregate `k` snarks, each from a valid chunk. We generate `(n-k)` empty chunks, and obtain a total of `n` snarks.
We want to aggregate `k` snarks, each from a valid chunk. We generate `(n-k)` padded chunks, and obtain a total of `n` snarks.

In the above example, we have `k = 2` valid chunks, and `2` empty chunks.
In the above example, we have `k = 2` valid chunks, and `2` padded chunks.

> Interlude: we just need to generate 1 empty snark, and the rest `n-k-1` will be identical for the same batch. We cannot pre-compute it though, as the witness `c_k.post_state_root` and `c_k.withdraw_root` are batch dependent.
The padded snarks are identical to the last valid snark, so the aggregator does not need to generate snarks for padded chunks.

### Configuration

Expand All @@ -140,7 +78,6 @@ There will be three configurations for Aggregation circuit.
The public input of the aggregation circuit consists of
- 12 elements from accumulator
- 32 elements of `batch_pi_hash`
- 1 element of `k`

### Statements
For snarks $s_1,\dots,s_k,\dots, s_n$ the aggregation circuit argues the following statements.
Expand All @@ -162,9 +99,9 @@ for i in 1 ... __n__

This is done by computing the RLCs of chunk[i]'s data_hash for `i=0..k`, and then checking that the RLC matches the one from the keccak table.

4. chunks are continuous: they are linked via the state roots. __Static__.
4. chunks are continuous when they are not padded: they are linked via the state roots.

for i in 1 ... __n-1__
for i in 1 ... __k-1__
```
c_i.post_state_root == c_{i+1}.prev_state_root
```
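
As an out-of-circuit illustration of the relaxed statement 4, the check below only links the first `k` chunks. It is a sketch under assumptions: `ChunkRoots` and `valid_chunks_are_continuous` are hypothetical names, and the real constraint is enforced on circuit cells rather than on byte arrays.

```rust
/// Hypothetical view of a chunk that keeps only the two state roots.
#[derive(Clone, Debug)]
pub struct ChunkRoots {
    pub prev_state_root: [u8; 32],
    pub post_state_root: [u8; 32],
}

/// Checks `c_i.post_state_root == c_{i+1}.prev_state_root` for the first `k`
/// (valid) chunks only. Padded chunks after position `k` are not linked.
pub fn valid_chunks_are_continuous(chunks: &[ChunkRoots], k: usize) -> bool {
    if k == 0 || k > chunks.len() {
        return false;
    }
    chunks[..k]
        .windows(2)
        .all(|pair| pair[0].post_state_root == pair[1].prev_state_root)
}
```

The range stops at `k-1` because a padded chunk repeats `c_k`, including `c_k.prev_state_root`, so in general `c_k.post_state_root` does not equal the `prev_state_root` of the chunk that follows it.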
Expand All @@ -175,17 +112,18 @@ for i in 1 ... __n__
batch.chain_id == chunk[i].chain_id
```

6. The last `(n-k)` chunk[i]'s prev_state_root == post_state_root when chunk[i] is padded
6. The last `(n-k)` chunk[i] are padding
```
for i in 1 ... n:
is_padding = (i > k) // k is a public input
if is_padding:
chunk_i.prev_state_root == chunk_i.post_state_root
chunk_i.withdraw_root == chunk_{i-1}.withdraw_root
chunk_i.data_hash == [0u8; 32]
chunk[i]'s chunk_pi_hash_rlc_cells == chunk[i-1].chunk_pi_hash_rlc_cells
```
7. chunk[i]'s data_hash len is `0` when chunk[i] is padded

This is done via comparing the `data_rlc` of `chunk_{i-1}` and ` chunk_{i}`.
7. the hash input lengths are correct
- first MAX_AGG_SNARKS + 1 hashes all have 136 bytes input
- batch's data_hash length is 32 * number_of_valid_snarks
8. batch data hash is correct w.r.t. its RLCs
9. is_final_cells are set correctly
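
The 136-byte figure in statement 7 follows from the chunk public input hash preimage `chain_id || prev_state_root || post_state_root || withdraw_root || chunk_data_hash`, that is, 8 + 4 * 32 bytes. The sketch below only assembles that preimage and computes the expected lengths. The function names are hypothetical, and serializing `chain_id` as 8 big-endian bytes is an assumption made for illustration.

```rust
/// Byte widths of the fields in a chunk's public input hash preimage.
pub const CHAIN_ID_LEN: usize = 8;
pub const ROOT_LEN: usize = 32;

/// Builds the preimage of `chunk_pi_hash` (hypothetical helper).
pub fn chunk_pi_hash_preimage(
    chain_id: u64,
    prev_state_root: &[u8; 32],
    post_state_root: &[u8; 32],
    withdraw_root: &[u8; 32],
    data_hash: &[u8; 32],
) -> Vec<u8> {
    let mut preimage = Vec::with_capacity(CHAIN_ID_LEN + 4 * ROOT_LEN);
    // Assumption: chain_id is serialized as 8 big-endian bytes.
    preimage.extend_from_slice(&chain_id.to_be_bytes());
    preimage.extend_from_slice(prev_state_root);
    preimage.extend_from_slice(post_state_root);
    preimage.extend_from_slice(withdraw_root);
    preimage.extend_from_slice(data_hash);
    preimage
}

/// Expected input length of the batch data hash: one 32-byte chunk data hash
/// per valid snark.
pub fn batch_data_hash_input_len(number_of_valid_snarks: usize) -> usize {
    ROOT_LEN * number_of_valid_snarks
}
```

Only the batch data hash has a length that depends on the number of valid snarks; the other `MAX_AGG_SNARKS + 1` hashes have a fixed input length, so their length cells can be constrained against a constant.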

### Handling dynamic inputs

Expand All @@ -201,11 +139,13 @@ Suppose we target for `MAX_AGG_SNARK = 10`. Then, the last hash function will ta
We also know in the circuit if a chunk is an empty one or not. This is given by a flag `is_padding`.

For the input of the final data hash
- we extract `32 * MAX_AGG_SNARK` number of cells (__static__ here) from the last hash. We then compute the RLC of those `32 * MAX_AGG_SNARK` when the corresponding `is_padding` is not set. We constraint this RLC matches the `data_rlc` from the keccak table.

- we extract `32 * MAX_AGG_SNARK` number of cells (__static__ here) from the last hash. We then compute the RLC of those `32 * MAX_AGG_SNARK` when the corresponding `is_padding` is not set. We constrain this RLC matches the `data_rlc` from the keccak table.

For the output of the final data hash
- we extract all three hash digest cells from the last 3 rounds. We then constrain that the actual data hash matches one of the three hash digest cells with proper flags defined as follows.
- if num_of_valid_snarks <= 4, only 1 keccak-f round is needed, so the batch's data hash (input, len, data_rlc, output_rlc) is in the first 300 keccak rows;
- else if num_of_valid_snarks <= 8, 2 keccak-f rounds are needed, so the batch's data hash (input, len, data_rlc, output_rlc) is in the 2nd 300 keccak rows;
- else num_of_valid_snarks <= 12, 3 keccak-f rounds are needed, so the batch's data hash (input, len, data_rlc, output_rlc) is in the 3rd 300 keccak rows;

|#valid snarks | offset of data hash | flags|
|---| ---| ---|
Expand All @@ -214,33 +154,4 @@ For the output of the final data hash
|9,10 | 64 | 0, 0, 1|
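
The round thresholds above come from Keccak-256 absorbing 136 bytes per keccak-f round, with padding always adding at least one byte. A small sketch of that arithmetic follows. The helper names are hypothetical, the row window is counted from the first row of the batch data hash, and the one-hot flag layout is an assumption that is only checked against the `9,10` row shown in the table.

```rust
pub const KECCAK_RATE_BYTES: usize = 136;
pub const DIGEST_LEN: usize = 32;
pub const ROWS_PER_ROUND: usize = 300;

/// Number of keccak-f rounds needed to hash `num_valid_snarks` 32-byte
/// digests. Padding adds at least one byte, hence the `+ 1`.
pub fn keccak_f_rounds(num_valid_snarks: usize) -> usize {
    num_valid_snarks * DIGEST_LEN / KECCAK_RATE_BYTES + 1
}

/// Rows (relative to the start of the batch data hash) that hold the final
/// round, assuming 300 keccak rows per round.
pub fn data_hash_row_window(num_valid_snarks: usize) -> std::ops::Range<usize> {
    let round = keccak_f_rounds(num_valid_snarks);
    (round - 1) * ROWS_PER_ROUND..round * ROWS_PER_ROUND
}

/// One-hot flags selecting which of the three candidate digests is the real
/// batch data hash. Returns `None` when the input needs more than 3 rounds.
pub fn round_flags(num_valid_snarks: usize) -> Option<[bool; 3]> {
    let round = keccak_f_rounds(num_valid_snarks);
    if num_valid_snarks == 0 || round > 3 {
        return None;
    }
    let mut flags = [false; 3];
    flags[round - 1] = true;
    Some(flags)
}
```

For example, 4 valid snarks give 128 input bytes and fit in one round, while 5 valid snarks give 160 bytes and spill into a second round.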

Additional checks for dummy chunk
- if `is_padding` for `i`-th chunk, we constrain `chunk[i].prev_state_root = chunk[i].post_state_root`
- if `is_padding` for `i`-th chunk, we constrain `chunk[i-1].withdraw_root = chunk[i].withdraw_root`
- if `is_padding` for `i`-th chunk, we constrain `chunk[i-1].data_hash.len() == 0`

<!--
1. Extact the final `data_rlc` cell from each round. There are maximum $t$ of this, denoted by $r_1,\dots r_t$
- __caveat__: will need to make sure the circuit is padded as if there are $t$ rounds, if the actual number of rounds is less than $t$. This is done by keccak table already:
all columns of keccak table are padded to `1<<LOG_DEGREE` by construction (__need to double check this is circuit dependent__)
2. Extract a challenge and then compute `rlc:= RLC(chunk_1.data_hash || ... || chunk_k.data_hash)` using a __phase 2__ column
3. assert `rlc` is valid via a lookup argument
- constrain `rlc` cell is within the "data_rlc" column of keccak table via standard lookup API
- potential optimization: avoid using lookup API. There is only $t$ elements as $rlc \in \{r_1,\dots r_t\}$ and we may check equality one by one.
-->

<!--
Circuit witnesses:
- a list of k __real__ CHUNKs, each with 44 elements of public inputs (12 from accumulators and
32 from public input hash)
-
- Those 4 hashes are obtained from the caller.
- It's public input hash is
- chunk_pi_hash := keccak(chain_id || prev_state_root || post_state_root || withdraw_root ||
chunk_data_hash)
Circuit public inputs:
- an accumulator of 12 elements
- a batch public input hash of 32 elements
- the value k, 1 element

The aggregation circuit aggregates MAX_AGG_NUM snarks.
If k < MAX_AGG_NUM, dummy snarks will be padded -->
- if `is_padding` for `i`-th chunk, we constrain `chunk[i]'s chunk_pi_hash_rlc_cells == chunk[i-1].chunk_pi_hash_rlc_cells`
Binary file modified aggregator/figures/architecture.jpg
Binary file modified aggregator/figures/hashes.jpg
18 changes: 12 additions & 6 deletions aggregator/src/aggregation/circuit.rs
Original file line number Diff line number Diff line change
Expand Up @@ -249,7 +249,7 @@ impl Circuit<Fr> for AggregationCircuit {

let timer = start_timer!(|| "load aux table");

let (hash_digest_cells, num_valid_snarks) = {
let hash_digest_cells = {
config
.keccak_circuit_config
.load_aux_tables(&mut layouter)?;
Expand All @@ -269,16 +269,22 @@ impl Circuit<Fr> for AggregationCircuit {
end_timer!(timer);

let timer = start_timer!(|| ("assign hash cells").to_string());
let (hash_digest_cells, num_valid_snarks) = assign_batch_hashes(
let chunks_are_valid = self
.batch_hash
.chunks_with_padding
.iter()
.map(|chunk| !chunk.is_padding)
.collect::<Vec<_>>();
let hash_digest_cells = assign_batch_hashes(
&config,
&mut layouter,
challenges,
&chunks_are_valid,
&preimages,
self.batch_hash.number_of_valid_chunks,
)
.map_err(|_e| Error::ConstraintSystemFailure)?;
end_timer!(timer);
(hash_digest_cells, num_valid_snarks)
hash_digest_cells
};
// digests
let (batch_pi_hash_digest, chunk_pi_hash_digests, _potential_batch_data_hash_digest) =
Expand Down Expand Up @@ -331,6 +337,8 @@ impl Circuit<Fr> for AggregationCircuit {
);

region.constrain_equal(
// in the keccak table, the input and output data have different
// endianness
chunk_pi_hash_digests[i][j * 8 + k].cell(),
snark_inputs[i * DIGEST_LEN + (3 - j) * 8 + k].cell(),
)?;
Expand Down Expand Up @@ -371,8 +379,6 @@ impl Circuit<Fr> for AggregationCircuit {
}
}

log::trace!("number of valid snarks: {:?}", num_valid_snarks.value());

end_timer!(witness_time);
Ok(())
}
Expand Down
2 changes: 2 additions & 0 deletions aggregator/src/aggregation/config.rs
Original file line number Diff line number Diff line change
Expand Up @@ -101,6 +101,8 @@ impl AggregationConfig {
meta.enable_equality(keccak_circuit_config.keccak_table.input_rlc);
// enable equality for the input data len column
meta.enable_equality(keccak_circuit_config.keccak_table.input_len);
// enable equality for the is_final column
meta.enable_equality(keccak_circuit_config.keccak_table.is_final);

// Instance column stores public input column
// - the accumulator
Expand Down
2 changes: 1 addition & 1 deletion aggregator/src/aggregation/rlc/config.rs
Original file line number Diff line number Diff line change
Expand Up @@ -64,7 +64,7 @@ impl RlcConfig {
let q2 = meta.query_selector(enable_challenge);
let cs2 = q2 * (a - challenge_expr.keccak_input());

vec![cs1 + cs2]
vec![cs1, cs2]
});
Self {
#[cfg(test)]
Expand Down
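
One plausible reading of the `vec![cs1 + cs2]` to `vec![cs1, cs2]` change in `rlc/config.rs`: a gate that returns the sum of two expressions only enforces that the sum is zero, so two nonzero terms can cancel whenever both are active on the same row. The toy example below shows the cancellation over a small prime field. The modulus `97` and the function names are illustrative assumptions; the circuit itself works over `Fr`.

```rust
/// Toy prime modulus, chosen only for illustration.
pub const P: u64 = 97;

/// A single gate that constrains `cs1 + cs2 == 0 (mod P)`.
pub fn summed_gate_holds(cs1: u64, cs2: u64) -> bool {
    (cs1 % P + cs2 % P) % P == 0
}

/// Two gates that constrain `cs1 == 0` and `cs2 == 0` independently.
pub fn separate_gates_hold(cs1: u64, cs2: u64) -> bool {
    cs1 % P == 0 && cs2 % P == 0
}
```

With `cs1 = 5` and `cs2 = 92`, the summed gate is satisfied even though neither term is zero, while the separate gates reject the assignment. Returning the two expressions separately removes any dependence on the selectors never being enabled together.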