From 6307c19ac1a22cbec154fb7cbb584ec51ec965b6 Mon Sep 17 00:00:00 2001
From: Hamy Ratoanina <hamy.ratoanina@toposware.com>
Date: Thu, 9 Nov 2023 14:27:01 -0500
Subject: [PATCH 1/3] Add doc file

---
 evm/src/cpu/docs/memory-handling.md | 31 +++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)
 create mode 100644 evm/src/cpu/docs/memory-handling.md

diff --git a/evm/src/cpu/docs/memory-handling.md b/evm/src/cpu/docs/memory-handling.md
new file mode 100644
index 0000000000..fa37b4dd72
--- /dev/null
+++ b/evm/src/cpu/docs/memory-handling.md
@@ -0,0 +1,31 @@
+# Memory structure
+The CPU interacts with the EVM memory via memory channels. At each CPU row, a memory channel can execute a write, a read, or be disabled. A full memory channel is composed of:
+- 1 `used` flag. If it's set to `true`, a memory operation is executed in this channel at this row. If it's set to `false`, no operation is done but its columns might be reused for other purposes.
+- 1 `is_read` flag. It indicates if a memory operation is a read or a write.
+- 3 address columns. A memory address is made of three parts: `context`, `segment` and `virtual`.
+- 8 value columns. EVM words are 256 bits long, and they are broken down in 8 32-bit limbs.
+The last memory channel is `lv.partial_channel`: it doesn't have its own column values and shares them with the first memory channel `lv.mem_channels[0]`. This allows use to save eight columns.
+
+# Top of the stack
+The majority of memory operations involve the stack. The stack is a segment in memory, and stack operations (popping or pushing) use the memory channels. Every CPU instruction does between 0 and 4 pops, and between 0 and 1 push. However, for efficiency purposes, we hold the top of the stack in `lv.mem_channels[0]`, only writing it in memory if necessary.
+
+## Motivation
+See https://github.com/0xPolygonZero/plonky2/issues/1149.
+
+## Top reading and writing
+When a CPU instruction modifies the stack, it must update the top of the stack accordingly. There are three cases.
+
+- **The instruction pops and pushes:** The new top of the stack is calculated by the instruction and stored in `nv.mem_channels[0]`; it may  be a memory read, and the instruction is responsible for setting `nv.mem_channels[0].used` correctly. The previous top of the stack is discarded and doesn't need to be written in memory.
+
+- **The instruction pushes, but doesn't pop:** The new top of the stack is calculated by the instruction and stored in `nv.mem_channels[0]`; it may  be a memory read, and the instruction is responsible for setting `nv.mem_channels[0].used` correctly. If the stack wasn't empty (`lv.stack_len > 0`), the current top of the stack is stored with a memory read in `lv.partial_channel`, which shares its values with `lv.mem_channels[0]` (which holds the current top of the stack). If the stack was empty, `lv.partial_channel` is disabled.
+
+- **The instruction pops, but doesn't push:** The current top of the stack is discarded. If the stack isn't now empty (`lv.stack_len > num_pops`), the new top of the stack is set in `nv.mem_channels[0]` with a memory read from the stack segment. If the stack is now empty, `nv.mem_channels[0]` is disabled.
+
+In the last two cases, there is an edge case if `lv.stack_len` is equal to a `special_len`: `0` for a strictly pushing instruction, `num_pops` for a strictly popping instruction. Note that we do not need to check for values below `num_pops`, since this would be a stack underflow exception which is handled separately. We detect this edge case with the compound flag `1 - (lv.stack_len - special_len) * stack_inv_aux`, where `stack_inv_aux` is constrained to be the modular inverse of `(lv.stack_len - special_len)` if it's non-zero, or `0` otherwise. The flag is `1` if `stack_len` is equal to `special_len`, and `0` otherwise.
+
+This logic can be found in code in the `eval_packed_one` function of `stack.rs`, which multiplies all of the constraints with a degree 1 filter passed as argument.
+
+## Operation flag merging
+To reduce the total number of columns, many operation flags are merged together (for example DUP and SWAP). If the two instructions have different stack behaviors, this can be a problem: the filter for an individual operation is now of degree 2 (e.g. `lv.op.dup_swap * lv.opcode_bits[4]` for SWAP) and some constraints of `eval_packed_one` are already of degree 3.
+
+When this happens, stack constraints are defined manually in the operation's dedicated file (e.g. `dup_swap.rs`). Implementation details vary and can be found in the files; in general they make use of the `stack_inv_aux` and `stack_inv_aux_2` columns to lower the degree of the constraints.

From 7f16e8cd38cc5fc277a77a36bbad6da58dc286b0 Mon Sep 17 00:00:00 2001
From: Hamy Ratoanina <hamy.ratoanina@toposware.com>
Date: Fri, 10 Nov 2023 16:36:16 -0500
Subject: [PATCH 2/3] Apply comments

---
 evm/src/cpu/docs/memory-handling.md | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/evm/src/cpu/docs/memory-handling.md b/evm/src/cpu/docs/memory-handling.md
index fa37b4dd72..944cc388ef 100644
--- a/evm/src/cpu/docs/memory-handling.md
+++ b/evm/src/cpu/docs/memory-handling.md
@@ -4,10 +4,10 @@ The CPU interacts with the EVM memory via memory channels. At each CPU row, a me
 - 1 `is_read` flag. It indicates if a memory operation is a read or a write.
 - 3 address columns. A memory address is made of three parts: `context`, `segment` and `virtual`.
 - 8 value columns. EVM words are 256 bits long, and they are broken down in 8 32-bit limbs.
-The last memory channel is `lv.partial_channel`: it doesn't have its own column values and shares them with the first memory channel `lv.mem_channels[0]`. This allows use to save eight columns.
+The last memory channel is `current_row.partial_channel`: it doesn't have its own column values and shares them with the first memory channel `current_row.mem_channels[0]`. This allows use to save eight columns.
 
 # Top of the stack
-The majority of memory operations involve the stack. The stack is a segment in memory, and stack operations (popping or pushing) use the memory channels. Every CPU instruction does between 0 and 4 pops, and between 0 and 1 push. However, for efficiency purposes, we hold the top of the stack in `lv.mem_channels[0]`, only writing it in memory if necessary.
+The majority of memory operations involve the stack. The stack is a segment in memory, and stack operations (popping or pushing) use the memory channels. Every CPU instruction performs between 0 and 4 pops, and may push at most once. However, for efficiency purposes, we hold the top of the stack in `current_row.mem_channels[0]`, only writing it in memory if necessary.
 
 ## Motivation
 See https://github.com/0xPolygonZero/plonky2/issues/1149.
@@ -15,17 +15,17 @@ See https://github.com/0xPolygonZero/plonky2/issues/1149.
 ## Top reading and writing
 When a CPU instruction modifies the stack, it must update the top of the stack accordingly. There are three cases.
 
-- **The instruction pops and pushes:** The new top of the stack is calculated by the instruction and stored in `nv.mem_channels[0]`; it may  be a memory read, and the instruction is responsible for setting `nv.mem_channels[0].used` correctly. The previous top of the stack is discarded and doesn't need to be written in memory.
+- **The instruction pops and pushes:** The new top of the stack is stored in `next_row.mem_channels[0]`; it may be computed by the instruction, or it could be read from memory. In either case, the instruction is responsible for setting `next_row.mem_channels[0]`'s flags and address columns correctly. After use, the previous top of the stack is discarded and doesn't need to be written in memory.
 
-- **The instruction pushes, but doesn't pop:** The new top of the stack is calculated by the instruction and stored in `nv.mem_channels[0]`; it may  be a memory read, and the instruction is responsible for setting `nv.mem_channels[0].used` correctly. If the stack wasn't empty (`lv.stack_len > 0`), the current top of the stack is stored with a memory read in `lv.partial_channel`, which shares its values with `lv.mem_channels[0]` (which holds the current top of the stack). If the stack was empty, `lv.partial_channel` is disabled.
+- **The instruction pushes, but doesn't pop:** The new top of the stack is stored in `next_row.mem_channels[0]`; it may be computed by the instruction, or it could be read from memory. In either case, the instruction is responsible for setting `next_row.mem_channels[0]`'s flags and address columns correctly. If the stack wasn't empty (`current_row.stack_len > 0`), the current top of the stack is stored with a memory read in `current_row.partial_channel`, which shares its values with `current_row.mem_channels[0]` (which holds the current top of the stack). If the stack was empty, `current_row.partial_channel` is disabled.
 
-- **The instruction pops, but doesn't push:** The current top of the stack is discarded. If the stack isn't now empty (`lv.stack_len > num_pops`), the new top of the stack is set in `nv.mem_channels[0]` with a memory read from the stack segment. If the stack is now empty, `nv.mem_channels[0]` is disabled.
+- **The instruction pops, but doesn't push:** After use, the current top of the stack is discarded and doesn't need to be written in memory. If the stack isn't now empty (`current_row.stack_len > num_pops`), the new top of the stack is set in `next_row.mem_channels[0]` with a memory read from the stack segment. If the stack is now empty, `next_row.mem_channels[0]` is disabled.
 
-In the last two cases, there is an edge case if `lv.stack_len` is equal to a `special_len`: `0` for a strictly pushing instruction, `num_pops` for a strictly popping instruction. Note that we do not need to check for values below `num_pops`, since this would be a stack underflow exception which is handled separately. We detect this edge case with the compound flag `1 - (lv.stack_len - special_len) * stack_inv_aux`, where `stack_inv_aux` is constrained to be the modular inverse of `(lv.stack_len - special_len)` if it's non-zero, or `0` otherwise. The flag is `1` if `stack_len` is equal to `special_len`, and `0` otherwise.
+In the last two cases, there is an edge case if `current_row.stack_len` is equal to a `special_len`: `0` for a strictly pushing instruction, `num_pops` for a strictly popping instruction. Note that we do not need to check for values below `num_pops`, since this would be a stack underflow exception which is handled separately. We detect this edge case with the compound flag `1 - (current_row.stack_len - special_len) * stack_inv_aux`, where `stack_inv_aux` is constrained to be the modular inverse of `(current_row.stack_len - special_len)` if it's non-zero, or `0` otherwise. The flag is `1` if `stack_len` is equal to `special_len`, and `0` otherwise.
 
 This logic can be found in code in the `eval_packed_one` function of `stack.rs`, which multiplies all of the constraints with a degree 1 filter passed as argument.
 
 ## Operation flag merging
-To reduce the total number of columns, many operation flags are merged together (for example DUP and SWAP). If the two instructions have different stack behaviors, this can be a problem: the filter for an individual operation is now of degree 2 (e.g. `lv.op.dup_swap * lv.opcode_bits[4]` for SWAP) and some constraints of `eval_packed_one` are already of degree 3.
+To reduce the total number of columns, many operation flags are merged together (for example DUP and SWAP). If the two instructions have different stack behaviors, this can be a problem: the filter for an individual operation is now of degree 2 (e.g. `current_row.op.dup_swap * current_row.opcode_bits[4]` for SWAP) and some constraints of `eval_packed_one` are already of degree 3.
 
 When this happens, stack constraints are defined manually in the operation's dedicated file (e.g. `dup_swap.rs`). Implementation details vary and can be found in the files; in general they make use of the `stack_inv_aux` and `stack_inv_aux_2` columns to lower the degree of the constraints.

From c51dfa931c4536bfd3a181dbfbf1e237f43ae9a5 Mon Sep 17 00:00:00 2001
From: Hamy Ratoanina <hamy.ratoanina@toposware.com>
Date: Fri, 10 Nov 2023 17:14:35 -0500
Subject: [PATCH 3/3] Apply comments

---
 evm/src/cpu/docs/memory-handling.md | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/evm/src/cpu/docs/memory-handling.md b/evm/src/cpu/docs/memory-handling.md
index 944cc388ef..eb7b3f6b90 100644
--- a/evm/src/cpu/docs/memory-handling.md
+++ b/evm/src/cpu/docs/memory-handling.md
@@ -21,11 +21,12 @@ When a CPU instruction modifies the stack, it must update the top of the stack a
 
 - **The instruction pops, but doesn't push:** After use, the current top of the stack is discarded and doesn't need to be written in memory. If the stack isn't now empty (`current_row.stack_len > num_pops`), the new top of the stack is set in `next_row.mem_channels[0]` with a memory read from the stack segment. If the stack is now empty, `next_row.mem_channels[0]` is disabled.
 
-In the last two cases, there is an edge case if `current_row.stack_len` is equal to a `special_len`: `0` for a strictly pushing instruction, `num_pops` for a strictly popping instruction. Note that we do not need to check for values below `num_pops`, since this would be a stack underflow exception which is handled separately. We detect this edge case with the compound flag `1 - (current_row.stack_len - special_len) * stack_inv_aux`, where `stack_inv_aux` is constrained to be the modular inverse of `(current_row.stack_len - special_len)` if it's non-zero, or `0` otherwise. The flag is `1` if `stack_len` is equal to `special_len`, and `0` otherwise.
+In the last two cases, there is an edge case if `current_row.stack_len` is equal to a `special_len`. For a strictly pushing instruction, this happens if the stack is empty, and `special_len = 0`. For a strictly popping instruction, this happens if the next stack is empty, i.e. that all remaining elements are popped, and `special_len = num_pops`. Note that we do not need to check for values below `num_pops`, since this would be a stack underflow exception which is handled separately.
+The edge case is detected with the compound flag `1 - not_special_len * stack_inv_aux`, where `not_special_len = current_row - special_len` and `stack_inv_aux` is constrained to be the modular inverse of `is_not_special_len` if it's non-zero, or `0` otherwise. The flag is `1` if `stack_len` is equal to `special_len`, and `0` otherwise.
 
 This logic can be found in code in the `eval_packed_one` function of `stack.rs`, which multiplies all of the constraints with a degree 1 filter passed as argument.
 
 ## Operation flag merging
-To reduce the total number of columns, many operation flags are merged together (for example DUP and SWAP). If the two instructions have different stack behaviors, this can be a problem: the filter for an individual operation is now of degree 2 (e.g. `current_row.op.dup_swap * current_row.opcode_bits[4]` for SWAP) and some constraints of `eval_packed_one` are already of degree 3.
+To reduce the total number of columns, many operation flags are merged together (e.g. DUP and SWAP) and are distinguished with the binary decomposition of their opcodes. The filter for a merged operation is now of degree 2: for example, `is_swap = current_row.op.dup_swap * current_row.opcode_bits[4]` since the 4th bit is set to 1 for a SWAP and 0 for a DUP. If the two instructions have different stack behaviors, this can be a problem: `eval_packed_one`'s degrees are already of degree 3 and it can't support degree 2 filters.
 
-When this happens, stack constraints are defined manually in the operation's dedicated file (e.g. `dup_swap.rs`). Implementation details vary and can be found in the files; in general they make use of the `stack_inv_aux` and `stack_inv_aux_2` columns to lower the degree of the constraints.
+When this happens, stack constraints are defined manually in the operation's dedicated file (e.g. `dup_swap.rs`). Implementation details vary case-by-case and can be found in the files.