Integration test #368
We need to resolve #245 to support writes to X0.
Fixed a bug: missing padding in the program table witness. It seems unrelated, but it happened to be triggered by changes in this PR. @kunxian-xia Ready again.
The new changes look good to me!
+ assert!(DVRAM::max_len().is_power_of_two());
- let mut final_table =
-     RowMajorMatrix::<F>::new(final_mem.len().next_power_of_two(), num_witness);
+ let mut final_table = RowMajorMatrix::<F>::new(DVRAM::max_len(), num_witness);
Hi @naure, could you elaborate on how the new design achieves non-uniform overhead? Previously, the memory allocated in `RowMajorMatrix` was dynamic and covered `final_ram`, whose size I assumed could vary per proof according to the memory accessed. It's not obvious to me how that works in the new design.
This type is not used in the example program, so this change was maybe not necessary. We can revert it if that’s incorrect.
@kunxian-xia Which version do you think is right?
The new version looks incorrect to me. The original design is the key to our non-uniform memory overhead for the RAM holding the heap/stack.
Originally I thought the design was kept somewhere else, which is why I brought up this question.
(I was also a bit surprised that this type is not used in the Fibonacci program 😮. I will check in detail how Fibonacci works in this PR.)
I think there was some minor bug in it but that was maybe not the right fix. Let me revert / fix this in a new PR.
The fibonacci program uses only a few predictable static addresses, so it works perfectly with `NonVolatileRamCircuit` and its `addr: Fixed`.
Revert: #601
I think the old design is wrong, as it should not create the `final_table` based on the length of `final_mem`. For example, if the addresses in `final_mem` are `vec![4, 16, 20, 128, 168]`, then the old assignment will create a `final_table` with only `5.next_power_of_two() = 8` elements, while the correct `final_table` should have `168.next_power_of_two() = 256` elements.
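To make the two sizing rules in this comment concrete, here is a minimal sketch that compares them on the same address list. The function names are hypothetical and are not taken from the codebase; the sketch only reproduces the arithmetic, not the real witness assignment.

```rust
/// Length-based rule: round the number of records up to a power of two.
/// (Hypothetical helper, mirrors `final_mem.len().next_power_of_two()`.)
fn size_by_len(addrs: &[u32]) -> usize {
    addrs.len().next_power_of_two()
}

/// Address-based rule: round the largest address up to a power of two.
/// (Hypothetical helper, mirrors the `168.next_power_of_two()` argument.)
fn size_by_max_addr(addrs: &[u32]) -> usize {
    addrs
        .iter()
        .copied()
        .max()
        .map_or(1, |max| (max as usize).next_power_of_two())
}

fn main() {
    let addrs = [4u32, 16, 20, 128, 168];
    println!("by length:      {}", size_by_len(&addrs));
    println!("by max address: {}", size_by_max_addr(&addrs));
}
```

The gap between the two results is the crux of the disagreement: the length-based rule is only safe when the records are dense.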
Old design:
- `final_mem` can start at an arbitrary offset address.
- The addresses in `final_mem` are dense, and the last one is the max accessed address.

For example, if the offset address is `0x12340000` and the max access is `0x12340018`, then the addresses in `final_mem` should be `[0x12340000, 0x12340004, 0x12340008, 0x1234000c, 0x12340010, 0x12340014, 0x12340018]`. In this case, the final table size will be `7.next_power_of_two() = 8`.
@kunxian-xia does the explanation make sense to you? :)
Previously I didn't have clear documentation on this, which caused the misunderstanding, and at that moment the emulator memory design was not settled yet. Sorry for that.
I will review #601, and let's settle the design there.
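As a sketch of the dense layout described above, the following enumerates word-aligned addresses from an offset up to the max accessed address and derives the table size from the record count. `WORD_SIZE` and `dense_addrs` are illustrative names assumed for this example, not identifiers from the PR.

```rust
/// Assumed word size in bytes (the example addresses step by 4).
const WORD_SIZE: usize = 4;

/// Every word-aligned address from `offset` to `max_access`, inclusive.
/// Hypothetical helper for illustration only.
fn dense_addrs(offset: u32, max_access: u32) -> Vec<u32> {
    (offset..=max_access).step_by(WORD_SIZE).collect()
}

fn main() {
    let addrs = dense_addrs(0x1234_0000, 0x1234_0018);
    // Because the records are dense, the record count alone fixes the table size.
    let table_len = addrs.len().next_power_of_two();
    println!("{} records -> {} rows", addrs.len(), table_len);
}
```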
Yeah, the above explanation makes sense to me. The main issue is that we need to insert untouched memory records into `final_mem` manually. As in your example, if the actually touched addresses are `vec![0x12340000, 0x12340008, 0x12340018]`, then we need to insert `vec![0x12340004, 0x1234000c, 0x12340010, 0x12340014]` into `final_mem` manually. Is this what you have in mind?
> The main issue we have is we need to insert untouched memory

Yes, exactly. As I explained previously, we should be able to handle non-consecutive memory accesses in an imperfect but good-enough version.
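The padding step discussed in this thread could look roughly like the sketch below. `MemRecord`, its fields, and `pad_untouched` are hypothetical, and the choice to fill untouched words with value 0 and cycle 0 is an assumption made for illustration rather than something stated in the PR.

```rust
use std::collections::BTreeMap;

/// Hypothetical final-memory record; the real type in the codebase may differ.
#[derive(Debug, Clone, PartialEq)]
struct MemRecord {
    addr: u32,
    value: u32,
    cycle: u64,
}

/// Returns a dense list of records from `offset` up to the highest touched
/// address, inserting a default record for every untouched word.
fn pad_untouched(touched: &[MemRecord], offset: u32) -> Vec<MemRecord> {
    let by_addr: BTreeMap<u32, &MemRecord> =
        touched.iter().map(|r| (r.addr, r)).collect();
    let Some(&max_addr) = by_addr.keys().next_back() else {
        return Vec::new();
    };
    (offset..=max_addr)
        .step_by(4) // assumed 4-byte words
        .map(|addr| match by_addr.get(&addr) {
            Some(record) => (*record).clone(),
            // Assumption: untouched memory reads as zero and was never written.
            None => MemRecord { addr, value: 0, cycle: 0 },
        })
        .collect()
}

fn main() {
    let touched: Vec<MemRecord> = [0x1234_0000u32, 0x1234_0008, 0x1234_0018]
        .iter()
        .map(|&addr| MemRecord { addr, value: 1, cycle: 7 })
        .collect();
    let dense = pad_untouched(&touched, 0x1234_0000);
    println!("{} touched -> {} dense records", touched.len(), dense.len());
}
```

With the dense list in hand, sizing the table by `len().next_power_of_two()` covers every address in the range.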
.map(|rs2| rs2.value);

let final_access = vm.tracer().final_accesses();
let end_cycle: u32 = vm.tracer().cycle().try_into().unwrap();
If `PublicValues` needs a `u32` for `end_cycle`, then `fn cycle` should probably return a `u32`?
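One way to picture the trade-off raised here is the sketch below. The `Tracer` struct is a hypothetical stand-in, and the assumption that the counter is stored as a `u64` is mine; the real return type of `cycle()` is not shown in this conversation.

```rust
use std::num::TryFromIntError;

/// Hypothetical stand-in for the emulator's tracer, not the real type.
struct Tracer {
    cycle: u64, // assumed width, for illustration only
}

impl Tracer {
    /// Wide counter: every caller that needs a `u32` must convert.
    fn cycle(&self) -> u64 {
        self.cycle
    }

    /// Narrow accessor: the conversion and its failure mode live in one place.
    fn cycle_u32(&self) -> Result<u32, TryFromIntError> {
        u32::try_from(self.cycle)
    }
}

fn main() {
    let tracer = Tracer { cycle: 11_538_921 };
    println!("raw cycle = {}", tracer.cycle());
    let end_cycle = tracer.cycle_u32().expect("cycle count exceeds u32");
    println!("end_cycle = {end_cycle}");
}
```

Returning a `Result` (or a `u32` directly) keeps the overflow case explicit instead of hiding it behind an `unwrap()` at each call site.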
.assign_table_circuit::<ExampleProgramTableCircuit<E>>(&zkvm_cs, &prog_config, vm.program())
.unwrap();

if std::env::var("MOCK_PROVING").is_ok() {
`MOCK_PROVING=no ./run_my_programming` would also result in this branch being taken? This might be a bit surprising.
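A possible way to avoid that surprise is to inspect the value of the variable rather than only its presence. The sketch below is one option, not what the PR does; `is_truthy` and the set of accepted spellings are assumptions chosen for illustration.

```rust
use std::env;

/// Interprets an optional environment value as a boolean flag.
/// Hypothetical helper; the accepted spellings are an arbitrary choice.
fn is_truthy(value: Option<&str>) -> bool {
    matches!(
        value.map(|v| v.trim().to_ascii_lowercase()).as_deref(),
        Some("1" | "true" | "yes" | "on")
    )
}

fn main() {
    // `env::var(..).is_ok()` is true for any set value, including "no" or "0".
    let raw = env::var("MOCK_PROVING").ok();
    if is_truthy(raw.as_deref()) {
        println!("mock proving enabled");
    } else {
        println!("mock proving disabled");
    }
}
```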
- Revert #368 for `DynVolatileRamTableConfig`.
- Add padding of `final_cycle`.

Co-authored-by: Aurélien Nicolas <[email protected]>
Summary
This PR aims to add an integration test in which we prove the `fibonacci` Rust program. The goal will be achieved in two steps:

This PR currently builds on top of the changes to `ceno_emul` in #487.

naure:
Summary of changes
- […] `MmuConfig` (Memory Management Unit). Meanwhile, `Rv32imConfig` focuses on instruction circuits.
- […] `NonVolatileRamCircuit`) more flexible.
- […] `MemPadder`.
- […] `SetTableSpec` to reflect that the address parameters are not relevant in the case `FixedAddr`.
Performance
By running the command `cargo run --example fibonacci_elf --release`, we get an e2e proving time of 48.5s [1] for 11538921 execution steps. This is equivalent to 237.9 kHz.

Footnotes
[1] Configuration of the benchmark machine: AMD EPYC 9R14 (32 threads) + 64GB memory
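The throughput figure in the Performance section follows directly from the two reported numbers; the short sketch below just reproduces that division. `throughput_khz` is a name made up for this example.

```rust
/// Execution steps proven per second, expressed in kHz.
/// Hypothetical helper for checking the reported figure.
fn throughput_khz(steps: u64, seconds: f64) -> f64 {
    steps as f64 / seconds / 1_000.0
}

fn main() {
    // Figures reported in the PR description.
    let khz = throughput_khz(11_538_921, 48.5);
    println!("{khz:.1} kHz");
}
```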