test(benchmark): add baby bear poseidon2 benchmark #519

Merged
merged 4 commits into from
Aug 14, 2024
6 changes: 5 additions & 1 deletion Cargo.Bazel.lock
@@ -1,5 +1,5 @@
{
"checksum": "d9a69b7b7c2e8f7447076811098173ebe669c22af14223e2e9dfeb933bf91d26",
"checksum": "5d4563b52a4fb67e240f27d8b902c760a53d92fd0b9afc3a6e6c6b6f33bb5236",
"crates": {
"addchain 0.2.0": {
"name": "addchain",
@@ -13491,6 +13491,10 @@
"id": "ff 0.13.0",
"target": "ff"
},
{
"id": "p3-baby-bear 0.1.3-succinct",
"target": "p3_baby_bear"
},
{
"id": "p3-bn254-fr 0.1.3-succinct",
"target": "p3_bn254_fr"
1 change: 1 addition & 0 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

3 changes: 3 additions & 0 deletions benchmark/poseidon2/BUILD.bazel
@@ -41,7 +41,10 @@ tachyon_cc_binary(
"//benchmark/poseidon:simple_poseidon_benchmark_reporter",
"//benchmark/poseidon2/horizen",
"//benchmark/poseidon2/plonky3",
"//tachyon/base/containers:contains",
"//tachyon/c/math/elliptic_curves/bn/bn254:fr",
"//tachyon/c/math/finite_fields/baby_bear",
"//tachyon/math/elliptic_curves/bn/bn254:poseidon2",
"//tachyon/math/finite_fields/baby_bear:poseidon2",
],
)
152 changes: 122 additions & 30 deletions benchmark/poseidon2/README.md
@@ -15,42 +15,134 @@ CPU Caches:
L2 Unified 4096 KiB (x12)
```

Note that each test runs the Poseidon2 permutation 100 times, because a single iteration finishes too quickly to be timed reliably.
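
The repeated-iteration timing described in this note can be sketched as follows. This is a minimal illustration, not the benchmark's actual code: `time_repeated` and `toy_permutation` are hypothetical names, and the toy function is only a stand-in for a real permutation.

```rust
use std::time::{Duration, Instant};

/// Applies `step` to `state` `iterations` times and returns the final state
/// together with the total elapsed wall-clock time for all iterations.
fn time_repeated<T>(mut state: T, iterations: u32, step: impl Fn(T) -> T) -> (T, Duration) {
    let start = Instant::now();
    for _ in 0..iterations {
        // Each output becomes the next input, so every iteration depends on
        // the previous one and the repeated work cannot simply be skipped.
        state = step(state);
    }
    (state, start.elapsed())
}

/// Toy stand-in for a permutation (hypothetical; this is NOT Poseidon2).
fn toy_permutation(x: u64) -> u64 {
    x.wrapping_mul(6364136223846793005)
        .wrapping_add(1442695040888963407)
}
```

Calling `time_repeated(0u64, 100, toy_permutation)` returns the chained result and the time for all 100 calls; dividing that duration by the iteration count would give a per-call estimate.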

## BN254

```shell
bazel run -c opt --//:has_openmp --//:has_rtti --//:has_matplotlib //benchmark/poseidon2:poseidon2_benchmark -- -p bn254_fr --vendor horizen --vendor plonky3 --check_results
```

### On Intel i9-13900K

| Trial Number | Tachyon | Horizen | Plonky3 |
| :----------- | --------- | ------------- | --------- |
| 0 | 0.000788 | **0.000534** | 0.000876 |
| 1 | 0.000628 | **0.000585** | 0.00087 |
| 2 | 0.000624 | **0.000517** | 0.000865 |
| 3 | 0.000622 | **0.000513** | 0.000866 |
| 4 | 0.000634 | **0.000603** | 0.000861 |
| 5 | 0.000628 | **0.000512** | 0.001002 |
| 6 | 0.000618 | **0.00051** | 0.000853 |
| 7 | 0.000616 | **0.000553** | 0.000852 |
| 8 | 0.0007 | **0.000693** | 0.000873 |
| 9 | 0.000614 | **0.000525** | 0.000937 |
| avg | 0.0006472 | **0.0005545** | 0.0008855 |

![image](/benchmark/poseidon2/poseidon2_benchmark_bn254_ubuntu_i9.png)

### On Mac M3 Pro

| Trial Number | Tachyon | Horizen | Plonky3 |
| :----------- | --------- | ------------- | --------- |
| 0 | 0.001053 | **0.000816** | 0.001186 |
| 1 | 0.001033 | **0.00076** | 0.001177 |
| 2 | 0.001019 | **0.000726** | 0.001157 |
| 3 | 0.001012 | **0.000712** | 0.001172 |
| 4 | 0.001007 | **0.000691** | 0.001152 |
| 5 | 0.001023 | **0.000684** | 0.001131 |
| 6 | 0.001051 | **0.000682** | 0.001123 |
| 7 | 0.001005 | **0.000678** | 0.001116 |
| 8 | 0.000996 | **0.000687** | 0.001118 |
| 9 | 0.001003 | **0.00068** | 0.001127 |
| avg | 0.0010202 | **0.0007116** | 0.0011459 |

![image](/benchmark/poseidon2/poseidon2_benchmark_bn254_mac_m3.png)

## Baby Bear

Note: Horizen and Plonky3 use different internal matrices, so their outputs differ and each must be compared with Tachyon separately.
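
To illustrate why a different internal matrix changes the output, here is a rough sketch of a Poseidon2-style internal linear layer, assuming the matrix has the form "all-ones plus a diagonal". The function name `internal_layer` and both diagonals in the usage below are hypothetical and are not the constants used by Horizen, Plonky3, or Tachyon.

```rust
/// BabyBear prime: 2^31 - 2^27 + 1.
const P: u64 = 2_013_265_921;

/// Computes y_i = d_i * x_i + sum(x) (mod P), which is multiplication by
/// (J + diag(d)) where J is the all-ones matrix.
/// Assumes every entry of `state` and `diag` is already reduced mod P,
/// so the intermediate products fit in a u64.
fn internal_layer<const W: usize>(state: [u64; W], diag: [u64; W]) -> [u64; W] {
    let sum = state.iter().fold(0u64, |acc, x| (acc + x) % P);
    let mut out = [0u64; W];
    for i in 0..W {
        out[i] = (state[i] * diag[i] + sum) % P;
    }
    out
}
```

With the same state `[1, 2, 3, 4]`, the hypothetical diagonal `[1, 2, 3, 4]` yields `[11, 14, 19, 26]` while `[2, 3, 4, 5]` yields `[12, 16, 22, 30]`, so two implementations with different diagonals cannot be checked against the same expected result.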

### Horizen

```shell
bazel run -c opt --//:has_openmp --//:has_rtti --//:has_matplotlib //benchmark/poseidon2:poseidon2_benchmark -- -p baby_bear --vendor horizen --check_results
```

#### On Intel i9-13900K

| Repetition | Tachyon | Horizen | Plonky3 |
| :--------- | ------- | ----------- | ------- |
| 0 | 8e-06 | **7e-06** | 1e-05 |
| 1 | 7e-06 | **5e-06** | 8e-06 |
| 2 | 5e-06 | **4e-06** | 8e-06 |
| 3 | 6e-06 | **4e-06** | 8e-06 |
| 4 | 5e-06 | **3e-06** | 7e-06 |
| 5 | 6e-06 | **3e-06** | 7e-06 |
| 6 | 5e-06 | **3e-06** | 7e-06 |
| 7 | 6e-06 | **3e-06** | 7e-06 |
| 8 | 5e-06 | **4e-06** | 7e-06 |
| 9 | 5e-06 | **3e-06** | 7e-06 |
| avg | 5.8e-06 | **3.9e-06** | 7.6e-06 |

![image](/benchmark/poseidon2/poseidon2_benchmark_ubuntu_i9.png)
| Trial Number | Tachyon | Horizen |
| :----------- | ------------- | --------- |
| 0 | **0.000127** | 0.000381 |
| 1 | **0.000126** | 0.00036 |
| 2 | **0.000125** | 0.00037 |
| 3 | **0.000125** | 0.000356 |
| 4 | **0.000125** | 0.000354 |
| 5 | **0.000125** | 0.000354 |
| 6 | **0.000125** | 0.000354 |
| 7 | **0.000125** | 0.00036 |
| 8 | **0.000125** | 0.000359 |
| 9 | **0.000125** | 0.000353 |
| avg | **0.0001253** | 0.0003601 |

![image](/benchmark/poseidon2/poseidon2_benchmark_baby_bear_horizen_ubuntu_i9.png)

#### On Mac M3 Pro

| Trial Number | Tachyon | Horizen |
| :----------- | ------------- | --------- |
| 0 | **0.000191** | 0.000203 |
| 1 | **0.000191** | 0.0002 |
| 2 | **0.000189** | 0.0002 |
| 3 | **0.000188** | 0.0002 |
| 4 | **0.000194** | 0.000199 |
| 5 | **0.000188** | 0.000199 |
| 6 | **0.000189** | 0.000199 |
| 7 | **0.000189** | 0.000199 |
| 8 | **0.000188** | 0.0002 |
| 9 | **0.000188** | 0.000199 |
| avg | **0.0001895** | 0.0001998 |

![image](/benchmark/poseidon2/poseidon2_benchmark_baby_bear_horizen_mac_m3.png)

### Plonky3

```shell
bazel run -c opt --//:has_openmp --//:has_rtti --//:has_matplotlib //benchmark/poseidon2:poseidon2_benchmark -- -p baby_bear --vendor plonky3 --check_results
```

#### On Intel i9-13900K

| Trial Number | Tachyon | Plonky3 |
| :----------- | --------- | ------------ |
| 0 | 0.000112 | **6.6e-05** |
| 1 | 0.000111 | **6.5e-05** |
| 2 | 0.000111 | **6.6e-05** |
| 3 | 0.000111 | **6.6e-05** |
| 4 | 0.00011 | **6.6e-05** |
| 5 | 0.000116 | **6.6e-05** |
| 6 | 0.00011 | **6.5e-05** |
| 7 | 0.000109 | **6.6e-05** |
| 8 | 0.00011 | **6.6e-05** |
| 9 | 0.000109 | **6.5e-05** |
| avg | 0.0001109 | **6.57e-05** |

![image](/benchmark/poseidon2/poseidon2_benchmark_baby_bear_plonky3_ubuntu_i9.png)

#### On Mac M3 Pro

| Repetition | Tachyon | Horizen | Plonky3 |
| :--------- | ------- | ----------- | -------- |
| 0 | 1.3e-05 | **1.2e-05** | 1.5e-05 |
| 1 | 1e-05 | **8e-06** | 1.1e-05 |
| 2 | 9e-06 | **7e-06** | 1e-05 |
| 3 | 9e-06 | **7e-06** | 1e-05 |
| 4 | 9e-06 | **7e-06** | 1e-05 |
| 5 | 9e-06 | **7e-06** | 1e-05 |
| 6 | 9e-06 | **7e-06** | 1e-05 |
| 7 | 9e-06 | **7e-06** | 1e-05 |
| 8 | 9e-06 | **7e-06** | 1e-05 |
| 9 | 9e-06 | **7e-06** | 1e-05 |
| avg | 9.5e-06 | **7.6e-06** | 1.06e-05 |

![image](/benchmark/poseidon2/poseidon2_benchmark_mac_m3.png)
| Trial Number | Tachyon | Plonky3 |
| :----------- | --------- | ------------- |
| 0 | 0.000169 | **0.000106** |
| 1 | 0.000167 | **0.000105** |
| 2 | 0.000166 | **0.000105** |
| 3 | 0.000169 | **0.000105** |
| 4 | 0.000167 | **0.000105** |
| 5 | 0.00017 | **0.000105** |
| 6 | 0.000168 | **0.000105** |
| 7 | 0.000167 | **0.000105** |
| 8 | 0.000168 | **0.000105** |
| 9 | 0.000168 | **0.000105** |
| avg | 0.0001679 | **0.0001051** |

![image](/benchmark/poseidon2/poseidon2_benchmark_baby_bear_plonky3_mac_m3.png)
40 changes: 29 additions & 11 deletions benchmark/poseidon2/horizen/src/lib.rs
@@ -1,22 +1,40 @@
use std::time::Instant;
use tachyon_rs::math::elliptic_curves::bn::bn254::Fr as CppFr;
use std::{sync::Arc, time::Instant};
use tachyon_rs::math::{
elliptic_curves::bn::bn254::Fr as CppBn254Fr, finite_fields::baby_bear::BabyBear as CppBabyBear,
};
use zkhash::{
fields::bn256::FpBN256,
poseidon2::{poseidon2::Poseidon2, poseidon2_instance_bn256::POSEIDON2_BN256_PARAMS},
ark_ff::PrimeField,
poseidon2::{
poseidon2::Poseidon2, poseidon2_instance_babybear::POSEIDON2_BABYBEAR_16_PARAMS,
poseidon2_instance_bn256::POSEIDON2_BN256_PARAMS, poseidon2_params::Poseidon2Params,
},
};

#[no_mangle]
pub extern "C" fn run_poseidon_horizen_bn254_fr(duration: *mut u64) -> *mut CppFr {
let poseidon = Poseidon2::new(&POSEIDON2_BN256_PARAMS);
fn run_poseidon2<F: PrimeField + std::convert::From<i32>, R>(
duration: *mut u64,
params: &Arc<Poseidon2Params<F>>,
) -> *mut R {
let poseidon2 = Poseidon2::new(params);

let t = poseidon.get_t();
let input: Vec<FpBN256> = (0..t).map(|_i| FpBN256::from(0)).collect();
let t = poseidon2.get_t();
let mut input: Vec<F> = (0..t).map(|_| F::from(0)).collect();

let start = Instant::now();
let state = poseidon.permutation(&input);
for _ in 0..100 {
input = poseidon2.permutation(&input);
}
unsafe {
duration.write(start.elapsed().as_micros() as u64);
}
Box::into_raw(Box::new(input[1] as F)) as *mut R
}

Box::into_raw(Box::new(state[1])) as *mut CppFr
#[no_mangle]
pub extern "C" fn run_poseidon2_horizen_baby_bear(duration: *mut u64) -> *mut CppBabyBear {
run_poseidon2::<_, CppBabyBear>(duration, &POSEIDON2_BABYBEAR_16_PARAMS)
}

#[no_mangle]
pub extern "C" fn run_poseidon2_horizen_bn254_fr(duration: *mut u64) -> *mut CppBn254Fr {
run_poseidon2::<_, CppBn254Fr>(duration, &POSEIDON2_BN256_PARAMS)
}
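
The entry points above write the elapsed time through an out-parameter and hand back a heap-allocated result via `Box::into_raw`. A minimal, self-contained sketch of that calling pattern follows; `run_example` and `call_and_reclaim` are hypothetical names, the written values are placeholders, and how the C++ side of this benchmark releases the pointer is not shown in this diff.

```rust
/// Hypothetical stand-in for a benchmark entry point: writes a timing value
/// through `duration` and returns a heap-allocated result.
extern "C" fn run_example(duration: *mut u64) -> *mut u32 {
    // SAFETY: the caller must pass a valid, writable pointer.
    unsafe {
        duration.write(42); // placeholder for the elapsed microseconds
    }
    // Ownership of the boxed value is transferred to the caller.
    Box::into_raw(Box::new(7u32))
}

/// Calls the entry point, then takes ownership of the result back so that
/// the allocation is freed exactly once.
fn call_and_reclaim() -> (u64, u32) {
    let mut duration = 0u64;
    let ptr = run_example(&mut duration);
    // SAFETY: `ptr` came from `Box::into_raw` above and is reclaimed once.
    let value = unsafe { *Box::from_raw(ptr) };
    (duration, value)
}
```

Whoever receives the raw pointer is responsible for eventually releasing it; in this sketch that is done with `Box::from_raw` on the Rust side.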
1 change: 1 addition & 0 deletions benchmark/poseidon2/plonky3/Cargo.toml
@@ -17,6 +17,7 @@ publish = false

[dependencies]
ff = { version = "0.13", features = ["derive", "derive_bits"] }
p3-baby-bear = "0.1.3-succinct"
p3-bn254-fr = "0.1.3-succinct"
p3-field = "0.1.3-succinct"
p3-poseidon2 = "0.1.3-succinct"
91 changes: 65 additions & 26 deletions benchmark/poseidon2/plonky3/src/lib.rs
@@ -1,13 +1,19 @@
use core::fmt;
use ff::PrimeField;
use p3_baby_bear::{BabyBear, DiffusionMatrixBabyBear};
use p3_bn254_fr::{Bn254Fr, DiffusionMatrixBN254, FFBn254Fr};
use p3_field::AbstractField;
use p3_poseidon2::{Poseidon2, Poseidon2ExternalMatrixHL};
use p3_poseidon2::{DiffusionPermutation, Poseidon2, Poseidon2ExternalMatrixHL};
use p3_symmetric::Permutation;
use std::time::Instant;
use tachyon_rs::math::elliptic_curves::bn::bn254::Fr as CppFr;
use zkhash::ark_ff::{BigInteger, PrimeField as ark_PrimeField};
use zkhash::fields::bn256::FpBN256 as ark_FpBN256;
use zkhash::poseidon2::poseidon2_instance_bn256::RC3;
use tachyon_rs::math::{
elliptic_curves::bn::bn254::Fr as CppBn254Fr, finite_fields::baby_bear::BabyBear as CppBabyBear,
};
use zkhash::ark_ff::{BigInteger, Field, PrimeField as ark_PrimeField};
use zkhash::fields::{babybear::FpBabyBear as ark_FpBabyBear, bn256::FpBN256 as ark_FpBN256};
use zkhash::poseidon2::{
poseidon2_instance_babybear::RC16 as BabyBearRC16, poseidon2_instance_bn256::RC3 as BN256RC3,
};

fn bn254_from_ark_ff(input: ark_FpBN256) -> Bn254Fr {
let bytes = input.into_bigint().to_bytes_le();
@@ -29,20 +35,32 @@ fn bn254_from_ark_ff(input: ark_FpBN256) -> Bn254Fr {
}
}

#[no_mangle]
pub extern "C" fn run_poseidon_plonky3_bn254_fr(duration: *mut u64) -> *mut CppFr {
const WIDTH: usize = 3;
const D: u64 = 5;
const ROUNDS_F: usize = 8;
const ROUNDS_P: usize = 56;
fn baby_bear_from_ark_ff(input: ark_FpBabyBear) -> BabyBear {
BabyBear::from_canonical_u32(input.into_bigint().0[0] as u32)
}

fn run_poseidon2<
const WIDTH: usize,
const D: u64,
const ROUNDS_F: usize,
const ROUNDS_P: usize,
NativeF: p3_field::PrimeField + fmt::Debug,
F: Field,
R,
DiffusionMatrix: DiffusionPermutation<NativeF, WIDTH>,
>(
duration: *mut u64,
rc: &Vec<Vec<F>>,
from_ark_ff: &dyn Fn(F) -> NativeF,
diffusion_matrix: DiffusionMatrix,
) -> *mut R {
// Copy over round constants from zkhash.
let mut round_constants: Vec<[Bn254Fr; WIDTH]> = RC3
let mut round_constants: Vec<[NativeF; WIDTH]> = rc
.iter()
.map(|vec| {
vec.iter()
.cloned()
.map(bn254_from_ark_ff)
.map(from_ark_ff)
.collect::<Vec<_>>()
.try_into()
.unwrap()
@@ -56,27 +74,48 @@ pub extern "C" fn run_poseidon_plonky3_bn254_fr(duration: *mut u64) -> *mut CppF
.collect::<Vec<_>>();
let external_round_constants = round_constants;

let poseidon =
Poseidon2::<Bn254Fr, Poseidon2ExternalMatrixHL, DiffusionMatrixBN254, WIDTH, D>::new(
ROUNDS_F,
external_round_constants,
Poseidon2ExternalMatrixHL,
ROUNDS_P,
internal_round_constants,
DiffusionMatrixBN254,
);
let poseidon2 = Poseidon2::<NativeF, Poseidon2ExternalMatrixHL, DiffusionMatrix, WIDTH, D>::new(
ROUNDS_F,
external_round_constants,
Poseidon2ExternalMatrixHL,
ROUNDS_P,
internal_round_constants,
diffusion_matrix,
);

let mut input = (0..3)
.map(|_i| Bn254Fr::zero())
let mut input = (0..WIDTH)
.map(|_i| NativeF::zero())
.collect::<Vec<_>>()
.try_into()
.unwrap();

let start = Instant::now();
poseidon.permute_mut(&mut input);
for _ in 0..100 {
poseidon2.permute_mut(&mut input);
}
unsafe {
duration.write(start.elapsed().as_micros() as u64);
}

Box::into_raw(Box::new(input[1])) as *mut CppFr
Box::into_raw(Box::new(input[1])) as *mut R
}

#[no_mangle]
pub extern "C" fn run_poseidon2_plonky3_baby_bear(duration: *mut u64) -> *mut CppBabyBear {
run_poseidon2::<16, 7, 8, 13, BabyBear, ark_FpBabyBear, CppBabyBear, DiffusionMatrixBabyBear>(
duration,
&BabyBearRC16,
&baby_bear_from_ark_ff,
DiffusionMatrixBabyBear,
)
}

#[no_mangle]
pub extern "C" fn run_poseidon2_plonky3_bn254_fr(duration: *mut u64) -> *mut CppBn254Fr {
run_poseidon2::<3, 5, 8, 56, Bn254Fr, ark_FpBN256, CppBn254Fr, DiffusionMatrixBN254>(
duration,
&BN256RC3,
&bn254_from_ark_ff,
DiffusionMatrixBN254,
)
}