bump moc to 0.10.0 #78

Merged
merged 3 commits into from
Sep 14, 2023

Conversation

chenyan-dfinity

No description provided.

@github-actions

Note
Diffing the performance results against the published results from the main branch.
Unchanged benchmarks are omitted.
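
The percentages in the tables below read as relative changes against a baseline. As a minimal sketch, assuming the change is computed as (current - baseline) / baseline and rendered with two decimals (the exact formula used by the benchmark script is not stated here), the helper below also shows why a tiny improvement can appear as -0.00%. The function names and sample numbers are hypothetical.

```python
def percent_change(baseline: int, current: int) -> float:
    """Relative change of `current` against `baseline`, in percent."""
    return (current - baseline) / baseline * 100


def format_change(baseline: int, current: int) -> str:
    """Render the change with two decimals, similar to the table cells."""
    return f"{percent_change(baseline, current):.2f}%"


# Hypothetical values, not taken from the report.
print(format_change(100_000, 103_320))  # a regression of a few percent
print(format_change(100_000, 99_999))   # a tiny improvement keeps its sign after rounding
```

Under this assumption, a cell shown as -0.00% is a real but very small decrease rather than an exact zero.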

Map

| | binary_size | generate 1m | max mem | batch_get 50 | batch_put 50 | batch_remove 50 |
|---|---:|---:|---:|---:|---:|---:|
| hashmap | 138_275 ($\textcolor{red}{3.32\%}$) | 6_974_058_129 ($\textcolor{red}{0.20\%}$) | 61_987_732 | 288_202 ($\textcolor{red}{0.25\%}$) | 5_527_868_856 ($\textcolor{red}{0.22\%}$) | 309_728 ($\textcolor{red}{0.24\%}$) |
| triemap | 139_765 ($\textcolor{red}{3.29\%}$) | 11_432_083_637 ($\textcolor{red}{0.01\%}$) | 74_216_052 | 222_825 ($\textcolor{red}{0.03\%}$) | 547_701 ($\textcolor{red}{0.01\%}$) | 539_052 ($\textcolor{red}{0.01\%}$) |
| rbtree | 140_562 ($\textcolor{red}{3.27\%}$) | 5_979_229_508 ($\textcolor{green}{-0.00\%}$) | 57_995_940 | 88_905 ($\textcolor{red}{0.01\%}$) | 268_573 ($\textcolor{red}{0.00\%}$) | 278_352 ($\textcolor{red}{0.01\%}$) |
| splay | 136_342 ($\textcolor{red}{3.39\%}$) | 11_568_250_621 ($\textcolor{red}{0.00\%}$) | 53_995_876 | 551_926 ($\textcolor{red}{0.00\%}$) | 581_651 ($\textcolor{green}{-0.00\%}$) | 810_220 ($\textcolor{red}{0.00\%}$) |
| btree | 181_449 ($\textcolor{red}{2.83\%}$) | 8_224_241_444 ($\textcolor{green}{-0.00\%}$) | 31_103_892 | 277_542 ($\textcolor{red}{0.00\%}$) | 384_171 ($\textcolor{red}{0.00\%}$) | 429_041 ($\textcolor{red}{0.00\%}$) |
| zhenya_hashmap | 146_485 ($\textcolor{red}{3.37\%}$) | 2_634_116_314 ($\textcolor{red}{0.04\%}$) | 65_987_480 | 65_396 ($\textcolor{red}{0.09\%}$) | 80_204 ($\textcolor{red}{0.06\%}$) | 94_825 ($\textcolor{red}{0.07\%}$) |
| btreemap_rs | 420_066 ($\textcolor{red}{1.59\%}$) | 1_654_114_188 ($\textcolor{red}{0.27\%}$) | 13_762_560 | 66_890 ($\textcolor{red}{0.11\%}$) | 112_562 ($\textcolor{red}{0.27\%}$) | 81_308 ($\textcolor{red}{0.06\%}$) |
| imrc_hashmap_rs | 419_753 ($\textcolor{red}{1.49\%}$) | 2_386_381_104 ($\textcolor{red}{0.03\%}$) | 122_454_016 | 32_903 ($\textcolor{red}{0.17\%}$) | 162_822 ($\textcolor{red}{0.07\%}$) | 98_526 ($\textcolor{red}{0.03\%}$) |
| hashmap_rs | 413_537 ($\textcolor{red}{1.83\%}$) | 402_296_850 ($\textcolor{red}{2.47\%}$) | 36_536_320 | 16_697 ($\textcolor{red}{1.21\%}$) | 21_601 ($\textcolor{red}{3.54\%}$) | 20_052 ($\textcolor{red}{0.40\%}$) |

Priority queue

| | binary_size | heapify 1m | max mem | pop_min 50 | put 50 |
|---|---:|---:|---:|---:|---:|
| heap | 132_227 ($\textcolor{red}{3.51\%}$) | 4_684_517_324 ($\textcolor{green}{-0.00\%}$) | 29_995_836 | 511_499 ($\textcolor{red}{0.00\%}$) | 186_465 ($\textcolor{red}{0.00\%}$) |
| heap_rs | 411_165 ($\textcolor{red}{1.79\%}$) | 123_102_416 ($\textcolor{green}{-0.00\%}$) | 9_109_504 | 53_382 ($\textcolor{red}{0.12\%}$) | 18_202 ($\textcolor{red}{0.35\%}$) |

Growable array

| | binary_size | generate 5k | max mem | batch_get 500 | batch_put 500 | batch_remove 500 |
|---|---:|---:|---:|---:|---:|---:|
| buffer | 139_908 ($\textcolor{red}{3.28\%}$) | 2_082_623 ($\textcolor{red}{0.00\%}$) | 65_508 | 73_092 ($\textcolor{red}{0.01\%}$) | 671_517 ($\textcolor{red}{0.00\%}$) | 127_592 ($\textcolor{red}{0.00\%}$) |
| vector | 138_344 ($\textcolor{red}{3.32\%}$) | 1_728_571 ($\textcolor{red}{0.00\%}$) | 24_764 | 121_219 ($\textcolor{red}{0.00\%}$) | 163_947 ($\textcolor{red}{0.00\%}$) | 161_609 ($\textcolor{red}{0.00\%}$) |
| vec_rs | 409_961 ($\textcolor{red}{1.81\%}$) | 265_856 ($\textcolor{green}{-0.02\%}$) | 655_360 | 12_902 ($\textcolor{red}{0.61\%}$) | 25_331 ($\textcolor{red}{0.31\%}$) | 21_215 ($\textcolor{red}{0.95\%}$) |

Statistics

  • binary_size: 2.72% [2.34%, 3.10%]
  • max_mem: no change
  • cycles: 0.23% [0.09%, 0.36%]

SHA-2

| | binary_size | SHA-256 | SHA-512 | account_id | neuron_id |
|---|---:|---:|---:|---:|---:|
| Motoko | 174_597 ($\textcolor{red}{2.64\%}$) | 264_156_360 ($\textcolor{red}{0.00\%}$) | 235_099_580 ($\textcolor{red}{0.00\%}$) | 35_153 ($\textcolor{red}{0.03\%}$) | 23_262 ($\textcolor{red}{0.05\%}$) |
| Rust | 498_272 ($\textcolor{red}{1.51\%}$) | 82_512_024 ($\textcolor{green}{-0.00\%}$) | 56_525_977 ($\textcolor{green}{-0.00\%}$) | 42_551 ($\textcolor{red}{0.36\%}$) | 44_574 ($\textcolor{red}{7.16\%}$) |

Certified map

| | binary_size | generate 10k | max mem | inc | witness |
|---|---:|---:|---:|---:|---:|
| Motoko | 166_880 ($\textcolor{red}{2.75\%}$) | 18_581_618_617 ($\textcolor{red}{0.01\%}$) | 3_429_924 | 2_209_513 ($\textcolor{red}{0.01\%}$) | 327_767 ($\textcolor{red}{0.00\%}$) |
| Rust | 441_794 ($\textcolor{red}{1.83\%}$) | 6_202_163_062 ($\textcolor{green}{-0.07\%}$) | 1_081_344 | 983_904 ($\textcolor{green}{-0.09\%}$) | 288_469 ($\textcolor{green}{-0.13\%}$) |

Statistics

  • binary_size: 2.18% [1.47%, 2.90%]
  • max_mem: no change
  • cycles: 0.52% [-0.38%, 1.43%]

Basic DAO

| | binary_size | init | transfer_token | submit_proposal | vote_proposal |
|---|---:|---:|---:|---:|---:|
| Motoko | 230_242 ($\textcolor{red}{1.96\%}$) | 37_590 ($\textcolor{red}{0.26\%}$) | 16_248 ($\textcolor{green}{-0.42\%}$) | 12_674 ($\textcolor{red}{0.16\%}$) | 14_137 ($\textcolor{red}{0.23\%}$) |
| Rust | 718_379 ($\textcolor{red}{1.91\%}$) | 472_438 ($\textcolor{red}{0.11\%}$) | 86_786 ($\textcolor{red}{0.30\%}$) | 105_263 ($\textcolor{red}{0.62\%}$) | 116_229 ($\textcolor{red}{0.40\%}$) |

DIP721 NFT

| | binary_size | init | mint_token | transfer_token |
|---|---:|---:|---:|---:|
| Motoko | 188_321 ($\textcolor{red}{2.41\%}$) | 12_267 ($\textcolor{red}{0.71\%}$) | 22_357 ($\textcolor{red}{0.17\%}$) | 4_729 ($\textcolor{red}{0.40\%}$) |
| Rust | 778_280 ($\textcolor{red}{1.51\%}$) | 125_293 ($\textcolor{red}{0.21\%}$) | 325_017 ($\textcolor{red}{0.16\%}$) | 77_500 ($\textcolor{red}{0.50\%}$) |

Statistics

  • binary_size: 1.95% [1.51%, 2.39%]
  • max_mem: no change
  • cycles: 0.27% [0.15%, 0.40%]

Heartbeat

| | binary_size | heartbeat |
|---|---:|---:|
| Motoko | 123_357 ($\textcolor{red}{3.74\%}$) | 3_758 ($\textcolor{green}{-49.16\%}$) |
| Rust | 23_625 ($\textcolor{green}{-0.31\%}$) | 469 ($\textcolor{green}{-40.71\%}$) |

Timer

| | binary_size | setTimer | cancelTimer |
|---|---:|---:|---:|
| Motoko | 129_636 ($\textcolor{red}{3.57\%}$) | 15_227 ($\textcolor{red}{0.12\%}$) | 1_684 ($\textcolor{red}{0.30\%}$) |
| Rust | 443_367 ($\textcolor{red}{1.96\%}$) | 43_417 ($\textcolor{green}{-0.28\%}$) | 7_497 ($\textcolor{green}{-2.42\%}$) |

Statistics

  • binary_size: 2.76% [-2.32%, 7.85%]
  • max_mem: no change
  • cycles: -0.57% [-2.05%, 0.91%]

Garbage Collection

| | generate 800k | max mem | batch_get 50 | batch_put 50 | batch_remove 50 |
|---|---:|---:|---:|---:|---:|
| default | 1_012_258_537 ($\textcolor{red}{0.00\%}$) | 59_396_776 | 50 | 50 | 50 |
| copying | 1_012_258_487 ($\textcolor{red}{0.00\%}$) | 59_396_776 | 1_012_236_046 ($\textcolor{red}{0.00\%}$) | 1_012_303_056 ($\textcolor{red}{0.00\%}$) | 1_012_240_283 ($\textcolor{red}{0.00\%}$) |
| compacting | 1_675_009_925 ($\textcolor{red}{0.00\%}$) | 59_396_776 | 1_292_955_500 ($\textcolor{red}{0.00\%}$) | 1_532_273_641 ($\textcolor{red}{0.00\%}$) | 1_558_502_986 ($\textcolor{red}{0.00\%}$) |
| generational | 2_498_146_508 ($\textcolor{green}{-0.75\%}$) | 59_405_240 | 977_578_983 ($\textcolor{red}{0.00\%}$) | 1_044_991 ($\textcolor{green}{-0.74\%}$) | 960_405 ($\textcolor{green}{-0.72\%}$) |
| incremental | 32_320_754 ($\textcolor{red}{0.00\%}$) | 1_136_155_048 ($\textcolor{red}{0.00\%}$) | 290_257_785 | 292_951_006 | 292_977_552 |

Actor class

| | binary size | put new bucket | put existing bucket | get |
|---|---:|---:|---:|---:|
| Map | 261_606 ($\textcolor{red}{2.96\%}$) | 656_047 ($\textcolor{red}{2.73\%}$) | 4_459 ($\textcolor{red}{0.22\%}$) | 4_919 ($\textcolor{red}{0.20\%}$) |

Statistics

  • binary_size: no change
  • max_mem: 0.00%
  • cycles: 0.22% [-0.19%, 0.63%]

Publisher & Subscriber

| | pub_binary_size | sub_binary_size | subscribe_caller | subscribe_callee | publish_caller | publish_callee |
|---|---:|---:|---:|---:|---:|---:|
| Motoko | 144_439 ($\textcolor{red}{3.25\%}$) | 131_299 ($\textcolor{red}{3.53\%}$) | 14_651 ($\textcolor{red}{0.07\%}$) | 8_456 ($\textcolor{red}{0.06\%}$) | 10_539 ($\textcolor{red}{0.09\%}$) | 3_669 ($\textcolor{red}{0.19\%}$) |
| Rust | 479_599 ($\textcolor{red}{1.58\%}$) | 529_662 ($\textcolor{red}{1.87\%}$) | 51_729 ($\textcolor{red}{0.27\%}$) | 34_594 ($\textcolor{green}{-0.19\%}$) | 74_841 ($\textcolor{red}{0.91\%}$) | 44_359 ($\textcolor{red}{6.59\%}$) |

Statistics

  • binary_size: 2.56% [1.41%, 3.70%]
  • max_mem: no change
  • cycles: 1.00% [-0.53%, 2.53%]

Overall Statistics

  • binary_size: 2.51% [2.26%, 2.77%]
  • max_mem: 0.00%
  • cycles: 0.29% [0.12%, 0.47%]

@github-actions

Note
The flamegraph link only works after you merge.
Unchanged benchmarks are omitted.

Collection libraries

Measure different collection libraries written in both Motoko and Rust.
The library names with _rs suffix are written in Rust; the rest are written in Motoko.

We use the same random number generator with a fixed seed to ensure that all collections contain
the same elements and that the queries are exactly the same. The columns in the tables measure the following:

  • generate 1m. Insert 1m Nat64 integers into the collection. For Motoko collections, this usually triggers the GC; the remaining columns are unlikely to trigger GC.
  • max mem. For Motoko, it reports rts_max_heap_size after the generate call; for Rust, it reports the Wasm memory page count * 32Kb.
  • batch_get 50. Find 50 elements from the collection.
  • batch_put 50. Insert 50 elements to the collection.
  • batch_remove 50. Remove 50 elements from the collection.
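
As a quick sanity check on the max mem column for Rust, the sketch below divides a few reported values by the 32Kb unit, assuming "32Kb" means 32 KiB (32_768 bytes). The helper name is hypothetical; the input values are the max mem figures reported for the Rust collections in the Map table.

```python
UNIT_BYTES = 32 * 1024  # assumption: "32Kb" means 32 KiB = 32_768 bytes


def units(max_mem_bytes: int) -> int:
    """Number of whole 32 KiB units in a reported max mem value."""
    count, remainder = divmod(max_mem_bytes, UNIT_BYTES)
    if remainder:
        raise ValueError(f"{max_mem_bytes} is not a multiple of {UNIT_BYTES}")
    return count


# max mem values reported for the Rust collections in the Map table
reported = {
    "btreemap_rs": 13_762_560,
    "imrc_hashmap_rs": 122_454_016,
    "hashmap_rs": 36_536_320,
}
for name, max_mem in reported.items():
    print(name, units(max_mem))
```

All three values divide evenly, which is consistent with the column being a count multiplied by a fixed 32 KiB unit.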

💎 Takeaways

  • The platform only charges for instruction count. Data structures that exploit caching and locality therefore gain no cost advantage.
  • We have a limit on the maximal cycles per round. This means asymptotic behavior doesn't matter much; we care more about the performance up to a fixed N. In extreme cases, you may see an $O(10000 n\log n)$ algorithm hitting the limit, while an $O(n^2)$ algorithm runs just fine.
  • Amortized algorithms/GC may need to be more eager to avoid hitting the cycle limit on a particular round.
  • Rust costs more cycles to process complicated Candid data, but it is more efficient in performing core computations.

Note

  • The Candid interface of the benchmark is minimal, therefore the serialization cost is negligible in this measurement.
  • Due to the instrumentation overhead and cycle limit, we cannot profile computations with large collections. Hopefully, when deterministic time slicing is ready, we can measure the performance on a larger memory footprint.
  • hashmap uses an amortized data structure. When the initial capacity is reached, it has to copy the whole array, thus the cost of batch_put 50 is much higher than for other data structures.
  • btree comes from mops.one/stableheapbtreemap.
  • zhenya_hashmap comes from mops.one/map.
  • vector comes from mops.one/vector. Compared with buffer, put has better worst-case time and space complexity ($O(\sqrt{n})$ vs $O(n)$); get has a slightly larger constant overhead.
  • hashmap_rs uses the fxhash crate, which is the same as std::collections::HashMap, but with a deterministic hasher. This ensures reproducible results.
  • imrc_hashmap_rs uses the im-rc crate, which provides an immutable hashmap in Rust.
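
The note on amortized growth can be illustrated with a toy growable array that doubles its capacity when full. This is a minimal sketch of the general technique, not the actual hashmap or buffer implementation from the Motoko base library; the class name and the initial capacity of 4 are assumptions.

```python
class DoublingArray:
    """Toy growable array that doubles its capacity when full."""

    def __init__(self, capacity: int = 4) -> None:
        self.capacity = capacity
        self.items = []

    def append(self, value: int) -> int:
        """Append `value` and return the number of element writes it took."""
        writes = 1
        if len(self.items) == self.capacity:
            # Full: allocate a larger backing store and copy every element.
            self.capacity *= 2
            self.items = list(self.items)
            writes += len(self.items)
        self.items.append(value)
        return writes


arr = DoublingArray(capacity=4)
costs = [arr.append(i) for i in range(9)]
print(costs)  # → [1, 1, 1, 1, 5, 1, 1, 1, 9]
```

Most appends cost a single write, but the append that triggers a resize pays for copying the whole array. When one such resize lands inside a measured batch, it dominates the cost of that batch.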

Map

| | binary_size | generate 1m | max mem | batch_get 50 | batch_put 50 | batch_remove 50 |
|---|---:|---:|---:|---:|---:|---:|
| hashmap | 138_275 | 6_974_058_129 | 61_987_732 | 288_202 | 5_527_868_856 | 309_728 |
| triemap | 139_765 | 11_432_083_637 | 74_216_052 | 222_825 | 547_701 | 539_052 |
| rbtree | 140_562 | 5_979_229_508 | 57_995_940 | 88_905 | 268_573 | 278_352 |
| splay | 136_342 | 11_568_250_621 | 53_995_876 | 551_926 | 581_651 | 810_220 |
| btree | 181_449 | 8_224_241_444 | 31_103_892 | 277_542 | 384_171 | 429_041 |
| zhenya_hashmap | 146_485 | 2_634_116_314 | 65_987_480 | 65_396 | 80_204 | 94_825 |
| btreemap_rs | 420_066 | 1_654_114_188 | 13_762_560 | 66_890 | 112_562 | 81_308 |
| imrc_hashmap_rs | 419_753 | 2_386_381_104 | 122_454_016 | 32_903 | 162_822 | 98_526 |
| hashmap_rs | 413_537 | 402_296_850 | 36_536_320 | 16_697 | 21_601 | 20_052 |

Priority queue

| | binary_size | heapify 1m | max mem | pop_min 50 | put 50 |
|---|---:|---:|---:|---:|---:|
| heap | 132_227 | 4_684_517_324 | 29_995_836 | 511_499 | 186_465 |
| heap_rs | 411_165 | 123_102_416 | 9_109_504 | 53_382 | 18_202 |

Growable array

| | binary_size | generate 5k | max mem | batch_get 500 | batch_put 500 | batch_remove 500 |
|---|---:|---:|---:|---:|---:|---:|
| buffer | 139_908 | 2_082_623 | 65_508 | 73_092 | 671_517 | 127_592 |
| vector | 138_344 | 1_728_571 | 24_764 | 121_219 | 163_947 | 161_609 |
| vec_rs | 409_961 | 265_856 | 655_360 | 12_902 | 25_331 | 21_215 |

Cryptographic libraries

Measure different cryptographic libraries written in both Motoko and Rust.

  • SHA-2 benchmarks
    • SHA-256/SHA-512. Compute the hash of a 1M Wasm binary.
    • account_id. Compute the ledger account id from principal, based on SHA-224.
    • neuron_id. Compute the NNS neuron id from principal, based on SHA-256.
  • Certified map. A Merkle tree for storing key-value pairs and generating witnesses according to the IC Interface Specification.
    • generate 10k. Insert 10k 7-character words as both key and value into the certified map.
    • max mem. For Motoko, it reports rts_max_heap_size after the generate call; for Rust, it reports the Wasm memory page count * 32Kb.
    • inc. Increment a counter and insert the counter value into the map.
    • witness. Generate the root hash and a witness for the counter.
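
For readers unfamiliar with witnesses, the following is a minimal sketch of a binary Merkle tree with a membership witness, built on SHA-256 from Python's hashlib. It is a simplification for illustration only: the hash-tree encoding in the IC Interface Specification uses a different node structure and domain separators, and every function name and the "leaf"/"node" prefixes here are assumptions.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def leaf_hash(key: bytes, value: bytes) -> bytes:
    # Length-prefix the key so the key/value boundary is unambiguous.
    return h(b"leaf" + len(key).to_bytes(4, "big") + key + value)


def node_hash(left: bytes, right: bytes) -> bytes:
    return h(b"node" + left + right)


def pad(level):
    """Duplicate the last hash when a level has an odd number of entries."""
    return level + [level[-1]] if len(level) % 2 else level


def build_levels(leaves):
    """Return all levels of the tree, from the leaf hashes up to the root."""
    levels = [leaves]
    while len(levels[-1]) > 1:
        cur = pad(levels[-1])
        levels.append([node_hash(cur[i], cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels


def witness(levels, index):
    """Sibling hashes on the path from leaf `index` up to the root."""
    path = []
    for level in levels[:-1]:
        sibling = index ^ 1
        path.append((pad(level)[sibling], sibling < index))
        index //= 2
    return path


def verify(root, leaf, path):
    """Recompute the root from a leaf hash and its witness."""
    acc = leaf
    for sibling, sibling_is_left in path:
        acc = node_hash(sibling, acc) if sibling_is_left else node_hash(acc, sibling)
    return acc == root


entries = [(b"alpha", b"1"), (b"beta", b"2"), (b"counter", b"42")]
leaves = [leaf_hash(k, v) for k, v in entries]
levels = build_levels(leaves)
root = levels[-1][0]
proof = witness(levels, 2)
print(verify(root, leaves[2], proof))
```

The witness contains one sibling hash per tree level, so its size grows logarithmically with the number of entries, while a verifier only needs the root hash to check it.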

SHA-2

| | binary_size | SHA-256 | SHA-512 | account_id | neuron_id |
|---|---:|---:|---:|---:|---:|
| Motoko | 174_597 | 264_156_360 | 235_099_580 | 35_153 | 23_262 |
| Rust | 498_272 | 82_512_024 | 56_525_977 | 42_551 | 44_574 |

Certified map

| | binary_size | generate 10k | max mem | inc | witness |
|---|---:|---:|---:|---:|---:|
| Motoko | 166_880 | 18_581_618_617 | 3_429_924 | 2_209_513 | 327_767 |
| Rust | 441_794 | 6_202_163_062 | 1_081_344 | 983_904 | 288_469 |

Sample Dapps

Measure the performance of some typical dapps:

  • Basic DAO,
    with heartbeat disabled to make profiling easier. We have a separate benchmark to measure heartbeat performance.
  • DIP721 NFT

Note

  • The cost difference is mainly due to the Candid serialization cost.
  • Motoko statically compiles/specializes the serialization code for each method, whereas in Rust, we use serde to dynamically deserialize data based on data on the wire.
  • We could improve the performance on the Rust side by using parser combinators. But it is a challenge to maintain the ergonomics provided by serde.
  • For real-world applications, we tend to send small data for each endpoint, which makes the Candid overhead in Rust tolerable.
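
The difference between statically specialized and dynamically driven decoding can be sketched with Python's struct module. This is only an analogy under stated assumptions: the wire format below is a hypothetical fixed-width layout, far simpler than Candid, and neither decoder reflects how Motoko or serde is actually implemented.

```python
import struct

# "Static" decoder: the layout (u32 id, u64 amount) is fixed ahead of time,
# so the format is prepared once and each call is a single unpack.
TRANSFER = struct.Struct("<IQ")


def decode_transfer_static(payload: bytes) -> tuple:
    return TRANSFER.unpack(payload)


# "Dynamic" decoder: the layout arrives as data and is interpreted field by
# field on every call, which costs extra work per field.
FIELD_FORMATS = {"u32": "<I", "u64": "<Q"}


def decode_dynamic(field_types: list, payload: bytes) -> tuple:
    values, offset = [], 0
    for field_type in field_types:
        fmt = FIELD_FORMATS[field_type]
        (value,) = struct.unpack_from(fmt, payload, offset)
        values.append(value)
        offset += struct.calcsize(fmt)
    return tuple(values)


payload = struct.pack("<IQ", 7, 1_000)
print(decode_transfer_static(payload))
print(decode_dynamic(["u32", "u64"], payload))
```

Both decoders return the same values; the dynamic one simply performs a lookup and a bounds computation for every field, which is the kind of per-message overhead the notes attribute to type-driven deserialization.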

Basic DAO

| | binary_size | init | transfer_token | submit_proposal | vote_proposal |
|---|---:|---:|---:|---:|---:|
| Motoko | 230_242 | 37_590 | 16_248 | 12_674 | 14_137 |
| Rust | 718_379 | 472_438 | 86_786 | 105_263 | 116_229 |

DIP721 NFT

| | binary_size | init | mint_token | transfer_token |
|---|---:|---:|---:|---:|
| Motoko | 188_321 | 12_267 | 22_357 | 4_729 |
| Rust | 778_280 | 125_293 | 325_017 | 77_500 |

Heartbeat / Timer

Measure the cost of an empty heartbeat and an empty timer job.

  • setTimer measures both the setTimer(0) method and the execution of an empty job.
  • It is not easy to reliably capture both events in one flamegraph, as the implementation details
    of the replica can affect how we measure this. Typically, a correct flamegraph contains both the setTimer and canister_global_timer functions. If either is missing, we may need to adjust the script.

Heartbeat

| | binary_size | heartbeat |
|---|---:|---:|
| Motoko | 123_357 | 3_758 |
| Rust | 23_625 | 469 |

Timer

| | binary_size | setTimer | cancelTimer |
|---|---:|---:|---:|
| Motoko | 129_636 | 15_227 | 1_684 |
| Rust | 443_367 | 43_417 | 7_497 |

Motoko Specific Benchmarks

Measure various features only available in Motoko.

  • Garbage Collection. Measure Motoko garbage collection cost using the Triemap benchmark. The max mem column reports rts_max_heap_size after generate call. The cycle cost numbers reported here are garbage collection cost only. Some flamegraphs are truncated due to the 2M log size limit. The dfx/ic-wasm optimizer is disabled for the garbage collection test cases due to how the optimizer affects function names, making profiling trickier.

    • default. Compile with the default GC option. With the current GC scheduler, generate will trigger the copying GC. The rest of the methods will not trigger GC.
    • copying. Compile with --force-gc --copying-gc.
    • compacting. Compile with --force-gc --compacting-gc.
    • generational. Compile with --force-gc --generational-gc.
    • incremental. Compile with --force-gc --incremental-gc.
  • Actor class. Measure the cost of spawning an actor class, using the Actor classes example.

Garbage Collection

| | generate 800k | max mem | batch_get 50 | batch_put 50 | batch_remove 50 |
|---|---:|---:|---:|---:|---:|
| default | 1_012_258_537 | 59_396_776 | 50 | 50 | 50 |
| copying | 1_012_258_487 | 59_396_776 | 1_012_236_046 | 1_012_303_056 | 1_012_240_283 |
| compacting | 1_675_009_925 | 59_396_776 | 1_292_955_500 | 1_532_273_641 | 1_558_502_986 |
| generational | 2_498_146_508 | 59_405_240 | 977_578_983 | 1_044_991 | 960_405 |
| incremental | 32_320_754 | 1_136_155_048 | 290_257_785 | 292_951_006 | 292_977_552 |

Actor class

| | binary size | put new bucket | put existing bucket | get |
|---|---:|---:|---:|---:|
| Map | 261_606 | 656_047 | 4_459 | 4_919 |

Publisher & Subscriber

Measure the cost of inter-canister calls from the Publisher & Subscriber example.

| | pub_binary_size | sub_binary_size | subscribe_caller | subscribe_callee | publish_caller | publish_callee |
|---|---:|---:|---:|---:|---:|---:|
| Motoko | 144_439 | 131_299 | 14_651 | 8_456 | 10_539 | 3_669 |
| Rust | 479_599 | 529_662 | 51_729 | 34_594 | 74_841 | 44_359 |

@chenyan-dfinity chenyan-dfinity merged commit 914bea6 into main Sep 14, 2023
1 check passed
@chenyan-dfinity chenyan-dfinity deleted the bump branch September 14, 2023 21:10