Various variants of "simplified/dummy" executors #15152

Merged
merged 2 commits into from
Nov 23, 2024

igor-aptos
Contributor

Description

This PR adds the following executor configurations, to help understand the upper bounds of different approaches and optimizations:

    /// Transaction execution: AptosVM
    /// Executing conflicts: in the input order, via BlockSTM
    /// State: BlockSTM-provided MVHashMap-based view with caching
    AptosVMWithBlockSTM,
    /// Transaction execution: NativeVM - a simplified Rust implementation to create VMChangeSet
    /// Executing conflicts: in the input order, via BlockSTM
    /// State: BlockSTM-provided MVHashMap-based view with caching
    NativeVMWithBlockSTM,
    /// Transaction execution: AptosVM
    /// Executing conflicts: All transactions execute on the state at the beginning of the block
    /// State: Raw CachedStateView
    AptosVMParallelUncoordinated,
    /// Transaction execution: Native Rust code producing WriteSet
    /// Executing conflicts: All transactions execute on the state at the beginning of the block
    /// State: Raw CachedStateView
    NativeParallelUncoordinated,
    /// Transaction execution: Native Rust code updating in-memory state, no WriteSet output
    /// Executing conflicts: All transactions execute on the state on a first-come, first-served basis
    /// State: In-memory DashMap with Rust values of state (i.e. StateKey -> Resource (either Account or FungibleStore)),
    ///        cached across blocks, filled upon first request
    NativeValueCacheParallelUncoordinated,
    /// Transaction execution: Native Rust code updating in-memory state, no WriteSet output
    /// Executing conflicts: All transactions execute on the state on a first-come, first-served basis
    /// State: In-memory DashMap with AccountAddress to seq_num and balance (ignoring all other fields).
    ///        kept across blocks, randomly initialized on first access, storage ignored.
    NativeNoStorageParallelUncoordinated,
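
The two "uncoordinated" conflict modes described above can be contrasted with a minimal sketch. This is an illustration only, not the code in this PR: `Transfer`, `execute_on_snapshot`, and `execute_first_come` are hypothetical names, accounts are plain `u64` keys, and a `Mutex<HashMap>` stands in for the DashMap so the example needs only the standard library.

```rust
use std::collections::HashMap;
use std::sync::Mutex;
use std::thread;

/// Hypothetical transfer; not the PR's NativeTransaction type.
#[derive(Clone, Copy)]
struct Transfer {
    from: u64,
    to: u64,
    amount: u64,
}

/// Snapshot mode: every transaction reads the state at the beginning of the
/// block and returns its own write set. Conflicting writes are not reconciled.
fn execute_on_snapshot(
    snapshot: &HashMap<u64, u64>,
    txns: &[Transfer],
) -> Vec<Vec<(u64, u64)>> {
    thread::scope(|s| {
        let handles: Vec<_> = txns
            .iter()
            .map(|t| {
                s.spawn(move || {
                    let from = snapshot.get(&t.from).copied().unwrap_or(0);
                    let to = snapshot.get(&t.to).copied().unwrap_or(0);
                    vec![(t.from, from.saturating_sub(t.amount)), (t.to, to + t.amount)]
                })
            })
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    })
}

/// First-come, first-served mode: transactions update shared in-memory state
/// directly, in whatever order the threads acquire the lock. No write set is produced.
fn execute_first_come(state: &Mutex<HashMap<u64, u64>>, txns: &[Transfer]) {
    thread::scope(|s| {
        for t in txns {
            s.spawn(move || {
                let mut balances = state.lock().unwrap();
                let from = balances.entry(t.from).or_insert(0);
                if *from >= t.amount {
                    *from -= t.amount;
                    *balances.entry(t.to).or_insert(0) += t.amount;
                }
            });
        }
    });
}
```

With a starting balance of 100 and two transfers of 60 from the same account, snapshot mode yields two identical write sets that each look valid in isolation, while first-come mode applies only one of the transfers. Neither mode matches BlockSTM's input-order semantics, which is why these configurations are useful only as upper bounds.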

trunk-io bot commented Nov 1, 2024

⏱️ 16h 4m total CI duration on this PR
| Slowest 15 jobs | Cumulative duration | Recent runs |
|---|---|---|
| single-node-performance | 11h 56m | 🟥🟥🟥🟥🟥 (+16 more) |
| test-target-determinator | 1h 17m | 🟩🟩🟩🟩🟩 (+17 more) |
| rust-cargo-deny | 25m | 🟩🟩🟩🟩🟩 (+10 more) |
| check-dynamic-deps | 14m | 🟩🟩🟩🟩🟩 (+10 more) |
| rust-move-tests | 10m | 🟩 |
| rust-move-tests | 9m | 🟩 |
| rust-move-tests | 9m | 🟩 |
| rust-move-tests | 9m | 🟥 |
| rust-move-tests | 9m | 🟩 |
| rust-move-tests | 9m | 🟩 |
| rust-move-tests | 9m | 🟩 |
| rust-move-tests | 9m | 🟩 |
| rust-move-tests | 9m | 🟩 |
| rust-move-tests | 9m | 🟩 |
| rust-move-tests | 9m | 🟩 |

@igor-aptos igor-aptos force-pushed the igor/native_executor_benchmarks branch 8 times, most recently from a9a56e8 to ef4d774 Compare November 5, 2024 21:09
@igor-aptos igor-aptos force-pushed the igor/merge_block_and_vm_executor branch 2 times, most recently from 66a9235 to fd3a91a Compare November 5, 2024 21:23
},
}

impl NativeTransaction {
Contributor

In my remote sharding experiments, the native VM was not able to handle this one txn at the beginning. You have changed the code since then, but it might be worthwhile to run an "unsharded execution" in the executor-benchmark.

2024-11-16T00:33:08.305178Z [main] INFO storage/aptosdb/src/state_store/mod.rs:451 Initializing BufferedState. {"latest_snapshot_version":null,"num_transactions":0}
2024-11-16T00:33:08.305256Z [main] INFO storage/aptosdb/src/state_store/mod.rs:547 StateStore initialization finished. {"latest_in_memory_root_hash":"5350415253455f4d45524b4c455f504c414345484f4c4445525f484153480000","latest_in_memory_version":null,"latest_snapshot_root_hash":"5350415253455f4d45524b4c455f504c414345484f4c4445525f484153480000","latest_snapshot_version":null}
2024-11-16T00:33:08.305735Z [main] INFO aptos-move/aptos-vm/src/aptos_vm.rs:2557 Executing block, transaction count: 1 {"name":"miscellaneous","txn_idx":0}

@@ -1,4 +1,8 @@
// Copyright (c) Aptos Foundation
// SPDX-License-Identifier: Apache-2.0

pub mod aptos_vm_uncoordinated;
Contributor

Is this the best place for the 'native' module?
How about putting it in "aptos-move/aptos-vm/src/block_executor/"?

Native is used inside "aptos-move/aptos-vm/src/..." and I suppose it will be used more from lower layers. executor-benchmark seems too high-level for this. Even in this code, the need for 'vm_wrapper' probably comes from native living in executor-benchmark.

Contributor Author

vm_wrapper is one line, so the tradeoff seems right.

We don't want to use it in prod, only for the benchmark. We can create a separate native project later if needed, but this sounds reasonable to me.

Putting it in aptos-move/aptos-vm/src/... seems like it would pollute that crate.

))
}

// fn to_abort(status: TransactionStatus) -> TransactionOutput {
Contributor

Dead code; maybe remove it.

total_supply: AtomicU64,
}

// impl CommonNativeRawTransactionExecutor for NativeNoStorageRawTransactionExecutor {
Contributor

dead code

Contributor

@manudhundi manudhundi left a comment

Left some comments, but overall this seems good to me.

@igor-aptos igor-aptos force-pushed the igor/native_executor_benchmarks branch 5 times, most recently from 4a8b8ed to 8a854ac Compare November 23, 2024 00:37
@igor-aptos igor-aptos force-pushed the igor/native_executor_benchmarks branch from 8a854ac to 4e0e2c9 Compare November 23, 2024 00:37
@igor-aptos igor-aptos enabled auto-merge (squash) November 23, 2024 00:38

@igor-aptos igor-aptos force-pushed the igor/native_executor_benchmarks branch from 4e0e2c9 to dee9376 Compare November 23, 2024 01:58

Contributor

✅ Forge suite realistic_env_max_load success on dee9376e46f0698beb28349c283eb22b169be934

two traffics test: inner traffic : committed: 14082.96 txn/s, latency: 2824.37 ms, (p50: 2700 ms, p70: 2700, p90: 3000 ms, p99: 4700 ms), latency samples: 5354680
two traffics test : committed: 100.03 txn/s, latency: 2427.88 ms, (p50: 1500 ms, p70: 2000, p90: 2600 ms, p99: 16800 ms), latency samples: 1720
Latency breakdown for phase 0: ["MempoolToBlockCreation: max: 2.275, avg: 1.307", "ConsensusProposalToOrdered: max: 0.325, avg: 0.294", "ConsensusOrderedToCommit: max: 0.377, avg: 0.364", "ConsensusProposalToCommit: max: 0.669, avg: 0.658"]
Max non-epoch-change gap was: 0 rounds at version 0 (avg 0.00) [limit 4], 1.16s no progress at version 2572668 (avg 0.20s) [limit 15].
Max epoch-change gap was: 0 rounds at version 0 (avg 0.00) [limit 4], 15.51s no progress at version 2572666 (avg 15.51s) [limit 16].
Test Ok

Contributor

✅ Forge suite compat success on 327445f70a3b9474f13a792e8ada9b85809dec86 ==> dee9376e46f0698beb28349c283eb22b169be934

Compatibility test results for 327445f70a3b9474f13a792e8ada9b85809dec86 ==> dee9376e46f0698beb28349c283eb22b169be934 (PR)
1. Check liveness of validators at old version: 327445f70a3b9474f13a792e8ada9b85809dec86
compatibility::simple-validator-upgrade::liveness-check : committed: 13794.53 txn/s, latency: 2419.49 ms, (p50: 2100 ms, p70: 2200, p90: 2500 ms, p99: 7700 ms), latency samples: 521020
2. Upgrading first Validator to new version: dee9376e46f0698beb28349c283eb22b169be934
compatibility::simple-validator-upgrade::single-validator-upgrading : committed: 7224.32 txn/s, latency: 3930.64 ms, (p50: 4500 ms, p70: 4700, p90: 4800 ms, p99: 4900 ms), latency samples: 133260
compatibility::simple-validator-upgrade::single-validator-upgrade : committed: 6600.96 txn/s, latency: 4622.69 ms, (p50: 4800 ms, p70: 4900, p90: 5000 ms, p99: 6600 ms), latency samples: 249620
3. Upgrading rest of first batch to new version: dee9376e46f0698beb28349c283eb22b169be934
compatibility::simple-validator-upgrade::half-validator-upgrading : committed: 7767.45 txn/s, latency: 3719.71 ms, (p50: 4200 ms, p70: 4400, p90: 4500 ms, p99: 4600 ms), latency samples: 145420
compatibility::simple-validator-upgrade::half-validator-upgrade : committed: 6978.09 txn/s, latency: 4376.00 ms, (p50: 4400 ms, p70: 4500, p90: 4600 ms, p99: 6400 ms), latency samples: 262640
4. upgrading second batch to new version: dee9376e46f0698beb28349c283eb22b169be934
compatibility::simple-validator-upgrade::rest-validator-upgrading : committed: 11202.95 txn/s, latency: 2473.59 ms, (p50: 2600 ms, p70: 2800, p90: 3200 ms, p99: 3300 ms), latency samples: 196080
compatibility::simple-validator-upgrade::rest-validator-upgrade : committed: 11169.49 txn/s, latency: 2866.49 ms, (p50: 2700 ms, p70: 3300, p90: 3500 ms, p99: 4200 ms), latency samples: 361620
5. check swarm health
Compatibility test for 327445f70a3b9474f13a792e8ada9b85809dec86 ==> dee9376e46f0698beb28349c283eb22b169be934 passed
Test Ok

Contributor

✅ Forge suite framework_upgrade success on 327445f70a3b9474f13a792e8ada9b85809dec86 ==> dee9376e46f0698beb28349c283eb22b169be934

Compatibility test results for 327445f70a3b9474f13a792e8ada9b85809dec86 ==> dee9376e46f0698beb28349c283eb22b169be934 (PR)
Upgrade the nodes to version: dee9376e46f0698beb28349c283eb22b169be934
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 1272.51 txn/s, submitted: 1276.04 txn/s, failed submission: 3.54 txn/s, expired: 3.54 txn/s, latency: 2361.67 ms, (p50: 2100 ms, p70: 2400, p90: 3600 ms, p99: 5400 ms), latency samples: 115180
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 1290.00 txn/s, submitted: 1292.94 txn/s, failed submission: 2.93 txn/s, expired: 2.93 txn/s, latency: 2317.81 ms, (p50: 2100 ms, p70: 2400, p90: 3600 ms, p99: 4800 ms), latency samples: 114360
5. check swarm health
Compatibility test for 327445f70a3b9474f13a792e8ada9b85809dec86 ==> dee9376e46f0698beb28349c283eb22b169be934 passed
Upgrade the remaining nodes to version: dee9376e46f0698beb28349c283eb22b169be934
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 1332.03 txn/s, submitted: 1336.29 txn/s, failed submission: 4.26 txn/s, expired: 4.26 txn/s, latency: 2328.04 ms, (p50: 2100 ms, p70: 2400, p90: 3900 ms, p99: 5400 ms), latency samples: 118820
Test Ok

@igor-aptos igor-aptos merged commit 516f32e into main Nov 23, 2024
48 checks passed
@igor-aptos igor-aptos deleted the igor/native_executor_benchmarks branch November 23, 2024 02:56
Self::get_value(account_key, state_view)
}

pub fn get_coin_store(
pub fn get_fa_store(
store_key: &StateKey,
Contributor

How does this work?
The FA store is in the resource group (rg).

Contributor Author

In the DB, yes. But that is an implementation detail of StateView; StateView handles the translation.
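
As a rough illustration of the idea in this reply (a toy model, not Aptos' actual StateView API), a view can store whole resource groups internally while exposing per-resource reads to callers. All names here (`ToyStateView`, `GroupBlob`, `get_resource`) and the `u64` address type are hypothetical.

```rust
use std::collections::{BTreeMap, HashMap};

/// Hypothetical storage layout: one entry per resource group, holding all
/// member resources serialized under their type names.
type GroupBlob = BTreeMap<String, Vec<u8>>;

/// Toy stand-in for a state view; the names here are not Aptos APIs.
struct ToyStateView {
    /// (address, group name) -> serialized group members
    groups: HashMap<(u64, String), GroupBlob>,
}

impl ToyStateView {
    /// Callers ask for a single resource; the group lookup stays internal,
    /// so the caller never needs to know the resource lives inside a group.
    fn get_resource(&self, address: u64, group: &str, member: &str) -> Option<&[u8]> {
        self.groups
            .get(&(address, group.to_string()))
            .and_then(|blob| blob.get(member))
            .map(|bytes| bytes.as_slice())
    }
}
```

Under this model a caller such as `get_fa_store` would only see the bytes of the store it asked for, regardless of how the group is laid out in the database.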

Labels
CICD:run-execution-performance-full-test Run execution performance test (full version) CICD:run-execution-performance-test Run execution performance test