Improve chainbase mapped and heap behavior #1691

Merged
merged 23 commits into from
Oct 5, 2023
Changes from 11 commits
Commits
a098946
Use Chainbase's branch with `mapped` mode updates.
greg7mdp Sep 28, 2023
121b0f7
Update chainbase to branch tip.
greg7mdp Sep 28, 2023
2914f46
Use `mapped_shared` mode for chainbase when loading snapshot.
greg7mdp Sep 28, 2023
63e6bef
Try making the new mode not the default one (renaming new mode `mappe…
greg7mdp Sep 29, 2023
73a6d22
Load snapshot in `mapped_shared` in leap-util, and revert to new mode…
greg7mdp Sep 29, 2023
7da7862
Merge branch 'main' of github.com:AntelopeIO/leap into gh_1650
greg7mdp Sep 29, 2023
8140958
Update to appbase branch tip.
greg7mdp Sep 29, 2023
418a570
Update chainbase to tip:
greg7mdp Sep 30, 2023
811192f
Update chainbase to tip.
greg7mdp Oct 2, 2023
797f045
Update appbase to branch tip.
greg7mdp Oct 2, 2023
ca9b6a9
If `mapped` mode was requested, revert to it after loading snapshot.
greg7mdp Oct 2, 2023
5618bd1
Update chainbase to branch tip.
greg7mdp Oct 3, 2023
bb86cf0
Call chainbase API to give the opportunity to flush some dirty pages.
greg7mdp Oct 3, 2023
fc421a9
Address PR comments (log info message in controller, naming)
greg7mdp Oct 3, 2023
43006e2
Make the new mode non-default; rename it `mapped_private`.
greg7mdp Oct 3, 2023
b02fa00
Call `check_memory_and_flush_if_needed()` only in write window as Mat…
greg7mdp Oct 3, 2023
6f0782d
Add snapshot load time to info log
greg7mdp Oct 4, 2023
958bc01
Merge branch 'main' of github.com:AntelopeIO/leap into gh_1650
greg7mdp Oct 4, 2023
5c9ebe7
Address PR comment (remove unneeded line)
greg7mdp Oct 4, 2023
eefda26
Add `mapped_private` description to `--help` and docs.
greg7mdp Oct 4, 2023
e023358
Merge branch 'main' of github.com:AntelopeIO/leap into gh_1650
greg7mdp Oct 4, 2023
f427935
Update to chainbase tip (`main` branch)
greg7mdp Oct 5, 2023
d8fb38c
Merge branch 'main' of github.com:AntelopeIO/leap into gh_1650
greg7mdp Oct 5, 2023
13 changes: 8 additions & 5 deletions libraries/chain/controller.cpp
@@ -37,6 +37,7 @@

#include <new>
#include <shared_mutex>
+#include <utility>

namespace eosio { namespace chain {

@@ -605,7 +606,9 @@ struct controller_impl {
}
ilog( "Snapshot loaded, lib: ${lib}", ("lib", head->block_num) );

-init(check_shutdown);
+init(std::move(check_shutdown));
+if (conf.revert_to_mapped_mode)
+db.revert_to_mapped_mode();
ilog( "Finished initialization from snapshot" );
} catch (boost::interprocess::bad_alloc& e) {
elog( "Failed initialization from snapshot - db storage not configured to have enough storage for the provided snapshot, please increase and retry snapshot" );
@@ -621,7 +624,7 @@ struct controller_impl {
("genesis_chain_id", genesis_chain_id)("controller_chain_id", chain_id)
);

-this->shutdown = shutdown;
+this->shutdown = std::move(shutdown);
if( fork_db.head() ) {
if( read_mode == db_read_mode::IRREVERSIBLE && fork_db.head()->id != fork_db.root()->id ) {
fork_db.rollback_head_to_root();
@@ -643,14 +646,14 @@ struct controller_impl {
} else {
blog.reset( genesis, head->block );
}
-init(check_shutdown);
+init(std::move(check_shutdown));
}

void startup(std::function<void()> shutdown, std::function<bool()> check_shutdown) {
EOS_ASSERT( db.revision() >= 1, database_exception, "This version of controller::startup does not work with a fresh state database." );
EOS_ASSERT( fork_db.head(), fork_database_exception, "No existing fork database despite existing chain state. Replay required." );

-this->shutdown = shutdown;
+this->shutdown = std::move(shutdown);
uint32_t lib_num = fork_db.root()->block_num;
auto first_block_num = blog.first_block_num();
if( auto blog_head = blog.head() ) {
@@ -673,7 +676,7 @@ struct controller_impl {
}
head = fork_db.head();

-init(check_shutdown);
+init(std::move(check_shutdown));
}
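
The `std::move` changes above apply to `std::function` parameters that are taken by value and then stored. A minimal sketch of why that matters, assuming a callback too large for any small-buffer optimization: assigning the parameter copies the stored callable, while moving it only transfers ownership. The names `counting_callback`, `toy_controller`, `copies_when_copying` and `copies_when_moving` are hypothetical and exist only for this illustration; they are not part of the controller.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical callable that counts how often it is copy-constructed.
// The padding keeps it out of std::function's small-object buffer, so a
// move of the std::function only transfers a pointer.
struct counting_callback {
    int* copies;
    char padding[64] = {};
    explicit counting_callback(int* c) : copies(c) {}
    counting_callback(const counting_callback& o) : copies(o.copies) { ++*copies; }
    counting_callback(counting_callback&& o) noexcept : copies(o.copies) {}
    bool operator()() const { return false; }
};

// Hypothetical stand-in for a class that stores a by-value callback.
struct toy_controller {
    std::function<bool()> check_shutdown;
    // Before the change: the by-value parameter is copied into the member.
    void init_copy(std::function<bool()> f) { check_shutdown = f; }
    // After the change: the by-value parameter is moved into the member.
    void init_move(std::function<bool()> f) { check_shutdown = std::move(f); }
};

// Number of callable copies made when the parameter is assigned by copy.
int copies_when_copying() {
    int copies = 0;
    std::function<bool()> f(counting_callback{&copies});
    copies = 0;  // ignore anything that happened during construction
    toy_controller c;
    c.init_copy(std::move(f));
    return copies;
}

// Number of callable copies made when the parameter is moved.
int copies_when_moving() {
    int copies = 0;
    std::function<bool()> f(counting_callback{&copies});
    copies = 0;  // ignore anything that happened during construction
    toy_controller c;
    c.init_move(std::move(f));
    return copies;
}
```

Calling both helpers shows the difference: the copying path duplicates the callable at least once, whereas the moving path should not copy it at all.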


1 change: 1 addition & 0 deletions libraries/chain/include/eosio/chain/controller.hpp
@@ -79,6 +79,7 @@ namespace eosio { namespace chain {
bool disable_replay_opts = false;
bool contracts_console = false;
bool allow_ram_billing_in_notify = false;
+bool revert_to_mapped_mode = false;
uint32_t maximum_variable_signature_length = chain::config::default_max_variable_signature_length;
bool disable_all_subjective_mitigations = false; //< for developer & testing purposes, can be configured using `disable-all-subjective-mitigations` when `EOSIO_DEVELOPER` build option is provided
uint32_t terminate_at_block = 0;
7 changes: 7 additions & 0 deletions plugins/chain_plugin/chain_plugin.cpp
@@ -934,6 +934,13 @@ void chain_plugin_impl::plugin_initialize(const variables_map& options) {

chain_config->db_map_mode = options.at("database-map-mode").as<pinnable_mapped_file::map_mode>();

+// when loading a snapshot, all the state will be modified, so use the `shared` mode instead
+// of `copy_on_write` to lower memory requirements
+if (snapshot_path && chain_config->db_map_mode == pinnable_mapped_file::mapped) {
+chain_config->db_map_mode = pinnable_mapped_file::mapped_shared;
Member

It would be nice to keep the new mapped mode for loading snapshots, because I think it would be a huge perf boost. (I've seen comments from some users that it takes 30 minutes to load a WAX snapshot, but it takes me less than 5 minutes in heap mode; I'm pretty sure it's disk grinding for those users.)

If we're worried about leaving 100% dirty pages after loading a snapshot, maybe one option is for chainbase to expose a flush() call that (synchronously) performs the write-out so that all the pages are clean, and for nodeos to call it after loading the snapshot but before continuing.

Contributor Author

greg7mdp Sep 30, 2023

> It would be nice to keep the new mapped mode for loading snapshots, because I think it would be a huge perf boost. (I've seen comments from some users that it takes 30 minutes to load a WAX snapshot, but it takes me less than 5 minutes in heap mode; I'm pretty sure it's disk grinding for those users.)

This would be useful when you have just enough RAM to hold all the state, but not quite enough for heap mode. I'm a little concerned that we may get more crashes this way. Maybe I can detect the available RAM and, based on the configured size of the disk db, decide whether it makes sense to use the new mapped mode (for example, go for it if RAM_size > 1.1 x chain-state-db-size-mb).
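
A minimal sketch of such a check, assuming a Linux-like system where `sysconf(_SC_PHYS_PAGES)` is available. The helper names `ram_fits_state_db` and `physical_ram_bytes` and the `headroom` parameter are hypothetical; nothing like this appears in the PR itself, and the 1.1 factor is just the example value mentioned above.

```cpp
#include <cassert>
#include <cstdint>
#include <unistd.h>

// Hypothetical helper: true when physical RAM exceeds `headroom` times the
// configured state db size (given in MiB, like chain-state-db-size-mb).
constexpr bool ram_fits_state_db(std::uint64_t ram_bytes,
                                 std::uint64_t state_db_size_mb,
                                 double headroom = 1.1) {
    return static_cast<double>(ram_bytes) >
           headroom * static_cast<double>(state_db_size_mb) * 1024.0 * 1024.0;
}

// Total physical RAM in bytes as reported by sysconf; 0 if it cannot be
// determined. _SC_PHYS_PAGES is a common extension, not strict POSIX.
inline std::uint64_t physical_ram_bytes() {
    const long pages = sysconf(_SC_PHYS_PAGES);
    const long page_size = sysconf(_SC_PAGESIZE);
    if (pages <= 0 || page_size <= 0)
        return 0;
    return static_cast<std::uint64_t>(pages) * static_cast<std::uint64_t>(page_size);
}
```

A caller would then test `ram_fits_state_db(physical_ram_bytes(), configured_size_mb)` before choosing the mode. Note that total RAM is only a rough proxy for what is actually available to the process.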

> If we're worried about leaving 100% dirty pages after loading a snapshot, maybe one option is for chainbase to expose a flush() call that (synchronously) performs the write-out so that all the pages are clean, and for nodeos to call it after loading the snapshot but before continuing.

Yes, that's a great idea. We can still use the soft-dirty mechanism: if the state-db-size is configured much larger than the db_size actually used, it will make the flush faster.

Member

Should add an ilog, or maybe even a wlog, noting that the mode was changed from the one specified.

Contributor Author

Upon reflection, I think the best compromise when loading a snapshot is:

  • Load the snapshot in mapped_shared mode. Yes, it is probably slower, but it minimizes the odds of running out of memory. With the new mapped mode, we need memory for both the data currently being read from the snapshot and the full chainbase db.
  • When the snapshot is done loading, use a new API, as suggested by Matt, which flushes the still-dirty pages to disk and restarts with a new copy_on_write mapping.

Do you guys agree?
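
The two steps above can be illustrated with plain POSIX calls. This is only a sketch of the general technique (write through a `MAP_SHARED` mapping, flush with `msync`, then re-map the same file `MAP_PRIVATE`), not chainbase's implementation, which is not shown on this page. The function name `load_shared_then_remap_private`, the temp file, and the 64 KiB size are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

// Returns true if every step behaves as described in the comments.
bool load_shared_then_remap_private() {
    char path[] = "/tmp/mapped_demo_XXXXXX";
    const int fd = mkstemp(path);
    if (fd < 0) return false;
    unlink(path);  // the file lives on until fd is closed

    const std::size_t size = 1u << 16;
    if (ftruncate(fd, static_cast<off_t>(size)) != 0) { close(fd); return false; }

    // Step 1: "load the snapshot" through a shared mapping, so the kernel
    // can write dirty pages back to the file whenever memory gets tight.
    void* shared = mmap(nullptr, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (shared == MAP_FAILED) { close(fd); return false; }
    std::memset(shared, 0x5a, size);

    // Step 2a: synchronously flush the pages that are still dirty.
    bool ok = msync(shared, size, MS_SYNC) == 0;
    munmap(shared, size);

    // Step 2b: restart with a copy-on-write (private) mapping of the file.
    void* remapped = mmap(nullptr, size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    if (remapped == MAP_FAILED) { close(fd); return false; }
    char* priv = static_cast<char*>(remapped);

    // The loaded data is visible through the new mapping.
    ok = ok && priv[0] == 0x5a && priv[size - 1] == 0x5a;

    // Writes to the private mapping stay in memory and never reach the file.
    priv[0] = 0x01;
    char on_disk = 0;
    ok = ok && pread(fd, &on_disk, 1, 0) == 1 && on_disk == 0x5a;

    munmap(priv, size);
    close(fd);
    return ok;
}
```

With a private mapping, modified pages are no longer written back by the kernel, which is why a real implementation would need its own way to persist them later.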

Contributor Author

I just checked in the implementation of the above.

Contributor Author

@heifner I updated the change to mapped_shared to be temporary (just while loading the snapshot), so I don't think an ilog is necessary, but I can add one if you think it would still be useful.

Member

No, not needed if it honors the configuration after snapshot load.

+chain_config->revert_to_mapped_mode = true; // revert to `mapped` mode after loading snapshot.
+}

#ifdef EOSIO_EOS_VM_OC_RUNTIME_ENABLED
if( options.count("eos-vm-oc-cache-size-mb") )
chain_config->eosvmoc_config.cache_size = options.at( "eos-vm-oc-cache-size-mb" ).as<uint64_t>() * 1024u * 1024u;
4 changes: 4 additions & 0 deletions programs/leap-util/actions/snapshot.cpp
@@ -80,6 +80,10 @@ int snapshot_actions::run_subcommand() {
cfg.state_size = opt->db_size * 1024 * 1024;
cfg.state_guard_size = opt->guard_size * 1024 * 1024;
cfg.eosvmoc_tierup = wasm_interface::vm_oc_enable::oc_none; // wasm not used, no use to fire up oc

+// when loading a snapshot, all the state will be modified, so use the `shared` mode instead
+// of `copy_on_write` to lower memory requirements
+cfg.db_map_mode = pinnable_mapped_file::map_mode::mapped_shared;
protocol_feature_set pfs = initialize_protocol_features( std::filesystem::path("protocol_features"), false );

try {