Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve mapped and head modes. #21

Merged
merged 33 commits into from
Oct 5, 2023
Merged

Conversation

greg7mdp
Copy link
Contributor

@greg7mdp greg7mdp commented Sep 25, 2023

  • issue with heap mode: at leap startup and exit, we perform a memcpy copy between two mappings of the full database size. This causes significant pressure on the file cache when physical RAM is less than 2x the database size

    __=> solution: use a series of smaller mappings for the copy, reducing RAM contention.

  • issue with mapped mode: when leap is running, linux will grind the disk into dust with its default dirty writeback algo. it's also a perf killer.

    => solution: map the file with MAP_PRIVATE (so changes are not written back to the file), and on exit copy just the modified pages using a series of RW mappings (info on which pages are dirty is gathered using the pagemap interface. Should we still set the dirty bit in the db file at startup?

    This new implementation of mapped mode, as well as the existing heap and locked modes, does not allow sharing an opened database in RW mode with other instances in RO mode. In order to not completely remove the sharing functionality (tested in test.cpp), a new mapped_shared mode is introduced, which is the same as the old mapped mode.

Also:

  • update minimum C++ standard requirement to C++20
  • update ci/cd platform from debian:buster to debian:bullseye to get c++20 support (std::span)

@greg7mdp greg7mdp marked this pull request as draft September 26, 2023 21:02
CMakeLists.txt Outdated Show resolved Hide resolved
include/chainbase/pagemap_accessor.hpp Show resolved Hide resolved
include/chainbase/pagemap_accessor.hpp Outdated Show resolved Hide resolved
include/chainbase/pinnable_mapped_file.hpp Show resolved Hide resolved
@greg7mdp greg7mdp marked this pull request as ready for review October 2, 2023 14:35
Copy link
Member

@heifner heifner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't run any tests with this. Will lean on @spoonincode tests he is running.

#include <vector>
#include <span>
#include <boost/interprocess/managed_external_buffer.hpp>
#include <boost/interprocess/anonymous_shared_memory.hpp>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be unused, and causes a compiler warning. Similarly for the same include file in pinnable_mapped_file.cpp

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I removed this and a couple other unneeded includes.

@@ -216,7 +300,7 @@ void pinnable_mapped_file::setup_non_file_mapping() {
round_up_mmaped_size(_2mb);
_non_file_mapped_mapping = mmap(NULL, _non_file_mapped_mapping_size, PROT_READ|PROT_WRITE, common_map_opts, VM_FLAGS_SUPERPAGE_SIZE_2MB, 0);
if(_non_file_mapped_mapping != MAP_FAILED) {
std::cerr << "CHAINBASE: Database \"" << _database_name << "\" using 2MB pages" << std::endl;
std::cerr << "CHAINBASE: Database \"" << _database_name << "\" using 2MB pages" << '\n';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is the reasoning for changing all the std::endl to '\n'?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get a warning from clang-tidy that it is a performance issue. I believe this is the reason.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But for log messages you typically do want the flush at the end of a message. Otherwise you have no guarantee how promptly the output will be visible to the user (or recorded in to some log ingest system). In the case of the % complete messages, without the flush none of these may be output until after 100% is already reached.

This can be seen with something like

#include <iostream>
#include <thread>
int main() {
	using namespace std::chrono_literals;
	std::cout << "Get Ready!" << std::endl;
	for(unsigned i = 0; i < 60; ++i) {
	        std::cout << i << '\n';
	        std::this_thread::sleep_for(1s);
	}
	return 0;
}

Then run something like

./o > output.txt & tail -f output.txt

and you will likely not see any output beyond Get Ready for an extended period of time (for me, until the application exits). Using endl instead shows each number counting up.

So for a user who sends stderr to a log file they won't see our % complete messages promptly.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cerr is non-buffered. Although, I would argue with sticking with std::endl as it is more readable and the flush should be a no-op for cerr.

Copy link
Contributor Author

@greg7mdp greg7mdp Oct 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather not add it back, unless you are worried it may cause a change of behavior. clang-tidy warns about it. And personally I don't find it more readable.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cerr is non-buffered

good point, I had forgot there was a difference there. So this change is really just ultimately a style change, and the clang-tidy commentary is not applicable

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But for log messages you typically do want the flush at the end of a message.

OK, I'll add the flush back. Would you prefer I revert to what it was before, or I use std::cout << "Hello\n" << std::flush; as mentioned here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But for log messages you typically do want the flush at the end of a message.

Like @heifner pointed out, since it's cerr it's not buffered by default anyways, so that comment from me isn't accurate in this case.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally, I'm fine either way (leaving as is or reverting or using flush or ..)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@spoonincode at this point I'm happy to make any change (or none) that are necessary for your approval today. Just let me know.


if(time(nullptr) != t) {
t = time(nullptr);
std::cerr << "CHAINBASE: Preloading \"" << _database_name << "\" database file, " << offset/(_file_mapped_region.get_size()/100) << "% complete..." << std::endl;
std::cerr << "CHAINBASE: Preloading \"" << _database_name << "\" database file, " <<
offset/(_file_mapped_region.get_size()/100) << "% complete..." << '\n';
Copy link
Member

@spoonincode spoonincode Oct 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm getting a divide by zero on this line when running nodeos like

nodeos --delete-all --snapshot /home/spoon/RAMsnaps/snapshot-1287571bdd7b7337bcaac970f9cbc03b69fcd039a0784068b97152f130b483c5.bin --chain-state-db-size-mb 24576 --database-map-mode heap

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

offset += copy_size;
}
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't this write out the same 1GB of dirty pages over and over again? I would have thought you need a call to clear_refs() to reset the bits back to clean. But.. the clear_refs interface is all-or-nothing: so you will need to write out all dirty pages.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, you are right, this doesn't help. I'll comment it out for now. I don't think the memory check I have is good enough to do a full dirty page write (I'm worried this might happen too often).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed (kinda)

// non-sharable chainbase dbs using mapped mode are flushed to disk
// ----------------------------------------------------------------------------------
for (auto pmm : _instance_tracker)
pmm->save_database_file(true);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This mechanism doesn't look thread safe at all -- if someone ctors a chainbase on a thread it can cause corruption for any other chainbase in the process if they're busy being written to in a different thread.

I don't think we need to protect against that since leap never does that.

But since we still maintain this project a submodule for others to use, I wonder if we should somehow mention this limitation anywhere (in the header file or something?).. not sure

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I'll add a mention.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

memcpy(dst, src+offset, copy_size);

if (flush) {
std::cerr << "CHAINBASE: Writing \"" << _database_name << "\" database file, flushing buffers..." << '\n';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this message makes sense to move in to the while loop. Previously it was only printed once: at the end. Now it's printed for every (non-zero) 1GB of memory -- it can even be printed more times than the status message if you're sinking more than 1GB/sec.

I think you can just get rid of this message entirely. Its original purpose was to indicate a final end stage that may not be accounted for in the percentage status messages. But if syncing every 1GB, that isn't needed any longer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I agree.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

}

std::istream& operator>>(std::istream& in, pinnable_mapped_file::map_mode& runtime) {
std::string s;
in >> s;
if (s == "mapped")
runtime = pinnable_mapped_file::map_mode::mapped;
else if (s == "mapped_private")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just looking around at other option names thinking about consistency.. prior to 5.0 I think the only option values with-/_ were eos-vm-jit, eos-vm-oc etc

It looks like 5.0's http-category-address now adds options like chain_ro,127.0.0.1:8080

sooo.. unless we want to back walk those to be chain-ro etc I guess mapped_private is fine

Copy link
Contributor Author

@greg7mdp greg7mdp Oct 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather not change that as I've already provided the writeup that was integrated into the release notes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, that is unfortunate. Wish that would have been chain-ro.

elseif(NOT CMAKE_CXX_STANDARD)
set(CMAKE_CXX_STANDARD 17)
set(CMAKE_CXX_STANDARD 20)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not critical for this PR but something that can be done in the future,

This,

cmake_minimum_required( VERSION 3.5 )

should be bumped to 3.12 as that's the first version that knows c++20.

Also this entire if/elseif/endif block is logically nonsensical. My guess was it originally required c++11, and it would make sense in that case. I might suggest changing the way this is done to how the bls lib does it.

Copy link
Contributor Author

@greg7mdp greg7mdp Oct 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do in the next PR!

@greg7mdp greg7mdp merged commit 3bfb0d0 into main Oct 5, 2023
2 checks passed
@greg7mdp greg7mdp deleted the mapped_and_heap_improvement branch October 5, 2023 03:43
greg7mdp added a commit that referenced this pull request Oct 5, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants