-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve mapped
and head
modes.
#21
Conversation
Instead of two large file mappings, use a series of smapp mappings for the copy.
…earing the Soft-Dirty bits.
…d_heap_improvement
c63a9ed
to
4a8070e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't run any tests with this. Will lean on @spoonincode tests he is running.
#include <vector> | ||
#include <span> | ||
#include <boost/interprocess/managed_external_buffer.hpp> | ||
#include <boost/interprocess/anonymous_shared_memory.hpp> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to be unused, and causes a compiler warning. Similarly for the same include file in pinnable_mapped_file.cpp
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I removed this and a couple other unneeded includes.
@@ -216,7 +300,7 @@ void pinnable_mapped_file::setup_non_file_mapping() { | |||
round_up_mmaped_size(_2mb); | |||
_non_file_mapped_mapping = mmap(NULL, _non_file_mapped_mapping_size, PROT_READ|PROT_WRITE, common_map_opts, VM_FLAGS_SUPERPAGE_SIZE_2MB, 0); | |||
if(_non_file_mapped_mapping != MAP_FAILED) { | |||
std::cerr << "CHAINBASE: Database \"" << _database_name << "\" using 2MB pages" << std::endl; | |||
std::cerr << "CHAINBASE: Database \"" << _database_name << "\" using 2MB pages" << '\n'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is the reasoning for changing all the std::endl
to '\n'
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I get a warning from clang-tidy that it is a performance issue. I believe this is the reason.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But for log messages you typically do want the flush at the end of a message. Otherwise you have no guarantee how promptly the output will be visible to the user (or recorded in to some log ingest system). In the case of the % complete
messages, without the flush none of these may be output until after 100% is already reached.
This can be seen with something like
#include <iostream>
#include <thread>
int main() {
using namespace std::chrono_literals;
std::cout << "Get Ready!" << std::endl;
for(unsigned i = 0; i < 60; ++i) {
std::cout << i << '\n';
std::this_thread::sleep_for(1s);
}
return 0;
}
Then run something like
./o > output.txt & tail -f output.txt
and you will likely not see any output beyond Get Ready
for an extended period of time (for me, until the application exits). Using endl
instead shows each number counting up.
So for a user who sends stderr to a log file they won't see our % complete
messages promptly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cerr
is non-buffered. Although, I would argue with sticking with std::endl
as it is more readable and the flush
should be a no-op for cerr
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rather not add it back, unless you are worried it may cause a change of behavior. clang-tidy warns about it. And personally I don't find it more readable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cerr
is non-buffered
good point, I had forgot there was a difference there. So this change is really just ultimately a style change, and the clang-tidy commentary is not applicable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But for log messages you typically do want the flush at the end of a message.
OK, I'll add the flush back. Would you prefer I revert to what it was before, or I use std::cout << "Hello\n" << std::flush;
as mentioned here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But for log messages you typically do want the flush at the end of a message.
Like @heifner pointed out, since it's cerr
it's not buffered by default anyways, so that comment from me isn't accurate in this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally, I'm fine either way (leaving as is or reverting or using flush or ..)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@spoonincode at this point I'm happy to make any change (or none) that are necessary for your approval today. Just let me know.
src/pinnable_mapped_file.cpp
Outdated
|
||
if(time(nullptr) != t) { | ||
t = time(nullptr); | ||
std::cerr << "CHAINBASE: Preloading \"" << _database_name << "\" database file, " << offset/(_file_mapped_region.get_size()/100) << "% complete..." << std::endl; | ||
std::cerr << "CHAINBASE: Preloading \"" << _database_name << "\" database file, " << | ||
offset/(_file_mapped_region.get_size()/100) << "% complete..." << '\n'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm getting a divide by zero on this line when running nodeos like
nodeos --delete-all --snapshot /home/spoon/RAMsnaps/snapshot-1287571bdd7b7337bcaac970f9cbc03b69fcd039a0784068b97152f130b483c5.bin --chain-state-db-size-mb 24576 --database-map-mode heap
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
offset += copy_size; | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Won't this write out the same 1GB of dirty pages over and over again? I would have thought you need a call to clear_refs()
to reset the bits back to clean. But.. the clear_refs
interface is all-or-nothing: so you will need to write out all dirty pages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, you are right, this doesn't help. I'll comment it out for now. I don't think the memory check I have is good enough to do a full dirty page write (I'm worried this might happen too often).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed (kinda)
…rrectly as Matt pointed out.
// non-sharable chainbase dbs using mapped mode are flushed to disk | ||
// ---------------------------------------------------------------------------------- | ||
for (auto pmm : _instance_tracker) | ||
pmm->save_database_file(true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This mechanism doesn't look thread safe at all -- if someone ctors a chainbase on a thread it can cause corruption for any other chainbase in the process if they're busy being written to in a different thread.
I don't think we need to protect against that since leap never does that.
But since we still maintain this project a submodule for others to use, I wonder if we should somehow mention this limitation anywhere (in the header file or something?).. not sure
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, I'll add a mention.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
src/pinnable_mapped_file.cpp
Outdated
memcpy(dst, src+offset, copy_size); | ||
|
||
if (flush) { | ||
std::cerr << "CHAINBASE: Writing \"" << _database_name << "\" database file, flushing buffers..." << '\n'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this message makes sense to move in to the while loop. Previously it was only printed once: at the end. Now it's printed for every (non-zero) 1GB of memory -- it can even be printed more times than the status message if you're sinking more than 1GB/sec.
I think you can just get rid of this message entirely. Its original purpose was to indicate a final end stage that may not be accounted for in the percentage status messages. But if syncing every 1GB, that isn't needed any longer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I agree.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
} | ||
|
||
std::istream& operator>>(std::istream& in, pinnable_mapped_file::map_mode& runtime) { | ||
std::string s; | ||
in >> s; | ||
if (s == "mapped") | ||
runtime = pinnable_mapped_file::map_mode::mapped; | ||
else if (s == "mapped_private") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just looking around at other option names thinking about consistency.. prior to 5.0 I think the only option values with-
/_
were eos-vm-jit
, eos-vm-oc
etc
It looks like 5.0's http-category-address
now adds options like chain_ro,127.0.0.1:8080
sooo.. unless we want to back walk those to be chain-ro
etc I guess mapped_private
is fine
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rather not change that as I've already provided the writeup that was integrated into the release notes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, that is unfortunate. Wish that would have been chain-ro
.
elseif(NOT CMAKE_CXX_STANDARD) | ||
set(CMAKE_CXX_STANDARD 17) | ||
set(CMAKE_CXX_STANDARD 20) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not critical for this PR but something that can be done in the future,
This,
Line 2 in 7817736
cmake_minimum_required( VERSION 3.5 ) |
should be bumped to 3.12 as that's the first version that knows c++20.
Also this entire if/elseif/endif
block is logically nonsensical. My guess was it originally required c++11, and it would make sense in that case. I might suggest changing the way this is done to how the bls lib does it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do in the next PR!
issue with
heap
mode: at leap startup and exit, we perform amemcpy
copy between two mappings of the full database size. This causes significant pressure on the file cache when physical RAM is less than 2x the database size__=> solution: use a series of smaller mappings for the copy, reducing RAM contention.
issue with
mapped
mode: when leap is running, linux will grind the disk into dust with its default dirty writeback algo. it's also a perf killer.=> solution: map the file with
MAP_PRIVATE
(so changes are not written back to the file), and on exit copy just the modified pages using a series of RW mappings (info on which pages are dirty is gathered using the pagemap interface. Should we still set thedirty
bit in the db file at startup?This new implementation of
mapped
mode, as well as the existingheap
andlocked
modes, does not allow sharing an opened database in RW mode with other instances in RO mode. In order to not completely remove the sharing functionality (tested in test.cpp), a newmapped_shared
mode is introduced, which is the same as the oldmapped
mode.Also:
C++20
debian:buster
todebian:bullseye
to getc++20
support (std::span
)