Improve `mapped` and `heap` modes. #21
Merged
33 commits
e5cf68b Incremental read/write for `heap` mode to reduce memory contention (greg7mdp)
8213c33 Finish implementing the `readonly` mapped mode. (greg7mdp)
c6735eb In `mapped` mode, save only modified pages at exit. (greg7mdp)
93cd1d3 Update cicd to use `debian:bullseye` instead of `debian:buster`. (greg7mdp)
19ee2e0 Avoid multiple calls to `msync` (greg7mdp)
235d956 Use boost interprocess mmap APIs (greg7mdp)
66d3326 Reuse `_file_mapping` instead of creating a new `bip::file_mapping` (greg7mdp)
832805a Fix my previous change for flushing the region to disk. (greg7mdp)
4cde714 Add `instance tracker` so that we can flush all dbs to disk before cl… (greg7mdp)
4bc07e4 Cleanup error cases. (greg7mdp)
619ba1f code cleanup and renaming some members. (greg7mdp)
4dcbb00 Add missing Boost random dependency (needed in Leap). (greg7mdp)
219e89b Update boost version (greg7mdp)
6b4eda4 Remove `benchmark` from default build. (greg7mdp)
6e1aa5a Reduce overlap of memory mappings existence. (greg7mdp)
d6c1dcc Add description for `clear_refs_failed` error (greg7mdp)
4a8070e Merge branch 'main' of github.com:AntelopeIO/chainbase into mapped_an… (greg7mdp)
c3352cc Remove unused code. (greg7mdp)
d275422 Add extra test mode `mapped_shared`. (greg7mdp)
65eefd4 Make sure we don't try to use the `pagemap` feature on platforms wher… (greg7mdp)
abc648c Remove leftover comment not necessary anymore. (greg7mdp)
da2910c Address PR comments. (greg7mdp)
7ae2b7c Add another commment. (greg7mdp)
e7a9b5a Check for db file on tempfs and refuse to start unless in `mapped_sha… (greg7mdp)
6cce710 Add API to flush RW db and convert to RO mapping after snapshot. (greg7mdp)
4ced7af Fix `divide by zero` in `heap` mode. (greg7mdp)
44c9a20 Remove some unneeded includes. (greg7mdp)
4b7cf64 `mapped` mode: add code to write some pages to disk when available RA… (greg7mdp)
4ab8944 Address PR comments. (greg7mdp)
173287c Make new node the non-default one (`mapped_private`) (greg7mdp)
7ff3038 Disable `check_memory_and_flush_if_needed()` which was not working co… (greg7mdp)
d928ec5 Address PR comment (greg7mdp)
7817736 Remove unneeded `std::cerr` message as per PR comment. (greg7mdp)
@@ -0,0 +1,168 @@
#pragma once

#include <fcntl.h>   // open
#include <unistd.h>  // pread, write, close, sysconf
#include <cstddef>   // std::byte, size_t
#include <cstdint>   // uint64_t, uintptr_t
#include <cstdlib>
#include <cstring>   // memcpy
#include <cassert>
#include <iostream>
#include <fstream>
#include <filesystem>
#include <vector>
#include <span>
#include <boost/interprocess/managed_mapped_file.hpp>

namespace chainbase {

namespace bip = boost::interprocess;

class pagemap_accessor {
public:
   ~pagemap_accessor() {
      _close();
   }

   bool clear_refs() const {
      if constexpr (!_pagemap_supported)
         return false;

      int fd = ::open("/proc/self/clear_refs", O_WRONLY);
      if (fd < 0)
         return false;

      // Clear soft-dirty bits from the task's PTEs.
      // This is done by writing "4" into the /proc/PID/clear_refs file of the task in question.
      //
      // After this, when the task tries to modify a page at some virtual address, the #PF occurs
      // and the kernel sets the soft-dirty bit on the respective PTE.
      // ----------------------------------------------------------------------------------------
      const char *v = "4";

      bool res = ::write(fd, v, 1) == 1;
      ::close(fd);
      return res;
   }

   static constexpr bool pagemap_supported() {
      return _pagemap_supported;
   }

   static bool is_marked_dirty(uint64_t entry) {
      return !!(entry & (1ULL << 55));
   }

   static size_t page_size() {
      return pagesz;
   }

   bool page_dirty(uintptr_t vaddr) const {
      uint64_t data;
      if (!read(vaddr, { &data, 1 }))
         return true; // if the entry cannot be read, conservatively report the page as dirty
      return is_marked_dirty(data);
   }

   // /proc/pid/pagemap. This file lets a userspace process find out which physical frame each virtual page
   // is mapped to. It contains one 64-bit value for each virtual page, containing the following data
   // (from fs/proc/task_mmu.c, above pagemap_read):
   //
   // Bits 0-54  page frame number (PFN) if present (note: field is zeroed for non-privileged users)
   // Bits 0-4   swap type if swapped
   // Bits 5-54  swap offset if swapped
   // Bit  55    pte is soft-dirty (see Documentation/admin-guide/mm/soft-dirty.rst)
   // Bit  56    page exclusively mapped (since 4.2)
   // Bit  57    pte is uffd-wp write-protected (since 5.13) (see Documentation/admin-guide/mm/userfaultfd.rst)
   // Bits 58-60 zero
   // Bit  61    page is file-page or shared-anon (since 3.5)
   // Bit  62    page swapped
   // Bit  63    page present
   //
   // Here we are just checking bit #55 (the soft-dirty bit).
   // ----------------------------------------------------------------------------------------------------
   bool read(uintptr_t vaddr, std::span<uint64_t> dest_uint64) const {
      if constexpr (!_pagemap_supported)
         return false;

      if (!_open()) // make sure file is open
         return false;
      assert(_pagemap_fd >= 0);
      auto dest = std::as_writable_bytes(dest_uint64);
      std::byte* cur = dest.data();
      size_t bytes_remaining = dest.size();
      uintptr_t offset = (vaddr / pagesz) * sizeof(uint64_t); // one 64-bit entry per virtual page
      while (bytes_remaining != 0) {
         ssize_t ret = pread(_pagemap_fd, cur, bytes_remaining, offset + (cur - dest.data()));
         if (ret <= 0) // error, or unexpected end of file (a return of 0 would otherwise loop forever)
            return false;
         bytes_remaining -= (size_t)ret;
         cur += ret;
      }
      return true;
   }

   // Copies the modified pages within the virtual address range specified by `rgn` to an
   // equivalent region starting at `offset` within the file referenced by `mapping`.
   // The size of the specified region *must* be a multiple of the system's page size, and the
   // specified region should exist in the disk file.
   // `written_pages` is incremented by the number of pages copied.
   // --------------------------------------------------------------------------------------
   bool update_file_from_region(std::span<std::byte> rgn, bip::file_mapping& mapping, size_t offset, bool flush, size_t& written_pages) const {
      if constexpr (!_pagemap_supported)
         return false;

      assert(rgn.size() % pagesz == 0);
      size_t num_pages = rgn.size() / pagesz;
      std::vector<uint64_t> pm(num_pages);

      // get modified pages
      if (!read((uintptr_t)rgn.data(), pm))
         return false;
      bip::mapped_region map_rgn(mapping, bip::read_write, offset, rgn.size());
      std::byte* dest = (std::byte*)map_rgn.get_address();
      if (dest) {
         for (size_t i=0; i<num_pages; ++i) {
            if (is_marked_dirty(pm[i])) {
               // extend the run [i, j) over all consecutive dirty pages, then copy it in one call
               size_t j = i + 1;
               while (j<num_pages && is_marked_dirty(pm[j]))
                  ++j;
               memcpy(dest + (i * pagesz), rgn.data() + (i * pagesz), pagesz * (j - i));
               written_pages += (j - i);
               i = j - 1; // the loop's ++i resumes at j
            }
         }
         if (flush && !map_rgn.flush(0, rgn.size(), /* async = */ false))
            std::cerr << "CHAINBASE: ERROR: flushing buffers failed" << '\n';
         return true;
      }
      return false;
   }

private:
   bool _open() const {
      assert(_pagemap_supported);
      if (_pagemap_fd < 0) {
         _pagemap_fd = ::open("/proc/self/pagemap", O_RDONLY);
         if (_pagemap_fd < 0)
            return false;
      }
      return true;
   }

   bool _close() const {
      if (_pagemap_fd >= 0) {
         assert(_pagemap_supported);
         ::close(_pagemap_fd);
         _pagemap_fd = -1;
      }
      return true;
   }

   static inline size_t pagesz = sysconf(_SC_PAGE_SIZE);

#if defined(__linux__) && defined(__x86_64__)
   static constexpr bool _pagemap_supported = true;
#else
   static constexpr bool _pagemap_supported = false;
#endif

   mutable int _pagemap_fd = -1;
};

} // namespace chainbase
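The bit layout quoted in the `pagemap_accessor` comments can be exercised without touching `/proc` at all. The following is a minimal sketch, not code from this PR: the namespace `pagemap_sketch` and all helper names in it are hypothetical, and only the bit positions (55, 62, 63) and the offset formula `(vaddr / page_size) * sizeof(uint64_t)` are taken from the diff above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helpers for illustration; names are not part of chainbase.
namespace pagemap_sketch {

// Bit positions as listed in the pagemap layout comment in the diff.
constexpr std::uint64_t soft_dirty_bit = 1ULL << 55;
constexpr std::uint64_t swapped_bit    = 1ULL << 62;
constexpr std::uint64_t present_bit    = 1ULL << 63;

constexpr bool is_soft_dirty(std::uint64_t entry) { return (entry & soft_dirty_bit) != 0; }
constexpr bool is_swapped(std::uint64_t entry)    { return (entry & swapped_bit) != 0; }
constexpr bool is_present(std::uint64_t entry)    { return (entry & present_bit) != 0; }

// Byte offset, within a pagemap file, of the 64-bit entry describing `vaddr`.
// There is one entry per virtual page, so the page index is scaled by 8.
constexpr std::uint64_t pagemap_offset(std::uint64_t vaddr, std::uint64_t page_size) {
   return (vaddr / page_size) * sizeof(std::uint64_t);
}

} // namespace pagemap_sketch
```

With a 4096-byte page, any address inside page 7 (for example `0x7000` through `0x7fff`) maps to byte offset 56 in the pagemap file, which is why `read` can fetch the entries for a contiguous region with a single sequential read starting at the offset of its first page.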
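The loop in `pagemap_accessor::read` follows the usual pattern for handling short reads from `pread`. Below is a standalone sketch of that pattern under some assumptions: the function name `pread_fully` is hypothetical, retrying on `EINTR` is an addition that the code in this PR does not perform, and a return value of 0 (end of file) is treated as failure so that the loop cannot spin when fewer bytes exist than were requested.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstdlib>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: read exactly `count` bytes at `offset`, or fail.
inline bool pread_fully(int fd, void* buf, std::size_t count, off_t offset) {
   auto* cur = static_cast<char*>(buf);
   while (count != 0) {
      ssize_t ret = ::pread(fd, cur, count, offset);
      if (ret < 0 && errno == EINTR)
         continue;                 // interrupted by a signal: retry (assumption, not in the PR)
      if (ret <= 0)
         return false;             // error, or end of file before `count` bytes were read
      cur    += ret;               // advance past the bytes already received
      offset += ret;
      count  -= static_cast<std::size_t>(ret);
   }
   return true;
}
```

A caller would pass an open descriptor and the exact number of bytes it expects; a `false` result means the buffer contents should not be trusted.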
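`update_file_from_region` copies each run of consecutive dirty pages with a single `memcpy` rather than one call per page. The run detection can be sketched in isolation as follows. The function name `dirty_runs` and its return type are hypothetical and exist only for this illustration; the soft-dirty test on bit 55 mirrors `is_marked_dirty` from the diff.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helper: returns half-open page-index ranges [first, last)
// covering each maximal run of consecutive soft-dirty pagemap entries.
inline std::vector<std::pair<std::size_t, std::size_t>>
dirty_runs(const std::vector<std::uint64_t>& entries) {
   constexpr std::uint64_t soft_dirty = 1ULL << 55; // same bit as is_marked_dirty
   std::vector<std::pair<std::size_t, std::size_t>> runs;
   std::size_t i = 0;
   while (i < entries.size()) {
      if ((entries[i] & soft_dirty) == 0) {
         ++i;                       // clean page: nothing to copy
         continue;
      }
      std::size_t j = i + 1;        // extend the run while pages stay dirty
      while (j < entries.size() && (entries[j] & soft_dirty) != 0)
         ++j;
      runs.emplace_back(i, j);
      i = j;                        // entry j is clean or past the end
   }
   return runs;
}
```

Each returned pair `(first, last)` corresponds to one copy of `(last - first) * page_size` bytes starting at byte offset `first * page_size` within the region, which keeps the number of copy calls proportional to the number of runs rather than the number of dirty pages.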
Not critical for this PR, but something that can be done in the future: this (chainbase/CMakeLists.txt, line 2 in 7817736) should be bumped to 3.12, as that's the first version that knows C++20.
Also, this entire if/elseif/endif block is logically nonsensical. My guess is that it originally required C++11, and it would make sense in that case. I might suggest changing the way this is done to how the bls lib does it.
Will do in the next PR!