
Recovery #397

Draft · wants to merge 75 commits into base: master

Commits (changes below shown from 49 of the 75 commits)
7ca757c
recovery files added
cassyunknown Feb 12, 2022
7e32771
removed workflows
cassyunknown Feb 12, 2022
26b55b5
added back workflows
cassyunknown Feb 12, 2022
356e4fd
removed newline from .gitignore
cassyunknown Feb 14, 2022
0503216
added descriptions for restore() and graceful_shutdown()
cassyunknown Feb 14, 2022
5ea5667
added full tempfile version
cassyunknown Feb 15, 2022
4ccf01f
changed derive traits to be in order
cassyunknown Feb 15, 2022
592a823
new function that generates hash_builder, to remove repeated code
cassyunknown Feb 15, 2022
67d5aa5
removed ccc comments
cassyunknown Feb 15, 2022
9caeec5
changed will_resore to restore. Moved hash_builder() so it is in scope
cassyunknown Feb 15, 2022
b4bed3a
changed will_resore to restore
cassyunknown Feb 15, 2022
d2966d2
took out of a Boxed struct so I can use helper function store() to c…
cassyunknown Feb 15, 2022
deaaa91
created store() and now will integrate into all demolish() functions
cassyunknown Feb 15, 2022
677200b
all of HashTable::demolish() uses store(). Will make similar changes …
cassyunknown Feb 15, 2022
3603c1b
updated store() so it also returns new offset
cassyunknown Feb 15, 2022
09a1f65
updated store() so it also returns new offset
cassyunknown Feb 15, 2022
21ba045
added helper function total_buckets() to reduce repeated code
cassyunknown Feb 15, 2022
7dc7d73
integrated store_bytes_and_update_offset() into Segments
cassyunknown Feb 15, 2022
45aa5b1
reduced number of variables from Segments::restore()
cassyunknown Feb 15, 2022
7c15eb8
added store_bytes_and_update_offset() usage to TtlBuckets::demolish()…
cassyunknown Feb 15, 2022
5d0e633
moved store_bytes_and_update_offset() to store.rs so there is only 1 …
cassyunknown Feb 15, 2022
aeb62de
changed back to only having from_builder() to build. This function d…
cassyunknown Feb 15, 2022
49c06e3
fixed ordering of derive traits for TtlBucket
cassyunknown Feb 15, 2022
1606340
removed unnecessary comments
cassyunknown Feb 15, 2022
1caaf22
removed newline
cassyunknown Feb 15, 2022
83e428a
removed newline
cassyunknown Feb 15, 2022
2df8f22
added TODOs to implement Drop trait for TtlBuckets
cassyunknown Feb 16, 2022
7be8f07
Added Brian's File::create() changes
cassyunknown Feb 16, 2022
19385a4
implemented PartialEq for HashTable
cassyunknown Feb 16, 2022
f7e9468
implemented PartialEq trait for TtlBuckets, Segments and Seg
cassyunknown Feb 16, 2022
710674d
removed _restored field from Seg and replaced it with a restored() fu…
cassyunknown Feb 16, 2022
a73aa63
Clone trait now implemented for Segments and Seg can now derive Clone…
cassyunknown Feb 16, 2022
7334b3f
changed implementation of PartialEq to be less awkward for Segments, …
cassyunknown Feb 17, 2022
9e9dd5c
ran cargo fmt
cassyunknown Feb 17, 2022
de08656
update File::create() documentation
cassyunknown Feb 17, 2022
6d77a73
update File::create() documentation
cassyunknown Feb 17, 2022
d7692ef
removed conditional derivations in Seg, Segments, HashTable and TtlBu…
cassyunknown Feb 17, 2022
8d72b48
cannot reproduce failure of test new_file_backed_cache_changed_and_re…
cassyunknown Feb 17, 2022
5b6dd2f
implemented From<Box<[u8]>> trait for Memory
cassyunknown Feb 17, 2022
f1ed123
Merge branch 'master' into recovery
cassyunknown Feb 17, 2022
bdd17b6
moved merged non-recovery tests to above recovery section
cassyunknown Feb 17, 2022
f47db51
fixed mismatched bracket introduced by merge
cassyunknown Feb 17, 2022
739365e
fixed mismatched bracket introduced by merge
cassyunknown Feb 17, 2022
186b9b4
added a Seg.flush() function
cassyunknown Feb 17, 2022
ed43a6d
completed flush() for Segments
cassyunknown Feb 17, 2022
917855e
added flush() to TtlBuckets and HashTable. Next step: changes tests.r…
cassyunknown Feb 17, 2022
995544a
replaced demolish() with flush() in tests.rs. Deleted all traces of D…
cassyunknown Feb 17, 2022
43eef14
replaced demolish() with flush() in tests.rs. Deleted all traces of D…
cassyunknown Feb 17, 2022
d62bfb0
ran cargo fmt
cassyunknown Feb 17, 2022
92e1f24
ran cargo fmt
cassyunknown Feb 17, 2022
dc85b30
deleted tests that are attempting recovery from files that don't exis…
cassyunknown Feb 24, 2022
cefb4b9
uncommented File::create() check of expected size of file as there is…
cassyunknown Feb 24, 2022
6197220
Changed code so that HashTable and TtlBuckets share same file. Now wi…
cassyunknown Feb 24, 2022
154549b
refactored code to remove now unnecessary fields from HashTable and T…
cassyunknown Feb 24, 2022
eb671d0
neatened up comments for HashTable and TtlBuckets
cassyunknown Feb 24, 2022
62c953e
all of Segments, HashTable and TtlBuckets restored from same file
cassyunknown Feb 24, 2022
8816e11
ran cargo fmt
cassyunknown Feb 24, 2022
396e704
added a quit() function to be called when a Quit request is received
cassyunknown Mar 3, 2022
ea1e114
removed quit() as this is not what we want (termination of connection…
cassyunknown Mar 3, 2022
85531a3
added stop() function to be called when the request is Stop
cassyunknown Mar 3, 2022
2396be8
added stop() function to be called when the request is Stop
cassyunknown Mar 3, 2022
c641d8f
fixed seg tests so graceful shutdown was part of configuration
cassyunknown Mar 3, 2022
b1ed33c
added test config file
cassyunknown Mar 3, 2022
466a4c1
ran cargo fmt
cassyunknown Mar 3, 2022
7d7e952
fixed bug in config file
cassyunknown Mar 3, 2022
e8c0b6e
Added the Stop command wherever FlushAll command was implemented
cassyunknown Mar 7, 2022
85ab29a
stop signal sent to admin stops cache from processing events and shou…
cassyunknown Mar 8, 2022
7dc5c48
listener, admin and multi now return upon Stop
cassyunknown Mar 9, 2022
9d46425
attempted to add admin's own QueuePair to its signal_queue so it can …
cassyunknown Mar 9, 2022
2318a47
changed back to id=0 when admin receiving signals
cassyunknown Mar 9, 2022
be9a15a
undid Stop response through non-admin port
cassyunknown Mar 15, 2022
ae713b7
removed non-admin parsing of Stop command
cassyunknown Mar 15, 2022
553b333
updated config
cassyunknown Mar 16, 2022
befab11
ran cargo fmt
cassyunknown Mar 16, 2022
19e0147
removed unnecessary adding to admin queue
cassyunknown Mar 23, 2022
33 changes: 33 additions & 0 deletions Cargo.lock


84 changes: 83 additions & 1 deletion src/rust/config/src/seg.rs
@@ -8,6 +8,10 @@ use serde::{Deserialize, Serialize};

const MB: usize = 1024 * 1024;

// restore and graceful shutdown options
const RESTORE: bool = false;
const GRACEFUL_SHUTDOWN: bool = false;

// defaults for hashtable
const HASH_POWER: u8 = 16;
const OVERFLOW_FACTOR: f64 = 1.0;
@@ -24,9 +28,18 @@ const COMPACT_TARGET: usize = 2;
const MERGE_TARGET: usize = 4;
const MERGE_MAX: usize = 8;

// datapool
// datapool (`Segments.data`)
const DATAPOOL_PATH: Option<&str> = None;

// `Segments` fields
const SEGMENT_FIELDS_PATH: Option<&str> = None;

// ttl buckets
const TTL_BUCKETS_PATH: Option<&str> = None;

// hashtable
const HASHTABLE_PATH: Option<&str> = None;

#[derive(Copy, Clone, Debug, Serialize, Deserialize)]
pub enum Eviction {
None,
@@ -39,6 +52,14 @@ pub enum Eviction {
}

// helper functions for default values
fn restore() -> bool {
RESTORE
}

fn graceful_shutdown() -> bool {
GRACEFUL_SHUTDOWN
}

fn hash_power() -> u8 {
HASH_POWER
}
@@ -75,9 +96,25 @@ fn datapool_path() -> Option<String> {
DATAPOOL_PATH.map(|v| v.to_string())
}

fn segments_fields_path() -> Option<String> {
SEGMENT_FIELDS_PATH.map(|v| v.to_string())
}

fn ttl_buckets_path() -> Option<String> {
TTL_BUCKETS_PATH.map(|v| v.to_string())
}

fn hashtable_path() -> Option<String> {
HASHTABLE_PATH.map(|v| v.to_string())
}

// definitions
#[derive(Serialize, Deserialize, Debug)]
pub struct Seg {
#[serde(default = "restore")]
restore: bool,
#[serde(default = "graceful_shutdown")]
graceful_shutdown: bool,
#[serde(default = "hash_power")]
hash_power: u8,
#[serde(default = "overflow_factor")]
@@ -96,11 +133,19 @@ pub struct Seg {
compact_target: usize,
#[serde(default = "datapool_path")]
datapool_path: Option<String>,
#[serde(default = "segments_fields_path")]
segments_fields_path: Option<String>,
#[serde(default = "ttl_buckets_path")]
ttl_buckets_path: Option<String>,
#[serde(default = "hashtable_path")]
hashtable_path: Option<String>,
}

impl Default for Seg {
fn default() -> Self {
Self {
restore: restore(),
graceful_shutdown: graceful_shutdown(),
hash_power: hash_power(),
overflow_factor: overflow_factor(),
heap_size: heap_size(),
@@ -110,12 +155,31 @@ impl Default for Seg {
merge_max: merge_max(),
compact_target: compact_target(),
datapool_path: datapool_path(),
segments_fields_path: segments_fields_path(),
ttl_buckets_path: ttl_buckets_path(),
hashtable_path: hashtable_path(),
}
}
}
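
Every field above has a serde default, so a config file only needs the keys it overrides. A hypothetical TOML fragment enabling recovery could look like the following (the section name and paths are illustrative assumptions, not taken from the repo):

```toml
# illustrative only -- enable restore + graceful shutdown with file backing
[seg]
restore = true
graceful_shutdown = true
datapool_path = "/var/cache/segcache/data.pool"
segments_fields_path = "/var/cache/segcache/segments.fields"
ttl_buckets_path = "/var/cache/segcache/ttl.buckets"
hashtable_path = "/var/cache/segcache/hashtable"
```

Any key left out falls back to the `fn restore()`-style default helpers above.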

// implementation
impl Seg {
/// Determines whether the `Seg` will be restored.
/// Restoration will succeed only if `datapool_path`, `segments_fields_path`,
/// `ttl_buckets_path`, and `hashtable_path` are all valid paths.
/// Otherwise, the `Seg` will be created as new.
pub fn restore(&self) -> bool {
self.restore
}

/// Determines whether the `Seg` will be gracefully shut down.
/// The graceful shutdown will succeed only if the cache is file-backed and
/// `segments_fields_path`, `ttl_buckets_path`, and `hashtable_path` are
/// valid paths to which the relevant `Seg` fields can be saved.
/// Otherwise, the relevant `Seg` fields will not be saved.
pub fn graceful_shutdown(&self) -> bool {
self.graceful_shutdown
}
pub fn hash_power(&self) -> u8 {
self.hash_power
}
@@ -151,6 +215,24 @@ impl Seg {
pub fn datapool_path(&self) -> Option<PathBuf> {
self.datapool_path.as_ref().map(|v| Path::new(v).to_owned())
}

pub fn segments_fields_path(&self) -> Option<PathBuf> {
Contributor: Should each of these paths be its own config option? I suspect we can be opinionated about the names for each file if we decide to keep the parts separate.

Author (cassyunknown): Yes, I was going to wait until we decided how many files to use.
self.segments_fields_path
.as_ref()
.map(|v| Path::new(v).to_owned())
}

pub fn ttl_buckets_path(&self) -> Option<PathBuf> {
self.ttl_buckets_path
.as_ref()
.map(|v| Path::new(v).to_owned())
}

pub fn hashtable_path(&self) -> Option<PathBuf> {
self.hashtable_path
.as_ref()
.map(|v| Path::new(v).to_owned())
}
}

// trait definitions
14 changes: 14 additions & 0 deletions src/rust/entrystore/src/seg/mod.rs
@@ -43,16 +43,30 @@ impl Seg {

// build the datastructure from the config
let data = ::seg::Seg::builder()
.restore(config.restore())
.hash_power(config.hash_power())
.overflow_factor(config.overflow_factor())
.heap_size(config.heap_size())
.segment_size(config.segment_size())
.eviction(eviction)
.datapool_path(config.datapool_path())
.segments_fields_path(config.segments_fields_path())
.ttl_buckets_path(config.ttl_buckets_path())
.hashtable_path(config.hashtable_path())
.build();

Self { data }
}

/// Flush (gracefully shut down) the `Seg` cache if configured to do so
pub fn flush<T: SegConfig>(self, config: &T) {
let config = config.seg();
Author (cassyunknown, Feb 17, 2022): I couldn't find a trait that I should implement (one that only has flush() as a required method), so I just made this function a method of Seg. I did the same for the other files I added a flush() to.


if config.graceful_shutdown() {
// TODO: check if successfully shutdown and record result
self.data.flush();
Author (cassyunknown): This line currently produces a warning because I am ignoring the Result. Ideally the result would be interpreted here or passed along to server/segcache/src/lib.rs, where I intend this flush() function to be called.

};
}
}
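
As the author's comment notes, the `Result` of `flush()` is currently discarded. A minimal sketch of one alternative (names are illustrative, not the crate's API: `do_flush` stands in for `self.data.flush()`) is to have the wrapper return the `Result`, so the caller in server/segcache can decide how to react:

```rust
use std::io;

// Hedged sketch: propagate the flush outcome instead of silently
// ignoring it. `do_flush` is a stand-in for the real flush call.
fn flush_if_configured<F>(graceful_shutdown: bool, do_flush: F) -> io::Result<()>
where
    F: FnOnce() -> io::Result<()>,
{
    if graceful_shutdown {
        // `?` forwards any I/O error to the caller
        do_flush()?;
    }
    Ok(())
}
```

With this shape, the unused-result warning disappears and the shutdown path can log or surface the error.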

impl EntryStore for Seg {
1 change: 1 addition & 0 deletions src/rust/server/segcache/src/lib.rs
@@ -71,6 +71,7 @@ impl Segcache {
/// fully terminated. This is more likely to be used for running integration
/// tests or other automated testing.
pub fn shutdown(self) {
// TODO: flush the cache
self.process.shutdown()
}
}
1 change: 1 addition & 0 deletions src/rust/storage/seg/Cargo.toml
@@ -35,6 +35,7 @@ rand_chacha = { version = "0.3.0" }
rand_xoshiro = { version = "0.6.0" }
storage-types = { path = "../types" }
thiserror = "1.0.24"
tempfile = "3.3.0"

[dev-dependencies]
criterion = "0.3.4"
81 changes: 74 additions & 7 deletions src/rust/storage/seg/src/builder.rs
@@ -6,26 +6,41 @@

use crate::*;
use std::path::Path;
use std::path::PathBuf;

/// A builder that is used to construct a new [`Seg`] instance.
pub struct Builder {
restore: bool,
hash_power: u8,
overflow_factor: f64,
segments_builder: SegmentsBuilder,
ttl_buckets_path: Option<PathBuf>,
hashtable_path: Option<PathBuf>,
}

// Defines the default parameters
impl Default for Builder {
fn default() -> Self {
Self {
restore: false,
hash_power: 16,
overflow_factor: 0.0,
segments_builder: SegmentsBuilder::default(),
ttl_buckets_path: None,
hashtable_path: None,
}
}
}
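
The setters below all follow the consuming-builder style: each takes `mut self` and returns `Self`, so calls chain and `build()` consumes the builder. A minimal, stdlib-only sketch of that pattern (`SketchBuilder` is illustrative, not part of the seg crate):

```rust
// Illustrative consuming builder: setters take and return `self`.
#[derive(Default)]
struct SketchBuilder {
    restore: bool,
    hash_power: u8,
}

impl SketchBuilder {
    fn restore(mut self, restore: bool) -> Self {
        self.restore = restore;
        self
    }

    fn hash_power(mut self, hash_power: u8) -> Self {
        self.hash_power = hash_power;
        self
    }

    // `build()` takes `self` by value, so the builder cannot be reused.
    fn build(self) -> (bool, u8) {
        (self.restore, self.hash_power)
    }
}
```

This is why the real `restore()` setter can also forward the flag into `self.segments_builder` before returning `self`.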

impl Builder {
/// Specify to `Builder` and `SegmentsBuilder` whether the cache will be restored.
/// Otherwise, the cache will be created and treated as new.
pub fn restore(mut self, restore: bool) -> Self {
self.restore = restore;
self.segments_builder = self.segments_builder.restore(restore);
self
}

/// Specify the hash power, which limits the size of the hashtable to 2^N
/// entries. 1/8th of these are used for metadata storage, meaning that the
/// total number of items which can be held in the cache is limited to
@@ -135,17 +150,35 @@ impl Builder {
self
}

/// Specify a backing file to be used for segment storage.
///
/// # Panics
///
/// This will panic if the file already exists
/// Specify a backing file to be used for `Segments.data` storage.
pub fn datapool_path<T: AsRef<Path>>(mut self, path: Option<T>) -> Self {
self.segments_builder = self.segments_builder.datapool_path(path);
self
}

/// Specify a backing file to be used for `Segments` fields' storage.
pub fn segments_fields_path<T: AsRef<Path>>(mut self, path: Option<T>) -> Self {
self.segments_builder = self.segments_builder.segments_fields_path(path);
self
}

/// Specify a backing file to be used for `TtlBuckets` storage.
pub fn ttl_buckets_path<T: AsRef<Path>>(mut self, path: Option<T>) -> Self {
self.ttl_buckets_path = path.map(|p| p.as_ref().to_owned());
self
}

/// Specify a backing file to be used for `HashTable` storage.
pub fn hashtable_path<T: AsRef<Path>>(mut self, path: Option<T>) -> Self {
self.hashtable_path = path.map(|p| p.as_ref().to_owned());
self
}

/// Consumes the builder and returns a fully-allocated `Seg` instance.
/// If `restore` is set and valid paths to the structures are given, the
/// `Seg` will be restored; otherwise a new `Seg` instance is created. If
/// valid paths are given, the structures are copied to the files at those
/// paths on graceful shutdown.
///
/// ```
/// use seg::{Policy, Seg};
@@ -159,10 +192,44 @@
/// .eviction(Policy::Random).build();
/// ```
pub fn build(self) -> Seg {
let hashtable = HashTable::new(self.hash_power, self.overflow_factor);
// Build `Segments`. If a datapool path is set, `Segments.data` will be
// file-backed. If `restore` is set and there is a path for the `Segments`
// fields, restore the remaining `Segments` fields as well.
let segments = self.segments_builder.build();
let ttl_buckets = TtlBuckets::default();

// If `Segments` was successfully restored and `restore` is set
if segments.fields_copied_back && self.restore {
// Attempt to restore `HashTable` and `TtlBuckets`
let hashtable = HashTable::restore(
self.hashtable_path.clone(),
self.hash_power,
self.overflow_factor,
);
let ttl_buckets = TtlBuckets::restore(self.ttl_buckets_path.clone());

// If successful, return a restored segcache
if hashtable.table_copied_back && ttl_buckets.buckets_copied_back {
return Seg {
hashtable,
segments,
ttl_buckets,
};
}
}

// TODO: Should the paths be checked here to see if any are None (or
// otherwise invalid)? Then we could take an "all or nothing" approach:
// if any one path is invalid, all structures are created as new AND no
// paths are set for graceful shutdown; otherwise, if `restore` is set,
// we restore from these paths, else we save to them on shutdown.
// Currently this is not done: because `Segments` has a separate builder
// and different control flow, it is too awkward to implement.

// If not `restore` or restoration failed, create a new cache
let hashtable = HashTable::new(self.hashtable_path, self.hash_power, self.overflow_factor);
let ttl_buckets = TtlBuckets::new(self.ttl_buckets_path);
Seg {
hashtable,
segments,
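
The "all or nothing" check floated in the TODO above could be sketched as follows (a hedged, stdlib-only illustration; `should_attempt_restore` is not the crate's API):

```rust
use std::path::PathBuf;

// Attempt restoration only when the flag is set AND every backing path
// is configured; otherwise the caller falls back to a fresh cache.
fn should_attempt_restore(restore: bool, paths: &[&Option<PathBuf>]) -> bool {
    restore && paths.iter().all(|p| p.is_some())
}
```

Running this check once in `build()` would give the restore and shutdown paths a single, consistent decision point.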