Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IF: Persist hotstuff safety state #1578

Merged
merged 17 commits into from
Nov 19, 2023

Conversation

systemzax
Copy link
Member

This branch addresses #1521.

@heifner heifner linked an issue Aug 30, 2023 that may be closed by this pull request
@@ -12,10 +12,46 @@ namespace eosio::chain {

using hs_bitset = boost::dynamic_bitset<uint32_t>;

static digest_type get_digest_to_sign(const block_id_type& block_id, uint8_t phase_counter, const fc::sha256& final_on_qc){
digest_type h1 = digest_type::hash( std::make_pair( block_id, phase_counter ) );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std::make_pair( std::cref(block_id), phase_counter ) avoids copy.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually you don't need the make_pair constructor!

Copy link
Contributor

@greg7mdp greg7mdp Aug 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removing the std::pair constructor:

    struct view_number {
       view_number() : _block_height(0), _phase_counter(0) {}
-      explicit view_number(std::pair<uint32_t, uint8_t> data) :
-         _block_height(data.first),
-         _phase_counter(data.second)
-      {
-      }
       
       explicit view_number(uint32_t block_height, uint8_t phase_counter) :
          _block_height(block_height),
@@ -44,9 +39,7 @@ namespace eosio::chain {
       uint64_t    to_uint64_t() const { return compute_height(_block_height, _phase_counter);   }
       std::string to_string()   const { return std::to_string(_block_height) + "::" + std::to_string(_phase_counter); }
 
-      uint64_t get_key() const { return get_view_number().to_uint64_t(); };
-
-      view_number get_view_number() const { return view_number{ block_height(), phase_counter() }; };
+      uint64_t get_key() const { return to_uint64_t(); };
 
       uint32_t _block_height;
       uint8_t  _phase_counter;
@@ -82,7 +75,7 @@ namespace eosio::chain {
       uint32_t block_num() const { return block_header::num_from_id(block_id); }
       uint64_t get_key() const { return compute_height(block_header::num_from_id(block_id), phase_counter); };
 
-      view_number get_view_number() const { return view_number(std::make_pair(block_header::num_from_id(block_id), phase_counter)); };
+      view_number get_view_number() const { return view_number(block_header::num_from_id(block_id), phase_counter); };
 
    };
 
modified   libraries/hotstuff/test/hotstuff_tools.cpp
@@ -56,8 +56,8 @@ BOOST_AUTO_TEST_CASE(view_number_tests) try {
 
   //test constructor
 
-  eosio::hotstuff::view_number vn_6 = eosio::hotstuff::view_number(std::make_pair(194217068, 2));
+  eosio::hotstuff::view_number vn_6(194217068, 2);
 
   BOOST_CHECK_EQUAL(vn_5 == vn_6, true);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused? How do you do this without make_pair or some kind of tuple struct type?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that the std::pair constructor of view_number is not needed, but what does that have to do with this get_digest_to_sign implementation?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment chain is on get_digest_to_sign. I think we are in agreement.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yes, my bad, sorry. For the get_digest_to_sign, can we only hash a pair and not a tuple? would be nicer to hash all 3 at once.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, no reason to hash them separately. Good call.

Copy link
Member

@heifner heifner Sep 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would do this:

struct hotstuff_digest_type { 
   const block_id_type& block_id; 
   uint8_t phase_counter = 0; 
   const fc::sha256& final_on_qc;

   digest_type to_digest() const { digest_type::hash( *this ); }
}; 

auto digest = hotstuff_digest_type{blk_id, phase_c, f_on_qc}.to_digest();

FC_REFLECT(eosio::chain::hotstuff_digest_type, (block_id)(phase_counter)(final_on_qc));

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The digest simplification is great but I think it's going to change all of the hotstuff unit tests since those use hardcoded hash values. If that's the case then I think we should do that in a separate issue/PR, referencing the comment with the fix: #1578 (comment)

libraries/chain/include/eosio/chain/hotstuff.hpp Outdated Show resolved Hide resolved
libraries/chain/include/eosio/chain/hotstuff.hpp Outdated Show resolved Hide resolved
libraries/chain/include/eosio/chain/hotstuff.hpp Outdated Show resolved Hide resolved
libraries/chain/include/eosio/chain/hotstuff.hpp Outdated Show resolved Hide resolved
libraries/hotstuff/include/eosio/hotstuff/qc_chain.hpp Outdated Show resolved Hide resolved
libraries/hotstuff/include/eosio/hotstuff/qc_chain.hpp Outdated Show resolved Hide resolved
libraries/hotstuff/qc_chain.cpp Outdated Show resolved Hide resolved
libraries/hotstuff/test/test_hotstuff_state.cpp Outdated Show resolved Hide resolved
libraries/hotstuff/test/test_hotstuff_state.cpp Outdated Show resolved Hide resolved
@heifner heifner changed the title Persist hotstuff safety state IF: Persist hotstuff safety state Sep 5, 2023
- speed up digest with std::cref()
- remove view_number pair constructor
- remove view_number::get_view_number
- inline get_digest_to_sign
- remove underscore from view_number member vars
- changed to eosio::chain::config::safetydb_filename
- new state_db_manager supports read/write state file without open/close on each operation
- catch and handle all read/write errors including zero-size file and object deserialization/fc::unpack errors
- qc_chain leaves safety state file open for multiple writes (with flush)
- changes to hotstuff_tools.cpp tests to match new code
- changes to test_hotstuff_state.cpp tests to match new code
- added cfile::truncate() to help implement state file overwriting
- remove comments & dead code
- fixed hotstuff_tools.cpp & test_hotstuff_state.cpp line endings
@fcecin
Copy link

fcecin commented Sep 29, 2023

In the CI test run, plugin_test (a parallel test) is failing with a timeout. This test has been failing (hanging... I just abort it) in my local machine for some time now (for hotstuff_integration & feature branches based off of it), but was passing in CI nevertheless.

@greg7mdp

This comment was marked as outdated.

@fcecin fcecin self-assigned this Sep 29, 2023
libraries/hotstuff/qc_chain.cpp Outdated Show resolved Hide resolved
libraries/hotstuff/include/eosio/hotstuff/state.hpp Outdated Show resolved Hide resolved
libraries/hotstuff/include/eosio/hotstuff/state.hpp Outdated Show resolved Hide resolved
- hotstuff/state.hpp const& & return {}
- hotstuff/state.hpp fix formatting & line endings
- qc_chain.cpp const& finalizer keys iter
libraries/hotstuff/test/test_hotstuff_state.cpp Outdated Show resolved Hide resolved
libraries/hotstuff/test/hotstuff_tools.cpp Outdated Show resolved Hide resolved
libraries/hotstuff/test/hotstuff_tools.cpp Outdated Show resolved Hide resolved
libraries/hotstuff/test/hotstuff_tools.cpp Outdated Show resolved Hide resolved
libraries/hotstuff/test/hotstuff_tools.cpp Outdated Show resolved Hide resolved
libraries/chain/include/eosio/chain/hotstuff.hpp Outdated Show resolved Hide resolved
libraries/hotstuff/include/eosio/hotstuff/qc_chain.hpp Outdated Show resolved Hide resolved
libraries/hotstuff/include/eosio/hotstuff/qc_chain.hpp Outdated Show resolved Hide resolved
libraries/hotstuff/include/eosio/hotstuff/qc_chain.hpp Outdated Show resolved Hide resolved
Comment on lines +11 to +17
void set_v_height(const fc::crypto::blslib::bls_public_key& finalizer_key, const eosio::chain::view_number v_height) {
_states[finalizer_key].first = v_height;
}

void set_b_lock(const fc::crypto::blslib::bls_public_key& finalizer_key, const fc::sha256& b_lock) {
_states[finalizer_key].second = b_lock;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't seem right, that we set one or the other, and the other half of the pair will contain default values.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a zeroed-out hash is fine as an indicator of "we don't have one"; not sure about view number zero (bheight==0, pcounter==0).

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is being discussed in the comment below: #1578

Comment on lines +9 to +61
struct safety_state {

void set_v_height(const fc::crypto::blslib::bls_public_key& finalizer_key, const eosio::chain::view_number v_height) {
_states[finalizer_key].first = v_height;
}

void set_b_lock(const fc::crypto::blslib::bls_public_key& finalizer_key, const fc::sha256& b_lock) {
_states[finalizer_key].second = b_lock;
}

std::pair<eosio::chain::view_number, fc::sha256> get_safety_state(const fc::crypto::blslib::bls_public_key& finalizer_key) const {
auto s = _states.find(finalizer_key);
if (s != _states.end()) return s->second;
else return {};
}

eosio::chain::view_number get_v_height(const fc::crypto::blslib::bls_public_key& finalizer_key) const {
auto s = _states.find(finalizer_key);
if (s != _states.end()) return s->second.first;
else return {};
};

fc::sha256 get_b_lock(const fc::crypto::blslib::bls_public_key& finalizer_key) const {
auto s_itr = _states.find(finalizer_key);
if (s_itr != _states.end()) return s_itr->second.second;
else return {};
};

//todo : implement safety state default / sorting

std::pair<eosio::chain::view_number, fc::sha256> get_safety_state() const {
auto s = _states.begin();
if (s != _states.end()) return s->second;
else return {};
}

eosio::chain::view_number get_v_height() const {
auto s = _states.begin();
if (s != _states.end()) return s->second.first;
else return {};
};

fc::sha256 get_b_lock() const {
auto s_itr = _states.begin();
if (s_itr != _states.end()) return s_itr->second.second;
else return {};
};

std::map<fc::crypto::blslib::bls_public_key, std::pair<eosio::chain::view_number, fc::sha256>> _states;
};
}

FC_REFLECT(eosio::hotstuff::safety_state, (_states))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest to remove the incorrect 3 bottom APIs, and make it less verbose with a couple using directives. Also should return optional so we have a clear way to test that something is not present, I think it is cleaner than relying on the empty() of sha256.
Also suggest to rename the struct safety_states as it holds the safety states for all finalizers.

Suggested change
struct safety_state {
void set_v_height(const fc::crypto::blslib::bls_public_key& finalizer_key, const eosio::chain::view_number v_height) {
_states[finalizer_key].first = v_height;
}
void set_b_lock(const fc::crypto::blslib::bls_public_key& finalizer_key, const fc::sha256& b_lock) {
_states[finalizer_key].second = b_lock;
}
std::pair<eosio::chain::view_number, fc::sha256> get_safety_state(const fc::crypto::blslib::bls_public_key& finalizer_key) const {
auto s = _states.find(finalizer_key);
if (s != _states.end()) return s->second;
else return {};
}
eosio::chain::view_number get_v_height(const fc::crypto::blslib::bls_public_key& finalizer_key) const {
auto s = _states.find(finalizer_key);
if (s != _states.end()) return s->second.first;
else return {};
};
fc::sha256 get_b_lock(const fc::crypto::blslib::bls_public_key& finalizer_key) const {
auto s_itr = _states.find(finalizer_key);
if (s_itr != _states.end()) return s_itr->second.second;
else return {};
};
//todo : implement safety state default / sorting
std::pair<eosio::chain::view_number, fc::sha256> get_safety_state() const {
auto s = _states.begin();
if (s != _states.end()) return s->second;
else return {};
}
eosio::chain::view_number get_v_height() const {
auto s = _states.begin();
if (s != _states.end()) return s->second.first;
else return {};
};
fc::sha256 get_b_lock() const {
auto s_itr = _states.begin();
if (s_itr != _states.end()) return s_itr->second.second;
else return {};
};
std::map<fc::crypto::blslib::bls_public_key, std::pair<eosio::chain::view_number, fc::sha256>> _states;
};
}
FC_REFLECT(eosio::hotstuff::safety_state, (_states))
struct safety_states {
using safety_state = std::pair<eosio::chain::view_number, fc::sha256>;
using public_key = fc::crypto::blslib::bls_public_key;
void set_v_height(const public_key& finalizer_key, const eosio::chain::view_number v_height) {
_states[finalizer_key].first = v_height;
}
void set_b_lock(const public_key& finalizer_key, const fc::sha256& b_lock) {
_states[finalizer_key].second = b_lock;
}
std::optional<safety_state> get_safety_state(const public_key& finalizer_key) const {
auto s = _states.find(finalizer_key);
if (s != _states.end())
return s->second;
return {};
}
std::optional<eosio::chain::view_number> get_v_height(const public_key& finalizer_key) const {
auto s = get_safety_state(finalizer_key);
if (s)
return s->first;
return {};
};
std::optional<fc::sha256> get_b_lock(const public_key& finalizer_key) const {
auto s = get_safety_state(finalizer_key);
if (s)
return s->second;
return {};
};
//todo : implement safety state default / sorting
std::map<public_key, safety_state> _states;
};
}
FC_REFLECT(eosio::hotstuff::safety_states, (_states))

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know, since these results end up in decision-making math for the protocol. We can discuss this in the meeting next week.

Copy link

@fcecin fcecin Nov 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i) I committed this to the branch.

ii) maybe the missing optional v_height is always "less" (<) than any valid value for the sake of the protocol math. missing optional b_lock behaves like the zeroed-out/empty sha256 behaved.

iii) maybe b_lock and v_height should behave differently in the safety state record. b_lock is currently assigned simultaneously to all finalizer keys, so it looks like a singleton piece of sate for one qc_chain. v_height is attributed separately, so it would still go into a key-->v_height map.

struct safety_states {
    fc::sha256 b_lock;
    std::map<public_key, eosio::chain::view_number> v_height;
}

This would allow for safety_states::get_b_lock() to remain parameterless, which looks like what it wants to be.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I also don't think it is quite right as it is. You probably need to discuss with @systemzax as to what is the right data structure.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll approve the PR as is, even if it is not perfect, I think it still gives us a basis to move forward.

@@ -72,7 +97,7 @@ namespace eosio::chain {
fc::sha256 b_finality_violation;
block_id_type block_exec;
block_id_type pending_proposal_block;
uint32_t v_height = 0;
eosio::chain::view_number v_height;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this and other members should be std::optional, but we can update that later.

fcecin and others added 2 commits November 18, 2023 01:02
- Add view_number::operator<<
- Simplify and remove redundant BOOST_CHECK_xxx() calls in tests
- Magic constant added to the header of state files
- Refactor/simplify state_db_manager in qc_chain.hpp
block_id_type _block_exec;
block_id_type _pending_proposal_block;
safety_state _safety_state;
Copy link
Member

@linh2931 linh2931 Nov 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to update all safety_state to safety_states to get compiled.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to revert the std::optional patch. Either we commit this as file persistence ticket of a previous safety state logic, or we roll in reviewing and fixing the safety state logic with the file persistence in this same PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can commit this as previous safety state logic after CI is passed. Then in a new PR fix the remaining issues. This way we can move forward quickly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I agree, you can merge this. I think it is clear it is not the final version but this moves us forward.

@fcecin
Copy link

fcecin commented Nov 18, 2023

We have to decide whether we want to turn this ticket into a review of the concept of what the safety state is and how it is read and written in qc_chain. If that's what we want, we cannot merge this. If this is just about persisting our previous understanding of how to support a b_lock and v_height for multiple finalizer keys in the code, then we can merge. The optional stuff just helped to bring this issue to the fore.

EDIT: I think this ticket is less about file I/O and more about what is the correct way to handle a qc_chain having multiple finalizer keys, and what that means w.r.t. b_lock and v_height.

EDIT (2): Anyway, as discussed above, we are going to check this in as-is.

@fcecin fcecin force-pushed the persist_hs_safety_liveness_state_WIP branch from c09ba63 to c171034 Compare November 18, 2023 20:11
@fcecin fcecin merged commit cedbc51 into hotstuff_integration Nov 19, 2023
29 checks passed
@fcecin fcecin deleted the persist_hs_safety_liveness_state_WIP branch November 19, 2023 00:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

IF: Safely persist HotStuff state
5 participants