
IF: Implement message propagation #1676

Merged: 7 commits into hotstuff_integration on Nov 16, 2023

Conversation

fcecin commented Sep 25, 2023

  • propagate proposals if they are added by the node to its proposal store
  • non-leaders propagate votes of known proposals that haven't been propagated by them yet
  • added structure to track seen votes for each proposal (GC'd together with proposals by height)
  • propagate new view messages that have a High QC that's newer than what we had locally
  • enabled basic propagation test
  • added star topology propagation test
  • added ring topology propagation test
  • documented that new block messages are not propagated
  • removed topology restriction for new block messages at test_pacemaker
  • removed hop count from hotstuff_test_handler::dispatch(); will dispatch until network goes quiet
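The three propagation rules in the list above can be sketched as follows. This is a minimal illustration, not code from the PR: `relay_state` and all of its members are hypothetical names standing in for the actual qc_chain state.

```cpp
#include <cstdint>
#include <map>
#include <set>
#include <string>

// Hypothetical sketch of the three propagation rules above; relay_state and
// its members are illustrative names, not the actual qc_chain members.
struct relay_state {
   std::set<std::string> proposal_store;                    // proposals we inserted
   std::map<std::string, std::set<std::string>> seen_votes; // proposal id -> voters already relayed
   uint64_t high_qc_height = 0;                             // height of our current High QC

   // Propagate a proposal only if we just added it to our proposal store.
   bool on_proposal(const std::string& id) {
      return proposal_store.insert(id).second;
   }

   // Propagate a vote only for a known proposal, and only once per finalizer.
   bool on_vote(const std::string& proposal_id, const std::string& finalizer) {
      if (proposal_store.count(proposal_id) == 0) return false;
      return seen_votes[proposal_id].insert(finalizer).second;
   }

   // Propagate a new view only if its High QC is newer than what we had locally.
   bool on_new_view(uint64_t qc_height) {
      if (qc_height <= high_qc_height) return false;
      high_qc_height = qc_height;
      return true;
   }
};
```

Note that every rule returns false for a duplicate or irrelevant message, which is what keeps the flood-fill from looping.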

Merging both multi-indexes declared in the qc_chain is possible (makes proposal height gc 2x faster) but I'd leave that for later to minimize merge conflicts.

Remaining issues (as argued at the tail end of #1548) should be addressed in new issues and PRs, such as dealing with connection grief.

NOTE: this still uses eosio-name finalizers, not finalizer keys, but it should not add much work for the incoming code to change the added eosio-name code to finalizer-key code. (EDIT: This is solved in the latest merge from hotstuff_integration into this branch (Oct. 2nd 2023).)

Closes #1548.

===============================================

What follows is the argument for why this should close #1548 (multi-hop message propagation):

First, it has to be established that a finalizer key in the finalizer set can, in theory, produce an arbitrary number of messages with varied content, which can bloat any structure created to track unique messages being propagated in the network. In theory this produces network spam, but, crucially, it can also (a) crash nodes by draining memory, (b) cause nodes to drop legitimate messages because they can no longer track duplicates (a bad way to constrain duplicate-message-tracking memory), or (c) cause nodes to circulate massive spam until the finalizer is removed. Any message propagation solution has to address this.

Second, there's the issue of whether or not to support "any" node in "any" state (i.e. a blank node) as a message relayer.

There are only two valid solutions to message propagation.

Solution 1 (previously rejected) allocates a fixed amount of memory on each node to track message propagation by the hash of the raw byte content. This is basically valid (i.e. it works): although a finalizer in the finalizer set could spam the network with authentic messages (messages that ARE in fact signed by that finalizer) that are otherwise garbage, the memory consumed by tracking that garbage is bounded, and propagation of honest messages still happens, even though the network could in theory start to suffer from re-propagation of the same messages.
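A minimal sketch of what Solution 1's bounded duplicate tracking could look like (all names hypothetical; this is not code from the PR): a fixed-capacity FIFO set of message hashes. Memory stays bounded, but once capacity wraps around, an evicted hash is forgotten and that message can be re-propagated, which is exactly the weakness described above.

```cpp
#include <cstddef>
#include <deque>
#include <string>
#include <unordered_set>

// Illustrative Solution-1 dedup cache: fixed capacity, oldest-first eviction.
// bounded_seen_set and check_and_mark are hypothetical names.
class bounded_seen_set {
   std::unordered_set<std::string> seen_;
   std::deque<std::string> order_;  // eviction order (oldest first)
   std::size_t capacity_;
public:
   explicit bounded_seen_set(std::size_t capacity) : capacity_(capacity) {}

   // Returns true if the hash was not seen (i.e. the caller should propagate).
   bool check_and_mark(const std::string& msg_hash) {
      if (seen_.count(msg_hash)) return false;
      if (order_.size() == capacity_) {  // evict the oldest entry to stay bounded
         seen_.erase(order_.front());
         order_.pop_front();
      }
      seen_.insert(msg_hash);
      order_.push_back(msg_hash);
      return true;
   }
};
```

The trade-off is visible in the eviction branch: the structure never grows past `capacity_`, but an attacker who pushes enough fresh garbage through it can flush legitimate hashes out and cause honest messages to circulate again.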

Solution 1 is geared towards a scenario where you want to support any node as a relayer, irrespective of its state. Under Solution 1 you could write a unit test with a star topology where the node at the center of the star is killed and rebooted midway through the test with an empty qc_chain, and propagation still works. We argue below that this is NOT a scenario we even want to support; in fact, we want to actively ensure it does not happen.

Solution 2 (in this PR) is the opposite: message propagation logic is tightly bound to the protocol itself, and we only propagate messages that can be proven relevant to advancing the local qc_chain's state. In this scenario, a proposal is only propagated if it is inserted into our proposal store; a vote is only propagated for a proposal that we know about AND only if we haven't propagated that exact vote before; and a new view is only propagated if the High QC it contains was actually a relevant update to us. These are conservative criteria, and we would make them even more conservative if we knew how.

Solution 2 provides the approach that can (eventually) guarantee that the amount of memory spent tracking duplicate messages is bounded. The extra memory required to track duplicate proposals and duplicate new view messages is zero. We can easily prove that propagating new view messages only when relevant will eventually converge, and that all nodes will receive the High QC given reliable message delivery at every hop (and even if that fails, it doesn't affect safety). We can also prove that proposal propagation from the rightful leader will reach every node; if the algorithm is correct, then inserting the proposal into one's store works as the criterion for proposal propagation. Finally, votes use as much memory as proposals and are garbage collected at the same time and under the same criteria, so if the algorithm already guarantees bounded memory consumption for the proposal store itself, then the memory consumed by the seen-votes tracking structure is also bounded.

We can also make the case that we never want "dumb relayers." The typical Antelope use case is a structured overlay finalizer(A) <-> relay(A) <-> relay(B) <-> finalizer(B), which means relays are part of BP/finalizer infrastructure. Just as a finalizer needs to be in sync to be able to finalize, its associated relay(s) also need to be in sync. A finalizer that is not yet in sync is valid; it just means there is one less finalizer voting for a few minutes, and as long as there are not f finalizers syncing in that manner (or g+h failing finalizers, where g+h >= f, g are finalizers syncing and thus temporarily "failing", and h are actually perma-failed finalizers), liveness is not affected. So the finalizer machine and the relay machine are both conceptually part of the finalizer node, and it is a deployment requirement that both be in sync for the "finalizer" entity as a whole to work and not actively count towards f.

Finally, this patch does not address connection grief, such as receiving a stream of dummy messages or invalid signatures from a peer. It is possible that we do not even want to address these cases, but that is to be determined later. We may not even want to track simply discarded (irrelevant) messages, because (a) they are a natural outcome of the protocol and (b) "a million irrelevant messages per second" is an attack you can detect at the network-monitoring / firewall level, and it perhaps shouldn't be the responsibility of nodeos. As for invalid signatures, a case can be made for logging them, and if there are too many (e.g. 100 over a connection's lifetime) adding code to disconnect the peer might make sense and pay off, even if the attack vector is pretty esoteric. In any case, message propagation would not happen, and a node could never be leveraged to relay spam (especially under Solution 2).

EDIT: Non-leaders propagating votes only for known proposals is justified by the protocol being a complete flood-fill of the network. This is a gossip protocol with a fanout probability of 100%, which works given an arbitrary topology, and is particularly fine in the typical (expected) topology: a fully-connected ring of BP relayers at the middle, with one connection between each relayer and its backing finalizer (maybe a few more connections inside the BP infrastructure, but no crazy fanouts). Given that a 100%-fanout gossip protocol minimizes latency by definition, it can be assumed that when a proposal is sent by the leader-finalizer, it will reach every voter-finalizer with minimal latency. Since a vote has to exit the finalizer through its relay to reach the other finalizers' relays, it can be argued that, at least in the expected (regular) topology, a vote by a finalizer is extremely likely to reach the relays of other finalizers (and the other finalizers themselves) only after they have seen the proposal. This follows from the expected high connectivity between all relays at the core infrastructure, which minimizes the latency of a proposal reaching every relay irrespective of the latency between the proposing finalizer's relay and any other relay, as the proposal will find its way via the broadcast done by every other relay. Dropping votes because you haven't seen the proposal yet (which should not usually happen) is equivalent to dropping messages in transit, which should not affect the safety of any correct SMR protocol. Nor should it affect liveness, as you still need f finalizers to fail before any trouble begins.

fcecin commented Sep 25, 2023

I've run the NP tests here, and before my machine crashed, I got four failing NP tests:

nodeos_snapshot_diff_test
nodeos_snapshot_forked_test
trx_finality_status_test
trx_finality_status_forked_test

I think my machine crashed while running nodeos_contrl_c_test, but I am not sure.

Inspecting the TestLogs directory (under build) I can see:

drwxrwxr-x 7 fcecin fcecin 4096 set 25 12:46 nodeos_contrl_c_test109845
drwxrwxr-x 8 fcecin fcecin 4096 set 25 12:34 nodeos_snapshot_diff_test107975
drwxrwxr-x 7 fcecin fcecin 4096 set 25 12:38 nodeos_snapshot_forked_test108251
drwxrwxr-x 7 fcecin fcecin 4096 set 25 12:42 trx_finality_status_forked_test108793
drwxrwxr-x 9 fcecin fcecin 4096 set 25 12:39 trx_finality_status_test108495

In those test directories I can see enormous stderr.txt files:

-rw-rw-r-- 1 fcecin fcecin 2285977839 set 25 12:53 stderr.2023_09_25_12_45_54.txt
-rw-rw-r-- 1 fcecin fcecin 2220130758 set 25 12:53 stderr.2023_09_25_12_45_54.txt
-rw-rw-r-- 1 fcecin fcecin 2227108579 set 25 12:53 stderr.2023_09_25_12_45_54.txt
-rw-rw-r-- 1 fcecin fcecin 2260320205 set 25 12:53 stderr.2023_09_25_12_45_54.txt
-rw-rw-r-- 1 fcecin fcecin 1121172166 set 25 12:37 stderr.2023_09_25_12_34_26.txt
-rw-rw-r-- 1 fcecin fcecin 999252692 set 25 12:37 stderr.2023_09_25_12_34_26.txt
-rw-rw-r-- 1 fcecin fcecin 1105952595 set 25 12:37 stderr.2023_09_25_12_34_26.txt
-rw-rw-r-- 1 fcecin fcecin 1035742287 set 25 12:37 stderr.2023_09_25_12_34_26.txt
-rw-rw-r-- 1 fcecin fcecin 1014447595 set 25 12:37 stderr.2023_09_25_12_34_26.txt
-rw-rw-r-- 1 fcecin fcecin 429666233 set 25 12:39 stderr.2023_09_25_12_37_54.txt
-rw-rw-r-- 1 fcecin fcecin 429139700 set 25 12:39 stderr.2023_09_25_12_37_54.txt
-rw-rw-r-- 1 fcecin fcecin 432212661 set 25 12:39 stderr.2023_09_25_12_37_54.txt
-rw-rw-r-- 1 fcecin fcecin 427703331 set 25 12:39 stderr.2023_09_25_12_37_54.txt
-rw-rw-r-- 1 fcecin fcecin 485282833 set 25 12:43 stderr.2023_09_25_12_41_59.txt
-rw-rw-r-- 1 fcecin fcecin 493437896 set 25 12:43 stderr.2023_09_25_12_41_59.txt
-rw-rw-r-- 1 fcecin fcecin 490242077 set 25 12:43 stderr.2023_09_25_12_41_59.txt
-rw-rw-r-- 1 fcecin fcecin 488229323 set 25 12:43 stderr.2023_09_25_12_41_59.txt
-rw-rw-r-- 1 fcecin fcecin 589996927 set 25 12:41 stderr.2023_09_25_12_39_27.txt
-rw-rw-r-- 1 fcecin fcecin 656235361 set 25 12:41 stderr.2023_09_25_12_39_27.txt
-rw-rw-r-- 1 fcecin fcecin 674557865 set 25 12:41 stderr.2023_09_25_12_39_27.txt
-rw-rw-r-- 1 fcecin fcecin 673739911 set 25 12:41 stderr.2023_09_25_12_39_27.txt
-rw-rw-r-- 1 fcecin fcecin 641266659 set 25 12:41 stderr.2023_09_25_12_39_27.txt
-rw-rw-r-- 1 fcecin fcecin 648319963 set 25 12:41 stderr.2023_09_25_12_39_27.txt

They are full of entries like this one:

debug 2023-09-25T15:34:56.460 net-0     net_plugin.cpp:1129           operator()           ] ["localhost:9878 - 06a07ef" - 6 127.0.0.1:53240] handle hs_message
debug 2023-09-25T15:34:56.460 net-0     net_plugin.cpp:3574           handle_message       ] ["localhost:9878 - 06a07ef" - 6 127.0.0.1:53240] received hs: [3,{"high_qc":{"proposal_id":"0000000000000000000000000000000000000000000000000000000000000000","active_finalizers":[0],"active_agg_sig":"SIG_BLS_AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAANqye6g=="}}]
debug 2023-09-25T15:34:56.460 net-0     net_plugin.cpp:3830           bcast_hs_message     ] sending hs msg: [3,{"high_qc":{"proposal_id":"0000000000000000000000000000000000000000000000000000000000000000","active_finalizers":[0],"active_agg_sig":"SIG_BLS_AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAANqye6g=="}}]

With message propagation, I don't think that, in general, we want tests to be logging each hotstuff message. There's a whole lot of them going on, depending on what the test topology looks like.

This looks like it is exposing a weakness in some of the tests, instead of the tests exposing some vulnerability in the hotstuff implementation and/or this PR. (EDIT:) It could also be a consequence of the hotstuff integration being midway.

heifner (Member) commented Sep 25, 2023

> This looks like it is exposing a weakness in some of the tests, instead of the tests exposing some vulnerability in the hotstuff implementation and/or this PR. (EDIT:) It could also be a consequence of the hotstuff integration being midway.

It is a lot of logging, but for now I think it is ok. We probably want to move them to their own logger so we can easily toggle them on/off.

For now, I think they actually show the issue. Take for example the attached snip of the log of a failing test. Seems to be caught in an infinite loop.

debug 2023-09-25T144744.661 net-0 .txt

fcecin commented Sep 25, 2023

> > This looks like it is exposing a weakness in some of the tests, instead of the tests exposing some vulnerability in the hotstuff implementation and/or this PR. (EDIT:) It could also be a consequence of the hotstuff integration being midway.
>
> It is a lot of logging, but for now I think it is ok. We probably want to move them to their own logger so we can easily toggle them on/off.
>
> For now, I think they actually show the issue. Take for example the attached snip of the log of a failing test. Seems to be caught in an infinite loop.
>
> debug 2023-09-25T144744.661 net-0 .txt

It's the new-view message propagation code. If I comment it out, the four failing NP tests pass. My blind assumption about how update_high_qc() works was wrong.

Well, if any one of the propagated messages was going to loop, it was this one.

fcecin commented Sep 25, 2023

Not sure what's with the LR auto_bp_peering_test failing on Ubuntu 22 only (CI) with this snippet:

2023-09-25T17:21:04.363740             Could not find transaction with id 5db04c74e682ffad150ef7ccbe13f44560d800c3f60131ac91da416c32ce4a45, delay and retry
2023-09-25T17:21:07.367197              cmd: /__w/leap/leap/build/bin/cleos --url http://localhost:8788  --wallet-url http://localhost:9899 --no-auto-keosd get transaction_trace 5db04c74e682ffad150ef7ccbe13f44560d800c3f60131ac91da416c32ce4a45
2023-09-25T17:21:08.636169             Could not find transaction with id 5db04c74e682ffad150ef7ccbe13f44560d800c3f60131ac91da416c32ce4a45, delay and retry
2023-09-25T17:21:11.643413              cmd: /__w/leap/leap/build/bin/cleos --url http://localhost:8788  --wallet-url http://localhost:9899 --no-auto-keosd get transaction_trace 5db04c74e682ffad150ef7ccbe13f44560d800c3f60131ac91da416c32ce4a45
2023-09-25T17:21:12.613698             Could not find transaction with id 5db04c74e682ffad150ef7ccbe13f44560d800c3f60131ac91da416c32ce4a45, delay and retry
2023-09-25T17:21:15.636739              cmd: /__w/leap/leap/build/bin/cleos --url http://localhost:8788  --wallet-url http://localhost:9899 --no-auto-keosd get transaction_trace 5db04c74e682ffad150ef7ccbe13f44560d800c3f60131ac91da416c32ce4a45
2023-09-25T17:21:16.567952               FAILURE - Exception during "get transaction_trace". Exception message: Error 3200005: http request fail
Error Details:
Error code 404
: {"code":404,"message":"Trace API: transaction id missing in the transaction id log files","error":{"code":0,"name":"","what":"","details":[]}}

.  stdout: .  cmd Duration=0.930 sec. Context: (transaction id=5db04c74e682ffad150ef7ccbe13f44560d800c3f60131ac91da416c32ce4a45)
2023-09-25T17:21:16.568674               ERROR: Exception during "get transaction_trace". Exception message: Error 3200005: http request fail
Error Details:
Error code 404
: {"code":404,"message":"Trace API: transaction id missing in the transaction id log files","error":{"code":0,"name":"","what":"","details":[]}}

.  stdout: .  cmd Duration=0.930 sec. Context: (transaction id=5db04c74e682ffad150ef7ccbe13f44560d800c3f60131ac91da416c32ce4a45)

I ran the test locally and it passes, actually.

arhag linked an issue Oct 2, 2023 that may be closed by this pull request
@@ -952,6 +988,12 @@ namespace eosio::hotstuff {
void qc_chain::gc_proposals(uint64_t cutoff){
//fc_tlog(_logger, " === garbage collection on old data");

auto sv_end_itr = _seen_votes_store.get<by_seen_votes_proposal_height>().upper_bound(cutoff);
while (_seen_votes_store.get<by_seen_votes_proposal_height>().begin() != sv_end_itr){
linh2931 (Member) commented Oct 18, 2023
You can use a range erase to avoid the while loop, both for simplicity and to avoid potentially invalid iterators during the loop, something like:
_seen_votes_store.get<by_seen_votes_proposal_height>().erase(sv_begin_itr, sv_end_itr);
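The reviewer's range-erase suggestion can be illustrated with a std::multimap standing in for the boost::multi_index ordered index used by the actual seen-votes store (a sketch under that assumption, not the PR's code):

```cpp
#include <cstdint>
#include <map>
#include <string>

// Stand-in for the seen-votes store: height -> vote, ordered by height so a
// height cutoff maps to a contiguous key range. Names here are illustrative.
using seen_votes_by_height = std::multimap<uint64_t, std::string>;

// Garbage-collect all entries at or below `cutoff` height, mirroring what
// gc_proposals() does: one range erase, no per-element loop and no concern
// about iterators invalidated mid-loop.
void gc_seen_votes(seen_votes_by_height& store, uint64_t cutoff) {
   store.erase(store.begin(), store.upper_bound(cutoff));
}
```

upper_bound(cutoff) points just past the last entry with key <= cutoff, so the single erase call removes exactly the range the original while loop walked.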

@@ -662,7 +693,12 @@ namespace eosio::hotstuff {
_b_leaf = _high_qc.get_proposal_id();

fc_tlog(_logger, " === ${id} _b_leaf updated (update_high_qc) : ${proposal_id}", ("proposal_id", _high_qc.get_proposal_id())("id", _id));
return true;

A project member commented:

I feel the return type or name of bool qc_chain::update_high_qc should be changed. Even though it updates _high_qc, it returns false if proposal_id() is empty. It is hard to follow, in the caller process_new_view, why false indicates the view should be propagated. Alternatively, use a dedicated method to decide whether to propagate. This might have prevented the initial infinite-loop bug.

fcecin (Author) replied:

I think the best is to return whether it was updated, and to only strictly update when we are going to propagate. For that I need some help formalizing the edge cases, e.g. when proposal_id() is empty.

heifner self-requested a review November 9, 2023 14:03
fcecin merged commit ad5ba52 into hotstuff_integration Nov 16, 2023
29 checks passed
fcecin deleted the hs_1548_propag_msgs branch November 16, 2023 21:45
Successfully merging this pull request may close these issues:

IF: propagate hs_*_messages

4 participants