-
Notifications
You must be signed in to change notification settings - Fork 69
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support throttling block syncing to peers. #1540
Conversation
Add necessary custom topology for p2p_sync_throttle_test.
Clarify variable names in p2p throttled sync test and tweak numbers. Fix p2p throttled test to actually function (waitForBlock has a hidden default timeout). Bump up timeout in block_log_util_test.
Remove exponential backoff in throttle and utilize existing retry mechanism.
plugins/net_plugin/net_plugin.cpp
Outdated
block_sync_rate_limit = boost::numeric_cast<size_t>(limit * prefix_multipliers.at(units_match[1].str())); | ||
fc_dlog( logger, "setting block_sync_rate_limit to ${limit}", ("limit", block_sync_rate_limit)); | ||
} catch (boost::numeric::bad_numeric_cast&) { | ||
EOS_ASSERT(false, plugin_config_exception, "block sync limit specification overflowed: ${limit}", ("limit", limit_str)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use EOS_THROW
instead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Converted to EOS_THROW
.
tests/p2p_sync_throttle_test.py
Outdated
|
||
throttlingNode = cluster.unstartedNodes[0] | ||
i = throttlingNode.cmd.index('--p2p-listen-endpoint') | ||
throttlingNode.cmd[i+1] = throttlingNode.cmd[i+1] + ':40000B/s' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please comment why are you using 40000 bytes here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so if avg block size is 10Kb and throttling size is 40Kb/s how do slow down transmission if you just need 20Kb/s for full speed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand the question. This is an integration test, not a user scenario. If the test rate needs to be changed from 40k to 20k for some reason, we simply change it. It's a test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question is based on your comment. numbers are from your comment. you chose the numbers so I asked you to explain those.
tests/p2p_sync_throttle_test.py
Outdated
endThrottledSync = time.time() | ||
Print(f'Unthrottled sync time: {endThrottlingSync - clusterStart} seconds') | ||
Print(f'Throttled sync time: {endThrottledSync - clusterStart} seconds') | ||
assert endThrottledSync - clusterStart > endThrottlingSync - clusterStart + 15, 'Throttled sync time must be at least 15 seconds greater than unthrottled' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please add comment with explanation of how did you calculate those 15 seconds
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was expecting something like (avg_block_size * block_amount) / throttle_size_per_sec.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't want to give the illusion of precision where there is none. The time to synchronize varies significantly depending on the machine running the test. I tried to choose values which are reasonable for a wide range of machines in order to make the test as reliable as possible, but that necessarily involves some guesswork. Unfortunately the actual block sizes are not available to the test framework so I can't implement my preferred solution and just calculate values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well, I understand you point but that makes this not a real test but guessed numbers that make this script pass. I understand that you can't do this precisely but if you can think of any approximate formula it would be nice. e.g. you can roughly check blocks.log size and divide this by number of blocks or you can take it from network stats that node supposed to print, to include all messages and serialized data sizes.
Otherwise, imagine this test fails. How do I know if that is error or just block size has changed or network issues? To solve this I can play with numbers to make it pass but I still have no idea if I'm not hiding some bug that way.
Fix parsing and overflow problems and address peer review comments. Extend throttle test to add another throttle prefix.
Added additional code comments. Addressed peer review comment.
Delegate reconnecting back to connections_manager rather than have connection try to do it itself.
…peers." This reverts commit df6d948.
Split prometheus statistics out of connection_monitor into connection_statistics_monitor.
plugins/net_plugin/net_plugin.cpp
Outdated
|
||
size_t net_plugin_impl::parse_connection_rate_limit( const std::string& limit_str) const { | ||
std::istringstream in(limit_str); | ||
fc_dlog( logger, "parsing connection endpoint limit ${limit} with locale ${l}", ("limit", limit_str)("l", std::locale("").name())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't parsing the config in a locale dependent way mean config files aren't transportable across different systems (when they have a different locale)? I can imagine some confusion from users when copying a sample config file and it errors out on their system.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed locale-awareness.
Remove dependency on python requests package. Remove locale-aware parsing of sync throttle rate. Prevent transmitting peer from throttling while not in sync mode. Add timeouts to throttle sync test.
Peer to peer listen ports may now apply a block sync throttle. Each occurrence of
p2p-listen-endpoint
has an independent throttle specification. Here's the updated help text:Parsing of the rate cap accepts fractional numbers expressed as decimals. The throttle rate is in bytes per second. Per the help text, shorthand suffixes are supported, and transactions and blocks being propagated across the peer to peer network between synchronized nodes are not throttled. Only blocks transmitted to a syncing node are throttled. A throttle of
0
or0B/s
means unthrottled, and is the default.The limit for specifying a rate cap is 30 characters, including the colon field separator. This is sufficient characters to specify any value in the 64 bit range of
size_t
with room left for a suffix. If the rate cap does not parse or the value multiplied by the suffix exceeds 18,446,744,073,709,551,615 an exception will be thrown and the node will not start.IPv6 addresses are supported for listen endpoints. They must be in square bracket format:
[<ipv6 address>]:port
optionally followed by:<rate-cap>
.Note
Inbound connections from IPs which are configured
p2p-peer-address
es are exempt from throttle rate caps. Care should be taken in NAT environments to avoid inadvertently exempting connections due to overlapping subnets.Throttling is stable even at exceptionally low byterates, but of course on a busy network if the throttle is less than the average block size on the network, clients using that listen port will never catch up to the head block. Such configurations are allowed but not recommended.
Suggested network topology
Together with #1411 which allows multiple listen endpoints, an edge node for public peering might be configured as follows:
Resolves #1295.