Support throttling block syncing to peers. #1540

jgiszczak · 2023-08-22T08:17:41Z

Peer to peer listen ports may now apply a block sync throttle. Each occurrence of p2p-listen-endpoint has an independent throttle specification. Here's the updated help text:

--p2p-listen-endpoint arg (=0.0.0.0:9876:0)
                                        The actual host:port[:<rate-cap>] used
                                        to listen for incoming p2p connections.
                                        May be used multiple times.   The
                                        optional rate cap will limit block sync
                                        bandwidth to the specified rate.  A
                                        number alone will be interpreted as
                                        bytes per second.  The number may be
                                        suffixed with units.  Supported units
                                        are:   'B/s', 'KB/s', 'MB/s', 'GB/s',
                                        'TB/s', 'KiB/s', 'MiB/s', 'GiB/s',
                                        'TiB/s'.  Transactions and blocks
                                        outside of sync mode are not throttled.
                                          Examples:
                                            192.168.0.100:9876:1MiB/s
                                            node.eos.io:9876:1512KB/s
                                            node.eos.io:9876:0.5GB/s
                                            [2001:db8:85a3:8d3:1319:8a2e:370:7348]:9876:250KB/s

Parsing of the rate cap accepts fractional numbers expressed as decimals. The throttle rate is in bytes per second. Per the help text, shorthand suffixes are supported, and transactions and blocks being propagated across the peer to peer network between synchronized nodes are not throttled. Only blocks transmitted to a syncing node are throttled. A throttle of 0 or 0B/s means unthrottled, and is the default.

The limit for specifying a rate cap is 30 characters, including the colon field separator. This is enough to specify any value in the 64-bit range of size_t, with room left for a suffix. If the rate cap does not parse, or if the value multiplied by the suffix exceeds 18,446,744,073,709,551,615, an exception is thrown and the node will not start.
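
The parsing rules described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the net_plugin implementation: the name `parse_rate_cap`, the regular expression, and the exception types are hypothetical. It also assumes that KB/MB/GB/TB are powers of 1000, that KiB/MiB/GiB/TiB are powers of 1024, and that fractional byte counts are truncated.

```python
import re
from decimal import Decimal

SIZE_T_MAX = 2**64 - 1       # 18,446,744,073,709,551,615
MAX_RATE_CAP_CHARS = 30      # includes the leading ':' field separator

# Assumed multipliers: decimal prefixes for KB.., binary prefixes for KiB..
MULTIPLIERS = {
    "": 1, "K": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
}

# A decimal number, optionally followed by [prefix]B/s.
_RATE_RE = re.compile(r"^(\d+(?:\.\d+)?)(?:(Ki|Mi|Gi|Ti|K|M|G|T)?B/s)?$")


def parse_rate_cap(field: str) -> int:
    """Parse a ':<rate-cap>' field into bytes per second (0 means unthrottled)."""
    if len(field) > MAX_RATE_CAP_CHARS:
        raise ValueError("rate cap longer than 30 characters")
    if not field.startswith(":"):
        raise ValueError("rate cap must start with ':'")
    match = _RATE_RE.match(field[1:])
    if match is None:
        raise ValueError(f"rate cap does not parse: {field!r}")
    number, prefix = match.group(1), match.group(2) or ""
    value = Decimal(number) * MULTIPLIERS[prefix]
    if value > SIZE_T_MAX:
        raise OverflowError(f"rate cap overflows 64 bits: {field!r}")
    return int(value)  # assumption: truncate any fractional byte
```

With these assumptions, the rate caps in the help text examples parse to 1048576, 1512000, 500000000 and 250000 bytes per second respectively.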

IPv6 addresses are supported for listen endpoints. They must be in square bracket format: [<ipv6 address>]:port optionally followed by :<rate-cap>.
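
To illustrate why the square brackets matter, here is a hedged Python sketch of splitting an endpoint into host, port and optional rate cap. The function `split_listen_endpoint` is hypothetical and is not how net_plugin is written; it only shows that a bare IPv6 address cannot be separated unambiguously from the port and rate cap fields.

```python
def split_listen_endpoint(endpoint: str):
    """Split host:port[:<rate-cap>] into (host, port, rate_cap or None)."""
    if endpoint.startswith("["):
        # Bracketed IPv6: everything up to ']' is the host, colons included.
        host, bracket, rest = endpoint[1:].partition("]")
        if not bracket or not rest.startswith(":"):
            raise ValueError(f"malformed IPv6 endpoint: {endpoint!r}")
        fields = rest[1:].split(":")
    else:
        # Hostname or IPv4: the first colon ends the host.
        host, *fields = endpoint.split(":")
    # Exactly a port, or a port plus a rate cap, may follow the host.
    if not host or len(fields) not in (1, 2) or not fields[0].isdigit():
        raise ValueError(f"expected host:port[:<rate-cap>]: {endpoint!r}")
    rate_cap = fields[1] if len(fields) == 2 else None
    return host, int(fields[0]), rate_cap
```

In this sketch an unbracketed IPv6 address such as `2001:db8::1:9876` is rejected, because its colons produce more fields than a port and a rate cap.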

Note

Inbound connections from IPs which are configured p2p-peer-addresses are exempt from throttle rate caps. Care should be taken in NAT environments to avoid inadvertently exempting connections due to overlapping subnets.
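
The exemption rule can be pictured with a small Python sketch. The function `is_throttle_exempt` is hypothetical, and it assumes the configured p2p-peer-address entries have already been resolved to IP addresses; it is not the net_plugin code.

```python
import ipaddress


def is_throttle_exempt(remote_ip: str, configured_peer_ips) -> bool:
    """Return True if an inbound connection's IP matches a configured peer IP."""
    # ip_address() normalizes the textual form, so equivalent spellings
    # of the same IPv6 address compare equal.
    peers = {ipaddress.ip_address(p) for p in configured_peer_ips}
    return ipaddress.ip_address(remote_ip) in peers
```

Because the comparison in this sketch is by IP address only, every client that reaches the node through the same NAT address as a configured peer would also be exempt, which is the situation the note warns about.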

Throttling is stable even at exceptionally low byte rates. On a busy network, however, if the throttle rate is lower than the rate at which the network produces block data, clients using that listen port will never catch up to the head block. Such configurations are allowed but not recommended.
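
A rough back-of-the-envelope check of whether a syncing client can ever catch up, as a Python sketch. The function name and the example numbers are hypothetical, and the 0.5 second block interval is an assumption of this sketch. It ignores protocol overhead and any other limit on bandwidth.

```python
import math


def catch_up_seconds(blocks_behind, avg_block_bytes, throttle_bytes_per_s,
                     block_interval_s=0.5):
    """Estimate seconds for a throttled syncing peer to reach the head block."""
    if throttle_bytes_per_s <= 0:
        raise ValueError("expects a positive throttle; 0 means unthrottled")
    # Bytes of new block data the network produces each second.
    production_rate = avg_block_bytes / block_interval_s
    if throttle_bytes_per_s <= production_rate:
        return math.inf  # the head block moves away at least as fast as we sync
    backlog_bytes = blocks_behind * avg_block_bytes
    # Only the bandwidth left over after keeping pace reduces the backlog.
    return backlog_bytes / (throttle_bytes_per_s - production_rate)


print(catch_up_seconds(1000, 10_000, 40_000))  # 500.0
print(catch_up_seconds(1000, 10_000, 20_000))  # inf
```

In this example, 10,000-byte blocks every half second amount to 20,000 bytes per second of new data, so a 20,000 B/s throttle only keeps pace and never closes the gap.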

Suggested network topology

Together with #1411 which allows multiple listen endpoints, an edge node for public peering might be configured as follows:

Resolves #1295.

Add necessary custom topology for p2p_sync_throttle_test.

plugins/net_plugin/net_plugin.cpp

tests/p2p_sync_throttle_test.py

tests/p2p_sync_throttle_test_shape.json

plugins/net_plugin/net_plugin.cpp

Clarify variable names in p2p throttled sync test and tweak numbers. Fix p2p throttled test to actually function (waitForBlock has a hidden default timeout). Bump up timeout in block_log_util_test.

plugins/net_plugin/net_plugin.cpp

tests/p2p_sync_throttle_test.py

Remove exponential backoff in throttle and utilize existing retry mechanism.

plugins/net_plugin/net_plugin.cpp

dimas1185 · 2023-08-29T02:30:11Z

plugins/net_plugin/net_plugin.cpp

+            block_sync_rate_limit = boost::numeric_cast<size_t>(limit * prefix_multipliers.at(units_match[1].str()));
+            fc_dlog( logger, "setting block_sync_rate_limit to ${limit}", ("limit", block_sync_rate_limit));
+         } catch (boost::numeric::bad_numeric_cast&) {
+            EOS_ASSERT(false, plugin_config_exception, "block sync limit specification overflowed: ${limit}", ("limit", limit_str));


use EOS_THROW instead

Converted to EOS_THROW.

dimas1185 · 2023-08-29T02:37:25Z

tests/p2p_sync_throttle_test.py

+
+    throttlingNode = cluster.unstartedNodes[0]
+    i = throttlingNode.cmd.index('--p2p-listen-endpoint')
+    throttlingNode.cmd[i+1] = throttlingNode.cmd[i+1] + ':40000B/s'


Please add a comment explaining why you are using 40000 bytes here.

Added comment.

So if the average block size is 10 KB and the throttle is 40 KB/s, how do you slow down transmission if you only need 20 KB/s for full speed?

I don't understand the question. This is an integration test, not a user scenario. If the test rate needs to be changed from 40k to 20k for some reason, we simply change it. It's a test.

The question is based on your comment. The numbers are from your comment. You chose the numbers, so I asked you to explain them.

dimas1185 · 2023-08-29T02:38:13Z

tests/p2p_sync_throttle_test.py

+    endThrottledSync = time.time()
+    Print(f'Unthrottled sync time: {endThrottlingSync - clusterStart} seconds')
+    Print(f'Throttled sync time: {endThrottledSync - clusterStart} seconds')
+    assert endThrottledSync - clusterStart > endThrottlingSync - clusterStart + 15, 'Throttled sync time must be at least 15 seconds greater than unthrottled'


Please add a comment explaining how you calculated those 15 seconds.

Added comment.

I was expecting something like (avg_block_size * block_amount) / throttle_size_per_sec.

I don't want to give the illusion of precision where there is none. The time to synchronize varies significantly depending on the machine running the test. I tried to choose values which are reasonable for a wide range of machines in order to make the test as reliable as possible, but that necessarily involves some guesswork. Unfortunately the actual block sizes are not available to the test framework so I can't implement my preferred solution and just calculate values.

Well, I understand your point, but that makes this not a real test, just guessed numbers that make the script pass. I understand that you can't do this precisely, but if you can think of any approximate formula, it would be nice. For example, you can roughly check the blocks.log size and divide it by the number of blocks, or you can take it from the network stats that the node is supposed to print, to include all messages and serialized data sizes.
Otherwise, imagine this test fails. How do I know whether that is an error, or whether the block size has simply changed, or there are network issues? To solve this I can play with the numbers to make it pass, but I still have no idea whether I am hiding some bug that way.
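
The approximate formula suggested in this thread, (avg_block_size * block_amount) / throttle_size_per_sec, could be sketched as follows. The function name and the sample numbers are hypothetical. The sketch assumes the blocks.log size divided by the block count is an acceptable stand-in for the average block size, and it is not code from the test.

```python
def estimated_throttled_sync_seconds(block_log_bytes, block_count,
                                     throttle_bytes_per_s):
    """Lower-bound estimate of the time to sync block_count blocks under a throttle."""
    if block_count <= 0 or throttle_bytes_per_s <= 0:
        raise ValueError("block_count and throttle must be positive")
    # Approximate the average block size from the block log, as suggested above.
    avg_block_bytes = block_log_bytes / block_count
    return (avg_block_bytes * block_count) / throttle_bytes_per_s


# Hypothetical numbers: a 2,000,000-byte blocks.log holding 200 blocks,
# synced through a 40000B/s throttle.
print(estimated_throttled_sync_seconds(2_000_000, 200, 40_000))  # 50.0
```

An estimate like this could replace a fixed margin in the assertion, although it remains approximate because the block log also contains data other than the bytes sent over the wire.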

tests/p2p_sync_throttle_test.py

Fix parsing and overflow problems and address peer review comments. Extend throttle test to add another throttle prefix.

plugins/net_plugin/net_plugin.cpp

Added additional code comments. Addressed peer review comment.

Delegate reconnecting back to connections_manager rather than have connection try to do it itself.

…peers." This reverts commit df6d948.

Split prometheus statistics out of connection_monitor into connection_statistics_monitor.

spoonincode · 2023-10-04T17:54:31Z

plugins/net_plugin/net_plugin.cpp

+
+   size_t net_plugin_impl::parse_connection_rate_limit( const std::string& limit_str) const {
+      std::istringstream in(limit_str);
+      fc_dlog( logger, "parsing connection endpoint limit ${limit} with locale ${l}", ("limit", limit_str)("l", std::locale("").name()));


Wouldn't parsing the config in a locale dependent way mean config files aren't transportable across different systems (when they have a different locale)? I can imagine some confusion from users when copying a sample config file and it errors out on their system.

Removed locale-awareness.

Remove dependency on python requests package. Remove locale-aware parsing of sync throttle rate. Prevent transmitting peer from throttling while not in sync mode. Add timeouts to throttle sync test.

Replaced by #1741

heifner · 2023-10-10T21:23:23Z

Replaced by #1741 & #1742

Support throttling block syncing to peers. WIP

e68743a

jgiszczak added the OCI Work exclusive to OCI team label Aug 22, 2023

jgiszczak marked this pull request as draft August 22, 2023 08:56

Add exponential backoff to throttle. Fix wretched math.

303c3d6

Add necessary custom topology for p2p_sync_throttle_test.

jgiszczak marked this pull request as ready for review August 23, 2023 06:15

Merge branch 'main' into p2p-peer-throttle

3789d17

jgiszczak changed the title ~~Support throttling block syncing to peers. WIP~~ Support throttling block syncing to peers. Aug 23, 2023

heifner requested changes Aug 23, 2023

BenjaminGormanPMP requested review from greg7mdp and linh2931 August 23, 2023 21:01

heifner reviewed Aug 24, 2023

plugins/net_plugin/net_plugin.cpp

jgiszczak added 3 commits August 24, 2023 14:07

Experiment: How many tests fail if waitForObj default times out

70b530b

Address review comments in net_plugin.

b92d84c

Clarify variable names in p2p throttled sync test and tweak numbers. Fix p2p throttled test to actually function (waitForBlock has a hidden default timeout). Bump up timeout in block_log_util_test.

Further tweak the sync throttle test for machines faster than mine.

dc54d46

heifner reviewed Aug 25, 2023

plugins/net_plugin/net_plugin.cpp

heifner reviewed Aug 25, 2023

tests/p2p_sync_throttle_test.py

jgiszczak added 2 commits August 25, 2023 15:23

Move block sync throttling to the correct layer in the call stack.

3a50864

Remove exponential backoff in throttle and utilize existing retry mechanism.

Move block sync rate limit parsing to plugin initialize.

e1c1d42

heifner requested changes Aug 26, 2023

BenjaminGormanPMP requested review from dimas1185 and removed request for linh2931 and greg7mdp August 28, 2023 21:02

dimas1185 reviewed Aug 29, 2023

tests/p2p_sync_throttle_test.py

Require IPv6 addresses to be in square bracket format.

92e4e7c

Fix parsing and overflow problems and address peer review comments. Extend throttle test to add another throttle prefix.

heifner reviewed Aug 30, 2023

plugins/net_plugin/net_plugin.cpp

jgiszczak added 2 commits August 30, 2023 16:05

Added throttle exception for configured p2p-peer-addresses.

28bb38d

Added additional code comments. Addressed peer review comment.

Merge branch 'main' into p2p-peer-throttle

1eb1e44

jgiszczak and others added 14 commits September 27, 2023 15:49

Break encapsulation less.

2f80663

Delegate reconnecting back to connections_manager rather than have connection try to do it itself.

Thread safety.

7019b65

Revert "Restore lock of connections mutex when connecting configured …

4d136e3

…peers." This reverts commit df6d948.

Restore lock of connections mutex when connecting configured peers.

3708418

Accept suggested refactoring.

4baec72

Remove some unused machine-generated variables from custom shape file.

8d2c1c2

Convert connections mutex to recursive_mutex and update locks.

7e37de1

Split prometheus statistics out of connection_monitor into connection_statistics_monitor.

Revert mutex and lock type changes.

669ed0f

Revise connection_monitor for thread safety.

a6f7761

Merge branch 'main' into p2p-peer-throttle

2ef3a6c

Misc cleanups

7acac0c

Add block sync bytes received metric and use it in sync throttle test.

ff7a8a1

Add requests module for test.

6b2fe63

Merge branch 'main' into p2p-peer-throttle

a6ed57d

spoonincode reviewed Oct 4, 2023

bhazzard added this to the Leap v6.0.0 Cusp milestone Oct 4, 2023

jgiszczak added 5 commits October 5, 2023 17:33

Merge branch 'main' into p2p-peer-throttle

d1ad2cf

Add throttling flag to Prometheus peer data and use it in sync test.

1e5b427

Remove dependency on python requests package. Remove locale-aware parsing of sync throttle rate. Prevent transmitting peer from throttling while not in sync mode. Add timeouts to throttle sync test.

Revise for better repeatability.

db34bbf

Customize plugin_config_exception handling in net_plugin.

6db4ad8

Merge branch 'main' into p2p-peer-throttle

d8610c6

heifner previously approved these changes Oct 9, 2023

jgiszczak added 2 commits October 9, 2023 11:01

Merge branch 'main' into p2p-peer-throttle

5576604

Merge branch 'main' into p2p-peer-throttle

b41dd96

heifner changed the base branch from main to release/5.0 October 10, 2023 17:28

heifner changed the base branch from release/5.0 to main October 10, 2023 17:28

jgiszczak mentioned this pull request Oct 10, 2023

[5.0] Support throttling block syncing to peers #1741

Merged

heifner closed this Oct 10, 2023
