Parachain collators support the `/block-announces/1` notification protocol but do not support the `/sync/2` request-response protocol. These peers are currently treated like any other full node, so they can be selected for a block request. Normally the block request would be sent and, since the peer doesn't support the request-response protocol, libp2p would fail to negotiate the protocol. This failure would be propagated to `SyncingEngine`, causing it to remove the peer and send the block request to some other peer.

A recent change in libp2p has modified how negotiation works for streams using `V1Lazy`. For Polkadot this means that the negotiation failure is not detected and `SyncingEngine` is not notified.

If a parachain collator is selected for a block request, syncing stalls. After `MAX_AHEAD_BLOCKS` have been downloaded, syncing stops downloading more blocks until all in-flight requests have completed. Since the collator cannot answer block requests and `SyncingEngine` is not notified of the negotiation failure, syncing halts at 0.0 bps and the node has to be restarted.
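To make the stall easier to see, here is a toy model of the behaviour described above. It is only a sketch and not the actual `SyncingEngine` code: `ToySync`, `simulate`, the peer numbering and the value of `MAX_AHEAD_BLOCKS` are all made up for illustration, and blocks are assumed to import strictly in order.

```rust
use std::collections::{BTreeSet, HashMap};

/// Illustrative value only; the real constant lives in the sync code.
const MAX_AHEAD_BLOCKS: u64 = 8;

/// Hypothetical, heavily simplified stand-in for the syncing state.
struct ToySync {
    /// Highest block imported so far (blocks import strictly in order).
    imported: u64,
    /// Next block number that has never been requested.
    next_new: u64,
    /// Blocks whose request failed and must be re-requested.
    retry: BTreeSet<u64>,
    /// In-flight requests: block number -> peer the request was sent to.
    in_flight: HashMap<u64, u32>,
    /// Downloaded blocks still waiting for an earlier block.
    downloaded: BTreeSet<u64>,
}

impl ToySync {
    fn new() -> Self {
        ToySync {
            imported: 0,
            next_new: 1,
            retry: BTreeSet::new(),
            in_flight: HashMap::new(),
            downloaded: BTreeSet::new(),
        }
    }

    /// Pick a block to request from `peer`, or `None` if the download
    /// window (`imported + MAX_AHEAD_BLOCKS`) is exhausted.
    fn next_request(&mut self, peer: u32) -> Option<u64> {
        let block = match self.retry.pop_first() {
            Some(block) => block,
            None if self.next_new <= self.imported + MAX_AHEAD_BLOCKS => {
                self.next_new += 1;
                self.next_new - 1
            }
            None => return None,
        };
        self.in_flight.insert(block, peer);
        Some(block)
    }

    /// A peer answered: store the block and import everything that is now contiguous.
    fn on_response(&mut self, block: u64) {
        self.in_flight.remove(&block);
        self.downloaded.insert(block);
        while self.downloaded.remove(&(self.imported + 1)) {
            self.imported += 1;
        }
    }

    /// The negotiation failure reached us: free the block for another peer.
    fn on_request_failed(&mut self, block: u64) {
        if self.in_flight.remove(&block).is_some() {
            self.retry.insert(block);
        }
    }
}

/// Peer 0 is a collator that never answers, peer 1 is a full node that
/// answers immediately. Returns the number of imported blocks.
fn simulate(rounds: u32, failure_reported: bool) -> u64 {
    let mut sync = ToySync::new();
    let mut collator_dropped = false;
    for _ in 0..rounds {
        let collator_busy = sync.in_flight.values().any(|peer| *peer == 0);
        if !collator_dropped && !collator_busy {
            if let Some(block) = sync.next_request(0) {
                if failure_reported {
                    sync.on_request_failed(block);
                    collator_dropped = true;
                }
                // Otherwise the request silently stays in flight forever.
            }
        }
        if let Some(block) = sync.next_request(1) {
            sync.on_response(block);
        }
    }
    sync.imported
}
```

In this model, `simulate(100, false)` never imports anything because block 1 stays assigned to the silent collator while the window fills up behind it, whereas `simulate(100, true)` hands block 1 to the full node as soon as the failure is reported and keeps importing.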
Comparing our request-response code with the
I don't know why All that said, I tried running a node with that line commented out and while it worked better than without anything, it still got stuck after a few hours whereas disabling
I am assuming that you are referring to libp2p/rust-libp2p#4019, correct?
Negotiation failure will not be detected on
That seems to be the case, yes. This message is printed: but otherwise we don't get any signal back to the syncing subsystem and syncing stalls. FWIW, we shouldn't add parachain collators as syncing peers in the first place, since they don't support the request-response protocol, but this change seems to cause trouble for us, whereas before this release it worked fine. With the latest libp2p update we've also experienced issues with Kademlia, so we've decided to revert the upgrade temporarily and look into the issues in closer detail when time permits.
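For the point about not adding collators as syncing peers, a rough sketch of what such a filter could look like follows. This is not existing Substrate code: `PeerInfo`, `peer`, `can_serve_blocks` and `sync_candidates` are hypothetical names, and it assumes the node already knows which protocols each peer supports (how that information is obtained is left open here).

```rust
use std::collections::HashSet;

/// Protocol names as used in this PR description.
const BLOCK_ANNOUNCES: &str = "/block-announces/1";
const SYNC: &str = "/sync/2";

/// Hypothetical view of a connected peer and the protocols it advertises.
struct PeerInfo {
    id: u32,
    protocols: HashSet<&'static str>,
}

/// Convenience constructor for the illustration.
fn peer(id: u32, protocols: &[&'static str]) -> PeerInfo {
    PeerInfo { id, protocols: protocols.iter().copied().collect() }
}

/// A peer can only serve block requests if it supports the
/// request-response protocol, not just block announcements.
fn can_serve_blocks(peer: &PeerInfo) -> bool {
    peer.protocols.contains(SYNC)
}

/// Keep only the peers that are safe to pick for a block request.
fn sync_candidates(peers: &[PeerInfo]) -> Vec<u32> {
    peers.iter().filter(|p| can_serve_blocks(p)).map(|p| p.id).collect()
}
```

With a collator `peer(1, &[BLOCK_ANNOUNCES])` and a full node `peer(2, &[BLOCK_ANNOUNCES, SYNC])`, only peer 2 would be returned by `sync_candidates`, so the collator could still deliver block announcements without ever being chosen for a block request.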
Superseded by #14722
Fixes #14683
I'll check if I can find another way around this problem without disabling `V1Lazy`, but in case I don't, we definitely want to include this fix even if it disables an optimization.