…l sync Initialize warp sync source finalized_block_height with best_block_number from gossip handshake instead of 0, so warp sync triggers immediately without waiting for a GrandPa neighbor packet. Also exclude UnknownTargetBlock justification errors from the 40s ban, as they are benign during initial catchup sync.
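To illustrate why the initial value matters, here is a minimal sketch built on the trigger condition quoted in the summary (`source.finalized_block_height > warped_header_number + 32`). The `Source` struct, its constructors, and `should_warp_sync` are hypothetical simplifications for illustration, not smoldot's actual types.

```rust
/// Gap by which a source must be ahead before warp sync triggers
/// (the `+ 32` from the condition quoted in the PR summary).
const WARP_SYNC_MINIMUM_GAP: u64 = 32;

/// Hypothetical, simplified stand-in for a warp sync source.
struct Source {
    finalized_block_height: u64,
}

impl Source {
    /// Old behavior: a newly added source always starts at height 0.
    fn new_from_zero() -> Self {
        Source { finalized_block_height: 0 }
    }

    /// New behavior: seed the height with the best block number
    /// reported by the peer in the gossip handshake.
    fn new_from_handshake(best_block_number: u64) -> Self {
        Source { finalized_block_height: best_block_number }
    }
}

/// Returns true if the source is far enough ahead to start warp sync.
fn should_warp_sync(source: &Source, warped_header_number: u64) -> bool {
    source.finalized_block_height > warped_header_number.saturating_add(WARP_SYNC_MINIMUM_GAP)
}
```

With `warped_header_number = 0`, `should_warp_sync(&Source::new_from_zero(), 0)` is false no matter how far ahead the peer really is, while a source seeded with any handshake height above 32 passes the check right away.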
Add substream_id to connection-activity log in multi-stream task loop to identify which WebRTC data channel each read/write belongs to. Add GossipInboundResult event to surface inbound notification substream outcomes (accepted, rejected duplicate, rejected cold-open) for debugging. Replace immediate Tx/Grandpa substream retry-on-failure with a 2-second deferred retry to avoid starving WebRTC connections with rapid substream open attempts.
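The deferred retry can be pictured roughly as follows. This is a standalone sketch that assumes a fixed 2-second delay and a FIFO queue; `RetryQueue`, `PendingRetry`, the `u32` peer identifier, and the method names are all hypothetical and do not mirror the actual `PendingNotificationOutRetry` code.

```rust
use std::collections::VecDeque;
use std::time::{Duration, Instant};

/// Delay before a refused substream is retried (2 seconds, per the PR).
const RETRY_DELAY: Duration = Duration::from_secs(2);

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Protocol {
    Transactions,
    Grandpa,
}

/// One failed outbound substream attempt waiting to be retried.
struct PendingRetry {
    peer: u32, // hypothetical peer identifier
    protocol: Protocol,
    retry_at: Instant,
}

/// FIFO queue of deferred retries. Because the delay is constant and
/// failures are recorded in chronological order, the front entry is
/// always the one that becomes due first.
#[derive(Default)]
struct RetryQueue {
    pending: VecDeque<PendingRetry>,
}

impl RetryQueue {
    /// Record a refused substream instead of retrying immediately.
    fn on_open_failed(&mut self, peer: u32, protocol: Protocol, now: Instant) {
        self.pending.push_back(PendingRetry { peer, protocol, retry_at: now + RETRY_DELAY });
    }

    /// When the caller's timer should fire next, if anything is queued.
    fn next_retry_time(&self) -> Option<Instant> {
        self.pending.front().map(|retry| retry.retry_at)
    }

    /// Remove and return every retry whose deadline has passed.
    fn take_due(&mut self, now: Instant) -> Vec<(u32, Protocol)> {
        let mut due = Vec::new();
        while self.pending.front().map_or(false, |retry| retry.retry_at <= now) {
            let retry = self.pending.pop_front().unwrap();
            due.push((retry.peer, retry.protocol));
        }
        due
    }
}
```

In this sketch, an event loop would call `take_due` at the start of each iteration and sleep until `next_retry_time()` when nothing else is pending, so retries do not depend on unrelated events arriving.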
Route browser-side WebRTC diagnostics through smoldot's log system via a new logCallback field on ConnectionConfig. Add a hasNegotiated guard to prevent unnecessary SDP re-negotiation in webrtc-direct mode, and handle already-open inbound data channels whose onopen event was missed.
Summary
Fixes several issues that cause smoldot light clients connecting over WebRTC to stall during or after warp sync. These were observed against a litep2p-based Polkadot node where the light client would connect, complete the warp sync handshake, but then fail to receive block announcements or Grandpa messages.
Root causes identified and fixed
1. Missed wakeup in WebRTC multi-stream task loop: When the coordinator accepts an inbound notification substream (`AcceptInNotifications`), the handshake response bytes are queued internally, but the platform's `wait_read_write_again` only wakes on incoming network data or timers. On WebRTC, each substream is a separate data channel, so if the remote peer is waiting for the handshake response before sending data on that channel, neither side makes progress. Fixed by adding an `event_listener::Event` that wakes all substream futures when `inject_coordinator_message` queues write data.
2. Warp sync source starts with `finalized_block_height = 0`: When a new peer is added as a warp sync source, its `finalized_block_height` was initialized to 0 instead of the `best_block_number` reported in the gossip handshake. Warp sync only triggers when `source.finalized_block_height > warped_header_number + 32`, so a source at height 0 would never trigger warp sync until a Grandpa neighbor packet arrived (which requires established notification substreams, a chicken-and-egg problem on first connect). Fixed by threading `best_block_number` through `add_source()`.
3. `UnknownTargetBlock` justification error causes 40-second ban: During initial sync, the finality target block hasn't been imported yet, so justification verification returns `UnknownTargetBlock`. This was treated as a ban-worthy error, preventing the only connected peer from being used for 40 seconds. Fixed by excluding this specific error from the ban logic in the standalone sync service.
4. Tx/Grandpa outbound substream retry hammering: When outbound Transactions or Grandpa notification substreams are refused by the peer, smoldot retried immediately with zero delay in a tight loop (~30 retries/second). On WebRTC, this starves the connection and prevents other traffic from flowing. Additionally, litep2p requires these outbound substreams to be negotiated before it considers the notification protocols established; without them, it won't send block announcements even though the block announce substream itself was successfully negotiated. Fixed by replacing the immediate retry with a deferred retry queue: failed attempts are stored with a 2-second delay, processed at the top of `next_event()`, and an async timer branch in the event loop ensures retries fire at the correct time rather than piggy-backing on unrelated events.
Changes by commit
1. Fix WebRTC notification handshake stall in multi-stream task loop
- Add `coordinator_write_ready` event to wake substream futures when `inject_coordinator_message` queues write data
- Race `wait_read_write_again` against the write-ready notification
2. Fix warp sync initialization and UnknownTargetBlock ban during initial sync
- `warp_sync::AddSource` now takes a `best_block_number` parameter, used as the initial `finalized_block_height`
- Thread `best_block_number` through `all::AddSource*` structs
- Exclude `UnknownTargetBlock` justification errors from peer ban in standalone sync service
3. Add diagnostic logging and deferred notification retry mechanism
- Add `substream_id` to `connection-activity` log line in multi-stream task loop
- Add `GossipInboundResult` event to surface inbound notification substream outcomes
- Replace immediate Tx/Grandpa substream retry with `PendingNotificationOutRetry` queue (2-second delay)
- Add `next_notification_retry_time()` method to `ChainNetwork` for caller timer integration
4. Add WebRTC diagnostic logging and fix inbound data channel handling
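- Route browser-side WebRTC diagnostics through smoldot's log system via `logCallback`
- Add `hasNegotiated` guard to prevent unnecessary SDP re-negotiation
- Handle already-open inbound data channels whose `onopen` event was missed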
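As an illustration of the `UnknownTargetBlock` exclusion from commit 2, the ban decision can be thought of as a mapping from error to an optional ban duration. The enum below is a hypothetical simplification: only `UnknownTargetBlock` and the 40-second duration come from this PR, while `InvalidSignature` and `ban_duration_for` are placeholder names for illustration.

```rust
use std::time::Duration;

/// Ban length applied to peers that send bad justifications (40 s, per the PR).
const BAN_DURATION: Duration = Duration::from_secs(40);

/// Simplified, hypothetical error type for justification verification.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum JustificationError {
    /// The finality target has not been imported yet; expected during
    /// initial catch-up and therefore benign.
    UnknownTargetBlock,
    /// Placeholder for any genuinely invalid justification.
    InvalidSignature,
}

/// Returns how long the sending peer should be banned, if at all.
/// The match is exhaustive so that adding a new error variant forces
/// an explicit decision about whether it is ban-worthy.
fn ban_duration_for(error: JustificationError) -> Option<Duration> {
    match error {
        JustificationError::UnknownTargetBlock => None,
        JustificationError::InvalidSignature => Some(BAN_DURATION),
    }
}
```

Returning `None` for `UnknownTargetBlock` keeps the only connected peer usable while the target block is still being downloaded.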