Skip to content

fix: guard processSendQueue against non-writable stream status#3416

Open
lodekeeper wants to merge 5 commits intolibp2p:mainfrom
lodekeeper:fix/stream-drain-race-guard
Open

fix: guard processSendQueue against non-writable stream status#3416
lodekeeper wants to merge 5 commits intolibp2p:mainfrom
lodekeeper:fix/stream-drain-race-guard

Conversation

@lodekeeper
Copy link
Copy Markdown
Contributor

What

Add a writeStatus guard at the top of processSendQueue() in abstract-message-stream.ts to bail out when the stream is no longer writable.

Why

A race condition causes an uncaught StreamStateError when a drain event fires on a stream whose underlying transport is already closing:

  1. TCP connection starts closing → writeStatus transitions to 'closing'
  2. A drain event fires on the underlying TCP stream
  3. The noise layer relays this as safeDispatchEvent('drain') on the encrypted stream
  4. continueSendingOnDrainprocessSendQueue()sendData()send() on the now-closing stream
  5. send() throws StreamStateError — inside an event listener with no try/catch → uncaught exception

Observed on Lodestar mainnet (v1.41.0, @libp2p/utils@7.0.13): 6 uncaught exceptions in 7 days.

Fix

Single guard at the top of processSendQueue(), consistent with the existing bail-out pattern:

if (this.writeStatus !== 'writable') {
  this.log.trace('not processing send queue as stream write status is %s', this.writeStatus)
  return false
}

Test

Added a regression test that:

  1. Creates a stream pair with capacity: 1 to trigger backpressure
  2. Fills the send buffer until send() returns false
  3. Sets writeStatus to 'closing'
  4. Dispatches a drain event
  5. Asserts sendData is NOT called (the guard catches it) and no error is thrown

Closes #3415

@lodekeeper lodekeeper requested a review from a team as a code owner March 21, 2026 16:24
@lodekeeper lodekeeper changed the title fix(@libp2p/utils): guard processSendQueue against non-writable stream status fix(@libp2p/utils): guard processSendQueue on non-writable stream Mar 21, 2026
@lodekeeper lodekeeper changed the title fix(@libp2p/utils): guard processSendQueue on non-writable stream fix: guard processSendQueue against non-writable stream status Mar 21, 2026
@lodekeeper lodekeeper force-pushed the fix/stream-drain-race-guard branch 3 times, most recently from 9538027 to c1c6584 Compare March 21, 2026 17:06
Add a try-catch around sendData() in processSendQueue() to catch
StreamStateError when the underlying transport closes between a drain
event and the send attempt. This prevents the error from propagating
as an uncaught exception.

Also add a writeStatus guard in continueSendingOnDrain to skip
processing when the stream has fully closed.

Fixes libp2p#3415
@lodekeeper lodekeeper force-pushed the fix/stream-drain-race-guard branch from c1c6584 to a8c2cf2 Compare March 21, 2026 17:06
@dozyio
Copy link
Copy Markdown
Collaborator

dozyio commented Apr 6, 2026

@lodekeeper - can you review this commit 17d9da8 please

@dozyio dozyio requested a review from tabcat April 6, 2026 12:14
@lodekeeper
Copy link
Copy Markdown
Contributor Author

@dozyio Reviewed commit 17d9da8 — looks good overall.

abstract-message-stream.ts: The willSend defensive copy + comment makes the intent clear — guard against sendData mutating the slice before throwing. The rename is cleaner than the original toSend requeue. No issues here.

Tests: Both cases are solid:

  • preserve on error: verifies buffer is intact after a StreamStateError on drain ✓
  • no duplication on transient error: the throwOnce pattern correctly proves data isn't counted twice across two drain events ✓

One minor note: the second test sets writeStatus = 'closing' before the first drain but the continueSendingOnDrain guard only bails on statuses that are neither 'writable' nor 'closing'. So 'closing' still reaches the try/catch — which is the right flow for the test to exercise. Just worth a comment in the test body if you want it to be self-documenting, but it's not required.

Happy to address any remaining feedback before merge.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

bug: uncaught StreamStateError when drain event fires on closing stream

3 participants