fix: prevent infinite loop in _updateSubscriptions when WebSocket is closing#3778
Open
Kropiunig wants to merge 1 commit intosolana-foundation:maintenance/v1.xfrom
Conversation
…closing isCurrentConnectionStillActive() only checked the connection generation counter, which doesn't change until the 'close' event fires. When the underlying WebSocket enters the CLOSING (readyState 2) or CLOSED (readyState 3) state before that event propagates, RPC calls fail immediately but the guard considers the connection active, causing _updateSubscriptions() to retry indefinitely in a tight recursive loop. This manifests in production as a storm of error logs: accountUnsubscribe error: Tried to call a JSON-RPC method `accountUnsubscribe` but the socket was not `CONNECTING` or `OPEN` (`readyState` was 2) The fix adds a readyState check to isCurrentConnectionStillActive() so that it returns false when the socket is no longer in a usable state (CONNECTING or OPEN), breaking the retry loop. Changes: - Expose readyState getter on RpcWebSocketClient (delegates to the underlying socket, defaults to CLOSED when no socket exists) - Guard isCurrentConnectionStillActive() against CLOSING/CLOSED state - Update test stub to track readyState based on mock open/close state (fixes 14 pre-existing test failures in retry test suite) - Add regression tests for CLOSING-during-unsubscribe and CLOSED-during-subscribe scenarios Fixes solana-foundation#3761
Author
|
Hi @jacobcreech — pinging for visibility. This PR fixes an infinite spin-loop when a WebSocket transitions to |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
Fixes #3761
When a WebSocket connection transitions to
CLOSING(readyState 2) orCLOSED(readyState 3) state, there is a window between the socket becoming unusable and thecloseevent firing. During this window:_rpcWebSocketConnectedis stilltrue(it only flips when_wsOnCloseruns)_rpcWebSocketGenerationhasn't incremented yetisCurrentConnectionStillActive()still returnstrueAny RPC call (
call/notify) on theRpcWebSocketClientwill reject immediately with:But the error handler in
_updateSubscriptions()sees the connection as still active, resets the subscription state, and calls_updateSubscriptions()recursively — which immediately fails again, creating a tight infinite loop that floods logs and can crash the process.Real-world impact: This has been observed in production with Drift Protocol SDK users on Railway, where the log storm reached 500 logs/sec and caused instance crashes.
Root Cause
isCurrentConnectionStillActive()only checks the generation counter:This misses the case where the socket is dying but the
closeevent hasn't fired yet. ThereadyStatetransitions happen synchronously on the socket, but thecloseevent is asynchronous — there's a real gap where the generation counter is stale.Solution
Expose
readyStateonRpcWebSocketClientvia a getter that delegates to the underlying socket (defaults toCLOSEDwhen no socket exists)Guard
isCurrentConnectionStillActive()to also verify the socket is in a usable state (CONNECTINGorOPEN):This breaks the recursive loop immediately when the socket is no longer usable, allowing the normal reconnection flow to take over when the
closeevent eventually fires.Changes
src/rpc-websocket.tsreadyStategetter (+3 lines)src/connection.tsisCurrentConnectionStillActive()with readyState check (+10 lines)test/connection-subscriptions.test.tsTesting
readyState— fixed by having the stub track open/close state)_updateSubscriptions()doesn't exceed a reasonable call count (< 5), catching any infinite loopNotes
0,1,3) instead ofWebSocket.CONNECTING/OPEN/CLOSEDto avoid Node.js/browser compatibility issues, consistent with the existing pattern inrpc-websocket.ts