feat(vscode): add retry logic and auto-reconnect for ACP connection#2666
feat(vscode): add retry logic and auto-reconnect for ACP connection#2666tanzhenxin merged 2 commits intoQwenLM:mainfrom
Conversation
- Add connectWithRetry() method with exponential backoff (3 retries max) - Add cleanupForRetry() to clean partial state between retry attempts - Improve readiness detection with proper timeout and exit handling - Add timeout for ACP initialize handshake (15s) - Add auto-reconnect on unexpected subprocess exit (max 1 attempt) - Track intentionalDisconnect flag to skip auto-reconnect for user-initiated disconnects - Add onAutoReconnectFailed callback for user notification when auto-reconnect fails - Show VS Code warning notification with 'Retry Connection' action button - Add comprehensive tests for retry logic, cleanup, and intentional disconnect flag Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
📋 Review SummaryThis PR adds robust retry logic and auto-reconnect functionality to the VS Code IDE companion's ACP (Agent Client Protocol) connection handling. The implementation is well-structured with comprehensive test coverage, proper timeout handling, and thoughtful error management. Overall, this is a solid improvement to the extension's reliability. 🔍 General Feedback
🎯 Specific Feedback🟡 High
🟢 Medium
🔵 Low
✅ Highlights
|
…scode-acp-reconnect-logic
revert: PR #2666 ACP retry/reconnect logic
TLDR
This PR adds robust retry logic and auto-reconnect functionality to the VS Code IDE companions ACP (Agent Client Protocol) connection handling. Key changes include:
connectWithRetry()method with exponential backoff (up to 3 retries)Screenshots / Video Demo
Dive Deeper
Motivation
Previously, if the Qwen CLI subprocess failed to start or crashed unexpectedly, the VS Code extension would fail silently or require manual reconnection. This improves reliability by:
Handling transient spawn failures: Network issues, race conditions, or temporary resource constraints can cause subprocess spawn to fail. Retry logic with exponential backoff helps recover from these transient failures.
Automatic recovery from crashes: If the CLI process crashes during a session (e.g., due to OOM, SIGTERM), the extension now attempts to automatically reconnect once, preserving user workflow continuity.
Better timeout management: The previous fixed 1s delay for readiness was unreliable. Now we use proper event-driven readiness detection with a 10s timeout, and the initialize handshake has a 15s timeout.
Implementation Details
connectWithRetry()wraps the fullconnect()call with retry logiccleanupForRetry()ensures clean state between retry attempts by killing zombie processes and resetting connection stateintentionalDisconnectflag prevents auto-reconnect when user explicitly disconnectsautoReconnectAttemptscounter prevents infinite reconnection loopsonAutoReconnectFailedcallback notifies the WebViewProvider to show user-facing error messagesReviewer Test Plan
Testing Matrix
Linked issues / bugs
Resolves #2634
🤖 Generated with Qwen Code