fix(channel): reconnect WhatsApp Web session and show QR on logout#3045
Conversation
When WhatsApp revokes the session (e.g. phone logout, device limit), Event::LoggedOut was only logged; the channel stayed in a broken state with no way to recover without a manual restart and no new QR code. Wrap the bot lifecycle in a reconnect loop: - On Event::LoggedOut, signal via a broadcast channel instead of only logging. - The listen() loop receives the signal, tears down the bot handle, deletes the stale session DB file, waits 3 s, then restarts. - On restart, no session file is found → fresh pairing flow begins → QR code is printed to stderr so the operator can re-link the device. - Ctrl-C / clean shutdown still exits the loop normally (no reconnect). No config changes, no new dependencies, no public API changes. Rollback: revert this commit; channel reverts to warn-only behaviour. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
Note
|
| Cohort / File(s) | Summary |
|---|---|
WhatsApp Web channel src/channels/whatsapp_web.rs |
Replaces one-shot startup with a persistent session loop: expands ~ in session path each iteration, re-inits RusqliteStore, optionally loads device, supports WHATSAPP_WS_URL transport factory, adds UreqHttpClient, introduces logout broadcast channel, richer event handling (Connected, LoggedOut, StreamError, PairingCode/PairingQrCode, Incoming messages), sender candidate resolution and allowlist checks, session-file removal on logout to force re-pair, exponential backoff (BASE_DELAY_SECS, MAX_DELAY_SECS) with MAX_RETRIES, and bot handle lifecycle management (start/abort/cleanup). |
Sequence Diagram
sequenceDiagram
participant MainLoop as Main Loop
participant Store as Storage (RusqliteStore / session file)
participant Transport as Transport Factory
participant HTTP as Media HTTP Client
participant Bot as Bot Client
participant Events as Event Handler
participant LogoutCh as Logout Channel
MainLoop->>Store: Expand session path & (re)initialize store
MainLoop->>Transport: Create transport (opt. WHATSAPP_WS_URL)
MainLoop->>HTTP: Init media HTTP client (UreqHttpClient)
loop Session lifecycle per-iteration
MainLoop->>Store: Load saved device (if present)
alt device loaded
MainLoop->>Bot: Build & start bot (with store, transport, HTTP)
Bot->>Events: Emit events (Connected, Incoming, Pairing, StreamError, LoggedOut)
else no device => pairing
MainLoop->>Bot: Start pairing flow (QR / code)
Bot->>Events: Emit PairingQrCode / PairingCode
end
Events->>Bot: Handle Incoming -> resolve sender candidates -> allowlist check -> forward ChannelMessage
Events->>LogoutCh: Send logout on LoggedOut or explicit logout
LogoutCh->>MainLoop: Notify to cleanup
MainLoop->>Store: Remove session file (if logout forces re-pair)
MainLoop->>MainLoop: Compute backoff (exponential, capped) and increment retry
MainLoop->>MainLoop: Retry loop or exit after MAX_RETRIES
end
Estimated code review effort
🎯 4 (Complex) | ⏱️ ~45 minutes
Suggested labels
size: S
Suggested reviewers
- theonlyhennygod
🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title concisely summarizes the primary change: adding reconnection with QR display on WhatsApp logout. |
| Description check | ✅ Passed | The description comprehensively addresses all template sections with clear detail: problem/solution, metadata, scope boundaries, validation evidence (with cross-compile explanation), security/privacy/compatibility confirmations, human verification, risks/mitigations, and rollback plan. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
✨ Finishing Touches
🧪 Generate unit tests (beta)
- Create PR with unit tests
- Post copyable unit tests in a comment
Tip
Try Coding Plans. Let us write the prompt for your AI agent so you can ship faster (with fewer bugs).
Share your feedback on Discord.
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Comment @coderabbitai help to get the list of available commands and usage tips.
|
@chumyin — tagging for assignee gate per contribution protocol. Please assign yourself if this is ready for triage. |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
src/channels/whatsapp_web.rs (1)
506-508: Consider bounded retries or exponential backoff.A fixed 3-second delay mitigates rapid spinning, but on permanent auth failures (e.g., account banned), this loop will retry indefinitely. Consider adding a max retry count or exponential backoff with a cap to fail fast after repeated failures.
♻️ Example: bounded retry with exponential backoff
// At the start of listen(), before the loop: let mut retry_count = 0u32; const MAX_RETRIES: u32 = 10; const BASE_DELAY_SECS: u64 = 3; const MAX_DELAY_SECS: u64 = 300; // 5 minutes cap // ...inside the reconnect branch: if should_reconnect { retry_count += 1; if retry_count > MAX_RETRIES { anyhow::bail!( "WhatsApp Web: exceeded {} reconnect attempts, giving up", MAX_RETRIES ); } // ... file removal ... let delay = std::cmp::min( BASE_DELAY_SECS * 2u64.saturating_pow(retry_count - 1), MAX_DELAY_SECS, ); tracing::info!("WhatsApp Web: reconnecting in {}s (attempt {}/{})", delay, retry_count, MAX_RETRIES); tokio::time::sleep(std::time::Duration::from_secs(delay)).await; continue; }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/channels/whatsapp_web.rs` around lines 506 - 508, The reconnect loop in listen() currently uses a fixed 3s sleep which can cause infinite retries on permanent failures; implement a bounded retry with exponential backoff: add a retry counter (e.g., retry_count) and constants like MAX_RETRIES, BASE_DELAY_SECS and MAX_DELAY_SECS, increment retry_count on each reconnect attempt, bail/error out (or return) when retry_count > MAX_RETRIES, compute delay as min(BASE_DELAY_SECS * 2^(retry_count-1), MAX_DELAY_SECS), log attempt number and delay before sleeping, and reset retry_count on a successful connection to avoid permanent backoff.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/channels/whatsapp_web.rs`:
- Around line 351-356: The current tracing::info! call logs raw message text
(variables sender, chat, text); change this to avoid sensitive data by either
using tracing::debug! for the full text and keeping info-level logs minimal, or
redact/truncate the text before logging (e.g., replace text with a
masked/first-N-chars version or "[REDACTED]") in the handler that calls
tracing::info! so that only non-sensitive context (sender, chat) is logged at
info level and full content is only available under debug/trace.
- Around line 296-299: The session path is passed raw to RusqliteStore::new
(using self.session_path) while cleanup uses shellexpand::tilde, causing
mismatched file locations; fix by expanding the path once up front (use
shellexpand::tilde or equivalent) into a single variable (e.g.,
expanded_session_path) and pass that expanded path to RusqliteStore::new and
store it for use in the cleanup logic so both creation and deletion operate on
the same expanded path.
---
Nitpick comments:
In `@src/channels/whatsapp_web.rs`:
- Around line 506-508: The reconnect loop in listen() currently uses a fixed 3s
sleep which can cause infinite retries on permanent failures; implement a
bounded retry with exponential backoff: add a retry counter (e.g., retry_count)
and constants like MAX_RETRIES, BASE_DELAY_SECS and MAX_DELAY_SECS, increment
retry_count on each reconnect attempt, bail/error out (or return) when
retry_count > MAX_RETRIES, compute delay as min(BASE_DELAY_SECS *
2^(retry_count-1), MAX_DELAY_SECS), log attempt number and delay before
sleeping, and reset retry_count on a successful connection to avoid permanent
backoff.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 95da016e-d814-455e-ab0e-cf10054485f7
📒 Files selected for processing (1)
src/channels/whatsapp_web.rs
…d retries - Expand session_path with shellexpand::tilde upfront so RusqliteStore and session cleanup both use the same resolved path - Move raw message content from info! to debug! level to avoid logging sensitive payloads; info! now logs only message length - Replace fixed 3s reconnect delay with exponential backoff (3s base, 5m cap, max 10 retries) and reset counter on successful connection Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/channels/whatsapp_web.rs`:
- Around line 332-347: The current wait on logout_rx.recv() is unreachable for
an Err because the outer logout_tx remains in scope; drop the loop-owned sender
(logout_tx) before entering the select! in listen() so recv() can return Err
when all senders are gone, or alternatively include the bot task handle in the
select! (await the bot handle completion) so shutdown of the Bot::builder
spawned task is observed; ensure you reference and drop/logout_tx (or add the
bot handle join branch) in the same scope where you call logout_rx.recv() so the
recv() path for task shutdown can fire.
- Around line 487-489: The retry_count is being reset immediately after
bot.run() returns (resetting retry_count = 0) which allows a rapid loop on auth
failures; instead move and perform the reset only when the session proves
recovery (i.e., inside the Event::Connected branch of the event loop). Update
the logic around bot.run(), the event handling for Event::Connected and
Event::LoggedOut, and any other similar reset sites (referencing retry_count,
bot.run(), Event::Connected, LoggedOut, and MAX_RETRIES) so that retry_count is
cleared only when Event::Connected is observed; remove or undo resets at
bot.run() return and the duplicated area around the 509-540 handling.
- Around line 431-455: The Event::PairingCode and Event::PairingQrCode handlers
are currently emitting pairing credentials via tracing::info (and tracing::warn
logs the QR error), which exposes secrets to structured log sinks; change these
to print only to the terminal (e.g., eprintln! or stdout) and remove the raw
payload from any tracing:: logs inside the Event::PairingCode and
Event::PairingQrCode match arms and in the render_pairing_qr error path
(referenced symbols: Event::PairingCode, Event::PairingQrCode,
Self::render_pairing_qr, tracing::info); keep non-sensitive status messages in
tracing but ensure the actual code/QR payload is written only to terminal output
and not passed to tracing macros.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 462572f9-e229-478f-b94e-4a351167875f
📒 Files selected for processing (1)
src/channels/whatsapp_web.rs
…reset - Drop outer logout_tx before select! so recv() returns Err when the bot task dies without emitting LoggedOut, preventing listen() from hanging indefinitely on a dead session - Remove pair code and QR payload from tracing::info! (login secrets); print them only to stderr via eprintln! for operator visibility - Move retry_count reset from bot.run() return into Event::Connected handler via Arc<AtomicU32> so rapid logout loops without successful connection correctly accumulate toward MAX_RETRIES Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
src/channels/whatsapp_web.rs (1)
360-418:⚠️ Potential issue | 🟠 MajorRemove raw user data from these logs.
Line 367 still emits the full inbound message body, and Line 415 dumps normalized sender candidates. Both are sensitive channel data; keep only metadata/counts here.
🛡️ Minimal safe change
tracing::info!( "WhatsApp Web message from {} in {} (len={})", sender, chat, text.len() ); - tracing::debug!( - "WhatsApp Web message content: {}", - text - ); + tracing::debug!("WhatsApp Web message received"); ... } else { tracing::warn!( - "WhatsApp Web: message from {} not in allowed list (candidates: {:?})", - sender_jid, - sender_candidates + "WhatsApp Web: message from {} not in allowed list", + sender_jid ); }As per coding guidelines: "Deny-by-default for access and exposure boundaries; never log secrets, raw tokens, or sensitive payloads; keep network/filesystem/shell scope as narrow as possible unless explicitly justified"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/channels/whatsapp_web.rs` around lines 360 - 418, The logs currently emit raw message body and sensitive sender candidates; update the tracing calls in this block to avoid logging raw user data: remove the tracing::debug that logs "WhatsApp Web message content: {}" with text and change the tracing::warn that dumps sender_candidates and sender_jid to log only non-sensitive metadata (e.g., message length, trimmed-empty flag, and the count of sender_candidates or a masked/hashed identifier). Ensure references to Self::sender_phone_candidates, sender_jid, tx_inner.send(ChannelMessage { ... }), and the tracing calls are updated so no raw message text or full phone candidate lists are emitted.
🧹 Nitpick comments (1)
src/channels/whatsapp_web.rs (1)
291-555: Add regression coverage for the reconnect lifecycle.This is now the core
listen()path, but the test module still does not exerciseLoggedOut→ cleanup → retry/backoff or theEvent::Connectedretry reset. Even a small helper-level test around that state machine would make the newlisten/health_checksemantics much safer.As per coding guidelines: "Implement
Channeltrait insrc/channels/, keepsend,listen,health_check, typing semantics consistent, cover auth/allowlist/health behavior with tests"🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/channels/whatsapp_web.rs` around lines 291 - 555, The test suite is missing regression coverage for the reconnect lifecycle inside listen(): add tests that drive the Event flow (simulate Event::LoggedOut causing session cleanup and reconnect/backoff) and verify retry_count is incremented and capped by MAX_RETRIES and that Event::Connected resets retry_count to 0; specifically, write helper-level tests that exercise WhatsAppWebChannel::listen (or the test harness that builds the Bot) to assert that when LoggedOut is emitted the code removes the session file (the expanded_session_path behavior), schedules a backoff delay using BASE_DELAY_SECS/ MAX_DELAY_SECS logic, and retries up to MAX_RETRIES, and separately that a Connected event resets the AtomicU32 retry_count to zero; use mocks/stubs for Bot.run()/on_event and RusqliteStore (or inject a temp session file) to avoid real network I/O and to observe calls to tokio::fs::remove_file and the retry atomic (retry_count) and health_check/send behavior as part of the assertions.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/channels/whatsapp_web.rs`:
- Around line 512-536: The cleanup currently aborts the bot task and then
removes the session DB file while the SQLite-backed session store (and variables
like bot, device, backend) are still held, risking WAL/SHM sidecars and races;
before deleting expanded_session_path, explicitly drop/close the SQLite session
store instance (release any references to the DB so PRAGMA WAL files are closed)
and fully await shutdown of the bot task instead of just calling handle.abort()
(i.e., call abort and then await the JoinHandle so background I/O finishes),
then remove the primary DB file and its "-wal" and "-shm" sidecar files (handle
NotFound errors) to ensure a clean fresh re-pair; use the existing symbols
self.client.lock(), self.bot_handle.lock().take(), retry_count, and
expanded_session_path to locate where to insert the drop/await and sidecar
deletions.
---
Duplicate comments:
In `@src/channels/whatsapp_web.rs`:
- Around line 360-418: The logs currently emit raw message body and sensitive
sender candidates; update the tracing calls in this block to avoid logging raw
user data: remove the tracing::debug that logs "WhatsApp Web message content:
{}" with text and change the tracing::warn that dumps sender_candidates and
sender_jid to log only non-sensitive metadata (e.g., message length,
trimmed-empty flag, and the count of sender_candidates or a masked/hashed
identifier). Ensure references to Self::sender_phone_candidates, sender_jid,
tx_inner.send(ChannelMessage { ... }), and the tracing calls are updated so no
raw message text or full phone candidate lists are emitted.
---
Nitpick comments:
In `@src/channels/whatsapp_web.rs`:
- Around line 291-555: The test suite is missing regression coverage for the
reconnect lifecycle inside listen(): add tests that drive the Event flow
(simulate Event::LoggedOut causing session cleanup and reconnect/backoff) and
verify retry_count is incremented and capped by MAX_RETRIES and that
Event::Connected resets retry_count to 0; specifically, write helper-level tests
that exercise WhatsAppWebChannel::listen (or the test harness that builds the
Bot) to assert that when LoggedOut is emitted the code removes the session file
(the expanded_session_path behavior), schedules a backoff delay using
BASE_DELAY_SECS/ MAX_DELAY_SECS logic, and retries up to MAX_RETRIES, and
separately that a Connected event resets the AtomicU32 retry_count to zero; use
mocks/stubs for Bot.run()/on_event and RusqliteStore (or inject a temp session
file) to avoid real network I/O and to observe calls to tokio::fs::remove_file
and the retry atomic (retry_count) and health_check/send behavior as part of the
assertions.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: c5ba89fa-72d2-400e-b83a-8fe3504c0131
📒 Files selected for processing (1)
src/channels/whatsapp_web.rs
- Await the aborted bot JoinHandle so background I/O finishes before file deletion, preventing races with SQLite writes - Explicitly drop bot/device/backend to close the SQLite connection and release WAL/SHM file locks before removal - Delete -wal and -shm sidecar files alongside the primary DB to ensure a fully clean session for fresh QR re-pairing Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
src/channels/whatsapp_web.rs (1)
355-418:⚠️ Potential issue | 🟠 MajorRemove raw message bodies and sender identifiers from logs.
This handler still writes message text at Line 366 and raw sender/chat/candidate identifiers at Lines 360-365 and 414-418. That leaks user payloads and phone/JID data into log sinks. Keep only redacted metadata such as lengths or counts.
As per coding guidelines: "never log secrets, raw tokens, or sensitive payloads".🔐 Safer logging shape
- let sender = sender_jid.user().to_string(); let chat = info.source.chat.to_string(); - tracing::info!( - "WhatsApp Web message from {} in {} (len={})", - sender, - chat, - text.len() - ); - tracing::debug!( - "WhatsApp Web message content: {}", - text - ); + tracing::info!( + "WhatsApp Web message received (len={})", + text.len() + ); ... - tracing::warn!( - "WhatsApp Web: message from {} not in allowed list (candidates: {:?})", - sender_jid, - sender_candidates - ); + tracing::warn!( + "WhatsApp Web: message rejected by allowlist (candidate_count={})", + sender_candidates.len() + );🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/channels/whatsapp_web.rs` around lines 355 - 418, Remove logging of raw message bodies and full sender/chat identifiers: stop emitting `text`, `sender_jid`, `sender_alt`, `mapped_phone` and `sender_candidates` contents. In the blocks around the calls to Self::sender_phone_candidates, Self::is_number_allowed_for_list, and when sending ChannelMessage via tx_inner, replace the tracing::info/debug/warn messages to only emit non-sensitive metadata such as text length (e.g., text.len()), counts (e.g., sender_candidates.len()), and redacted/minimal identifiers (e.g., masked sender or chat length) or boolean flags (allowed/blocked). Ensure ChannelMessage still contains the real values, but logs never print full phone/JID or message body; reference symbols to change: `tracing::info!`/`tracing::debug!`/`tracing::warn!`, `sender_jid`, `sender_alt`, `mapped_phone`, `sender_candidates`, `tx_inner`, and `ChannelMessage`.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/channels/whatsapp_web.rs`:
- Around line 500-551: The code currently treats logout_rx.recv() Err the same
as an explicit logout and always deletes the session DB; change the logic to
only purge session files when an actual Event::LoggedOut was observed. Add a
shared boolean/atomic (e.g., logged_out_flag) that your bot/event handler sets
to true when it emits Event::LoggedOut, update the select! handling around
logout_rx.recv() to distinguish Ok(()) vs Err and set should_reconnect
accordingly, and wrap the session file removal loop (the for loop that removes
expanded_session_path, "-wal", "-shm") in a check of that logged_out_flag so
files are only removed after a confirmed logout. Ensure symbols referenced:
logout_rx.recv(), Event::LoggedOut, retry_count, expanded_session_path, and the
session file removal loop are updated.
---
Duplicate comments:
In `@src/channels/whatsapp_web.rs`:
- Around line 355-418: Remove logging of raw message bodies and full sender/chat
identifiers: stop emitting `text`, `sender_jid`, `sender_alt`, `mapped_phone`
and `sender_candidates` contents. In the blocks around the calls to
Self::sender_phone_candidates, Self::is_number_allowed_for_list, and when
sending ChannelMessage via tx_inner, replace the tracing::info/debug/warn
messages to only emit non-sensitive metadata such as text length (e.g.,
text.len()), counts (e.g., sender_candidates.len()), and redacted/minimal
identifiers (e.g., masked sender or chat length) or boolean flags
(allowed/blocked). Ensure ChannelMessage still contains the real values, but
logs never print full phone/JID or message body; reference symbols to change:
`tracing::info!`/`tracing::debug!`/`tracing::warn!`, `sender_jid`, `sender_alt`,
`mapped_phone`, `sender_candidates`, `tx_inner`, and `ChannelMessage`.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 16a049b8-d6f8-438c-9889-25a3c200be96
📒 Files selected for processing (1)
src/channels/whatsapp_web.rs
|
Core reconnect logic looks solid — exponential backoff, broadcast signaling, and session cleanup are well-structured. Two items before merge:
Happy to approve once those are addressed. |
Move raw sender and chat JID values from info/warn level to debug level. Info and warn logs now show only lengths/counts per project security conventions for structured logging of identifiable data. Addresses review feedback from rikitrader on PR zeroclaw-labs#3045.
|
Addressed the review feedback on log redaction: Lines 360-365 (info-level): Replaced raw Lines 414-418 (warn-level): Replaced raw This aligns with the project security conventions in |
Add a shared AtomicBool (session_revoked) that the Event::LoggedOut handler sets to true. Session DB and WAL/SHM sidecars are only deleted when this flag is set. A transient bot task crash (recv Err without LoggedOut) now reconnects with the existing session intact, avoiding unnecessary re-pairing. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add 6 unit tests covering the reconnect lifecycle: - Exponential backoff calculation with cap validation - Retry counter reset on Event::Connected - MAX_RETRIES exceeded detection - session_revoked flag guards file deletion - Broadcast channel signals reconnect on LoggedOut - Dropped sender returns Err (bot crash without LoggedOut) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace raw phone numbers and JID strings with lengths/counts at info and warn log levels per project security conventions. Raw values remain available at debug level for local troubleshooting. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
src/channels/whatsapp_web.rs (1)
323-327: Please avoid adding a hidden transport override in this reconnect patch.
WHATSAPP_WS_URLintroduces a new runtime branch that is outside the stated reconnect scope and broadens the channel’s network behavior without any visible docs/tests in this PR. If there is a real operator use case, I’d split it into a separately justified change; otherwise keep the endpoint fixed here.As per coding guidelines: "Do not add new config keys, trait methods, feature flags, or workflow branches without a concrete accepted use case; do not introduce speculative 'future-proof' abstractions without at least one current caller."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/channels/whatsapp_web.rs` around lines 323 - 327, Remove the hidden runtime branch that reads WHATSAPP_WS_URL and calling TokioWebSocketTransportFactory::with_url; revert to using the fixed transport configuration used for reconnects by deleting the std::env::var("WHATSAPP_WS_URL") check and the transport_factory.with_url(ws_url) override so that the transport_factory variable is created deterministically via TokioWebSocketTransportFactory::new() only; if an operator-configurable endpoint is truly required, implement it in a separate PR with docs/tests rather than in this reconnect change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/channels/whatsapp_web.rs`:
- Around line 822-910: Tests are only exercising atomics and channel primitives,
not the real reconnect/cleanup logic; extract the decision logic into small
production helpers (e.g., compute_retry_delay(attempt: u32) -> Duration,
handle_connected_event(retry_count: &AtomicU32),
should_purge_session(session_revoked: &AtomicBool) -> bool) and update the
reconnect state machine (the Channel impl: listen(), send(), health_check() on
the WhatsApp channel) to call those helpers so unit tests can directly exercise
them; then replace the current tests to call these new helpers (assert
compute_retry_delay matches expected capped doubling, call
handle_connected_event to verify retry reset, call should_purge_session to
verify purge gating) or add a focused integration test that drives the real
listen() state machine on the WhatsAppWeb channel to verify end-to-end
reconnect, purge, and logout/broadcast behavior.
---
Nitpick comments:
In `@src/channels/whatsapp_web.rs`:
- Around line 323-327: Remove the hidden runtime branch that reads
WHATSAPP_WS_URL and calling TokioWebSocketTransportFactory::with_url; revert to
using the fixed transport configuration used for reconnects by deleting the
std::env::var("WHATSAPP_WS_URL") check and the
transport_factory.with_url(ws_url) override so that the transport_factory
variable is created deterministically via TokioWebSocketTransportFactory::new()
only; if an operator-configurable endpoint is truly required, implement it in a
separate PR with docs/tests rather than in this reconnect change.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 9487cd15-dbef-4495-8d94-dd2f9d3a623b
📒 Files selected for processing (1)
src/channels/whatsapp_web.rs
Move retry/cleanup decision logic into small associated functions on WhatsAppWebChannel (compute_retry_delay, record_retry, reset_retry, should_purge_session, session_file_paths) so listen() delegates to them and tests exercise the same code paths used in production. Replaces tests that only validated stdlib primitives (AtomicU32, broadcast) with tests that call the actual helpers. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
`backend` is moved into `Bot::builder().with_backend(backend)`, so calling `drop(backend)` afterwards is a use-after-move compile error. Dropping `bot` already releases the Arc reference to the storage backend. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…d files Auto-formatted with rustfmt 1.8.0-stable (Rust 1.92) to match CI expectations. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
theonlyhennygod
left a comment
There was a problem hiding this comment.
LGTM. Reviewed the reconnect state machine, resource cleanup, atomics, and test coverage.
Fixes applied:
- Removed
drop(backend)use-after-move (backend is moved intowith_backend(), sodrop(bot)already releases the Arc reference) - Fixed rustfmt formatting for whatsapp_web.rs and pre-existing issues in matrix.rs, loop_.rs, etc.
Design is sound: broadcast channel for logout signaling, exponential backoff (3s-300s, max 10 retries), session file cleanup only on explicit LoggedOut, proper resource ordering (client lock → abort handle → drop bot/device).
…session-reconnect-qr # Conflicts: # src/channels/mod.rs
Reverts formatting-only changes to mod.rs, traits.rs, loop_.rs, and observability/traits.rs to avoid triggering pre-existing clippy warnings that are unrelated to this PR. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
theonlyhennygod
left a comment
There was a problem hiding this comment.
LGTM — all CI checks pass. Reconnect logic is sound.
…eroclaw-labs#3045) Wraps the WhatsApp Web listen() in a reconnect loop. When Event::LoggedOut fires, the bot is torn down, stale session DB files are deleted, and after exponential backoff (3s base, 300s cap, max 10 retries) the loop restarts triggering fresh QR pairing. - Broadcast channel for logout signaling from event handler to listen loop - Session file cleanup (primary + WAL + SHM) only on explicit LoggedOut - Proper resource ordering: client lock, abort handle, drop bot/device - Tests for retry delay, counter, session purge, and file paths helpers
…eroclaw-labs#3045) Wraps the WhatsApp Web listen() in a reconnect loop. When Event::LoggedOut fires, the bot is torn down, stale session DB files are deleted, and after exponential backoff (3s base, 300s cap, max 10 retries) the loop restarts triggering fresh QR pairing. - Broadcast channel for logout signaling from event handler to listen loop - Session file cleanup (primary + WAL + SHM) only on explicit LoggedOut - Proper resource ordering: client lock, abort handle, drop bot/device - Tests for retry delay, counter, session purge, and file paths helpers
Summary
masterEvent::LoggedOutwas only logged as a warning — the channel stayed in a permanently broken state requiring a manual process restart. No QR code was shown.listen(). OnLoggedOut, a broadcast signal is sent to the loop, the bot is torn down, the stale session DB file is deleted, and after a 3-second backoff the loop restarts — triggering fresh QR pairing printed to stderr.Channeltrait signature, allowed-numbers logic, pairing-code flow, media handling, or any other subsystem.Label Snapshot (required)
risk: lowsize: Schannelchannel: whatsappChange Metadata
bugchannelLinked Issue
Supersede Attribution (required when
Supersedes #is used)N/A — this PR does not supersede any prior PR.
Validation Evidence (required)
src/channels/whatsapp_web.rswith no new dependencies, no schema changes, and no logic outside thelisten()function body.listen()control flow — all existing logic is preserved verbatim; only the loop wrapper and broadcast channel are added.shellexpand(already a dependency at3.1) andtokio::time::sleep(already in scope) are the only additions.Security Impact (required)
session_pathon logout; no new path accessYes: N/APrivacy and Data Hygiene (required)
passCompatibility / Migration
i18n Follow-Through (required when docs or user-facing wording changes)
Human Verification (required)
listen()function read in full; fix applied surgically to that exact version (not the android-branch variant)listen()is modified; all existing event handlers, pairing-code flow, and message routing are unchangedshellexpand::tildeusage matches existing pattern insrc/security/estop.rs:255logout_tx.clone()captured by themoveclosure; each event handler invocation re-clones from the captured copy — correct for multi-call closuresErrorKind::NotFoundarm (no-op)logout_txwithout sending:logout_rx.recv()returnsErr(RecvError)which is also treated as reconnect signalSide Effects / Blast Radius (required)
src/channels/whatsapp_web.rs—listen()onlytracing::warn!andtracing::info!log lines will show reconnect attempts in the runtime logAgent Collaboration Notes (recommended)
listen()→ identified missing reconnect loop → applied surgical patch to upstream version in isolated worktree → committed and pushed from fork → PR to upstream masterchannels/module, no cross-subsystem coupling introducedRollback Plan (required)
git revert d84f2ca6— single commit, clean revert restores warn-only behaviourRisks and Mitigations
NotFoundis a no-op (safe); other errors are logged as warnings and the loop continues — worst case is the next pairing attempt has a stale DB which wa-rs will handle or error on cleanly.Summary by CodeRabbit
New Features
Bug Fixes
Improvements