adapter-openclaw: Telegram chat turns silently dropped after /reset — ~50% persistence gap in W4b path #335

@Jurij89

Problem

Roughly half of Telegram chat turns are silently dropped after a /reset command. The persistence path appears to enter a degraded state: turns sent before the first /reset persist normally to the WM chat-turns assertion, but turns sent after it do not, even though the gateway log still emits a [ChatTurnWriter] Persisted turn (sessionId=…, turnId=…) line for each of them.

This is independent of the memory-slot recall path (T70/T72/T73) — those are search-side; this is a write-side gap. The auto-recall hook and memory_search tool both correctly return what's in the graph; the graph just doesn't have everything that was said.

Reproduction

Live test on a node running feat/dkg-memory-integration at HEAD (commit 1ddd3a6b):

  1. Start gateway, send a Telegram message (turn A).
  2. Wait for [ChatTurnWriter] Persisted turn ... in gateway log → confirms write path.
  3. SPARQL count of chat-turns assertion in WM: baseline N triples.
  4. Send 5 more Telegram turns mentioning a unique keyword (e.g., "Marbella").
  5. SPARQL count: should be N + 5×20 = N+100 triples (each turn pair writes ~20 triples).
  6. SPARQL keyword count for "marbella": should be ~10 (5 user texts + 5 assistant replies).
  7. Send /reset in Telegram.
  8. Send 5 more turns mentioning the same keyword.
  9. SPARQL count: should be N+200, keyword count should be ~20.
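The expected counts in steps 5, 6 and 9 follow from two per-turn-pair figures stated above: ~20 triples and 2 keyword matches (one user text, one assistant reply). A minimal sketch of that arithmetic, assuming the keyword is unique to the test so its baseline count is zero; the function and constant names are illustrative and not part of the adapter:

```typescript
// Assumptions taken from the reproduction steps: each persisted turn pair
// writes ~20 triples and contributes 2 keyword matches.
const TRIPLES_PER_TURN_PAIR = 20;
const KEYWORD_HITS_PER_TURN_PAIR = 2;

interface ExpectedCounts {
  triples: number;
  keywordHits: number;
}

// baselineTriples is the N measured in step 3; turnPairs is the number of
// keyword-bearing turn pairs sent since that baseline.
function expectedCounts(baselineTriples: number, turnPairs: number): ExpectedCounts {
  return {
    triples: baselineTriples + turnPairs * TRIPLES_PER_TURN_PAIR,
    keywordHits: turnPairs * KEYWORD_HITS_PER_TURN_PAIR,
  };
}
```

With 5 turn pairs this gives N+100 triples and 10 keyword matches (steps 5 and 6); with 10 turn pairs it gives N+200 and 20 (step 9). Since the per-pair triple count is approximate, a real check would probably want a tolerance rather than strict equality.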

Observed: post-/reset turns are under-represented in the graph. In one repro run:

  • Pre-/reset: 4 of 4 turns about "Marbella" persisted with the keyword in their text (msg_id 921, 927 + 2 assistant replies), all from the 19:56-19:57 window.
  • Post-/reset: ~12 user turns + ~12 assistant replies mentioning "Marbella" between 20:11-20:19. Total chat-turns triple count grew by only ~180 triples (≈9 turn pairs of 20 triples each), and zero of those new triples contain "marbella" in their text.

# Diagnostic SPARQL (executed at end of run, after all chat activity)
$ curl -s -X POST "http://127.0.0.1:9200/api/query" \
  -H "Authorization: Bearer $TOK" \
  -H "Content-Type: application/json" \
  --data '{
    "sparql": "SELECT (COUNT(*) AS ?c) WHERE { ?s ?p ?o }",
    "contextGraphId": "agent-context",
    "view": "working-memory",
    "agentAddress": "<eth>",
    "assertionName": "chat-turns"
  }'
# → 1024 triples (vs ~1244 expected from full conversation)

$ curl -s -X POST "http://127.0.0.1:9200/api/query" \
  -H "Authorization: Bearer $TOK" \
  -H "Content-Type: application/json" \
  --data '{
    "sparql": "SELECT (COUNT(*) AS ?c) WHERE { ?s ?p ?o . FILTER(isLiteral(?o)) FILTER(CONTAINS(LCASE(STR(?o)), \"marbella\")) }",
    "contextGraphId": "agent-context",
    "view": "working-memory",
    "agentAddress": "<eth>",
    "assertionName": "chat-turns"
  }'
# → 4 matches (vs ~24 expected)

The 4 matches were all from before the first /reset.
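The two requests above differ only in the SPARQL string, and the nested quoting is easy to get wrong by hand. A minimal sketch of building the request body programmatically, so JSON.stringify handles the escaping; the field names and values are copied from the curl payloads, while the helper names are illustrative and the keyword is assumed to be a single-line string:

```typescript
// Field names mirror the curl payloads in this issue.
interface QueryBody {
  sparql: string;
  contextGraphId: string;
  view: string;
  agentAddress: string;
  assertionName: string;
}

// Escape backslashes and double quotes for a double-quoted SPARQL literal.
function escapeSparqlLiteral(value: string): string {
  return value.replace(/\\/g, "\\\\").replace(/"/g, '\\"');
}

// Case-insensitive count of literals containing the keyword.
function keywordCountSparql(keyword: string): string {
  const needle = escapeSparqlLiteral(keyword.toLowerCase());
  return (
    "SELECT (COUNT(*) AS ?c) WHERE { ?s ?p ?o . " +
    `FILTER(isLiteral(?o)) FILTER(CONTAINS(LCASE(STR(?o)), "${needle}")) }`
  );
}

// Serialize the body for POST /api/query; JSON.stringify escapes the
// inner quotes of the SPARQL string.
function buildQueryBody(sparql: string, agentAddress: string): string {
  const body: QueryBody = {
    sparql,
    contextGraphId: "agent-context",
    view: "working-memory",
    agentAddress,
    assertionName: "chat-turns",
  };
  return JSON.stringify(body);
}
```

For example, buildQueryBody(keywordCountSparql("Marbella"), "<eth>") yields a body whose sparql field matches the second query above, which could then be passed to curl via --data or to any HTTP client.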

Hypothesis

The W4b persist path (internal:message:received + internal:message:sent internal hooks → ChatTurnWriter.onMessageReceived / onMessageSent) maintains session-scoped state: a FIFO queue of inbound user messages, a watermark for cross-path dedup, and a markTurnIdSeen reservation set. On /reset, OpenClaw's session-end signal clears its in-memory chat history, but the Writer's internal state may not be cleared in lock-step. Possible failure modes:

  1. Stale FIFO entries: pre-/reset inbound user messages stay in the queue and post-/reset assistant replies pair against them; the resulting turnId collides with already-seen content, so dedup drops the turn.
  2. Stale watermark: the cross-path dedup watermark holds a pre-/reset pairIndex, and post-/reset turns whose computed pairIndex is ≤ watermark are skipped as "already persisted."
  3. Stale markTurnIdSeen reservation: turnIds reserved before /reset are still cached, and post-/reset turns that happen to compute the same turnId (content-hash collision after stripRecalledMemory) are skipped.
  4. Session-key mismatch: /reset may rotate the session key encoding (openclaw:telegram:::agent%3Amain%3Amain vs a new one), so the Writer's per-session state lookup misses and onMessageReceived / onMessageSent no-op.
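To make failure mode 2 concrete, here is a toy model of a watermark that survives a reset. It is a sketch under stated assumptions, not the real ChatTurnWriter: the class, its fields, and the assumption that pairIndex restarts at 0 once the session history is cleared are all hypothetical.

```typescript
// Hypothetical toy writer; only models the watermark check.
class ToyTurnWriter {
  private watermark = -1;
  private readonly clearOnReset: boolean;
  readonly persisted: string[] = [];

  constructor(clearOnReset: boolean) {
    this.clearOnReset = clearOnReset;
  }

  // Returns false when the turn is skipped as "already persisted".
  persist(pairIndex: number, text: string): boolean {
    if (pairIndex <= this.watermark) return false;
    this.persisted.push(text);
    this.watermark = pairIndex;
    return true;
  }

  // Models the reset signal; a writer that ignores it keeps a stale watermark.
  onReset(): void {
    if (this.clearOnReset) this.watermark = -1;
  }
}

// Five turn pairs, a reset, then five more pairs whose pairIndex is
// assumed to restart at 0. Returns how many turns were persisted.
function runScenario(clearOnReset: boolean): number {
  const writer = new ToyTurnWriter(clearOnReset);
  for (let i = 0; i < 5; i++) writer.persist(i, `pre ${i}`);
  writer.onReset();
  for (let i = 0; i < 5; i++) writer.persist(i, `post ${i}`);
  return writer.persisted.length;
}
```

In this model runScenario(false) persists only the 5 pre-reset turns, while runScenario(true) persists all 10. If the real pairIndex keeps increasing across resets, this particular failure mode would not apply and one of the other three would be the more likely cause.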

Investigation paths

  • Check whether OpenClaw's session-reset event is observed by ChatTurnWriter. It has onBeforeReset for the typed before_reset hook, but Telegram's W4b path uses internal hooks, so the reset may not propagate.
  • Add log lines at every dedup decision point in persistOne so we can see why a turn was skipped (FIFO empty, watermark ≥ pairIndex, turnId already seen, or none-of-the-above).
  • Reproduce in a unit test: simulate internal:message:received + internal:message:sent events with a before_reset interleaved, assert all turn pairs persist.
  • Verify the daemon-side assertion graph URI doesn't change scope across /reset (rule out write-side scope drift).
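The dedup logging suggested above could take roughly the following shape: a pure decision function that returns the reason for a skip, plus a formatter for the log line. This is a sketch only. The state fields, the order of the checks, and the log prefix are assumptions made for illustration and may not match what persistOne actually does.

```typescript
// Hypothetical skip reasons, one per dedup decision point named above.
type SkipReason = "fifo-empty" | "watermark" | "turn-id-seen";
type Decision = { persist: true } | { persist: false; reason: SkipReason };

// Assumed shape of the per-session dedup state.
interface DedupState {
  fifoLength: number;
  watermark: number;
  seenTurnIds: Set<string>;
}

// Evaluate the checks in a fixed (assumed) order and report the first hit.
function decide(state: DedupState, pairIndex: number, turnId: string): Decision {
  if (state.fifoLength === 0) return { persist: false, reason: "fifo-empty" };
  if (state.watermark >= pairIndex) return { persist: false, reason: "watermark" };
  if (state.seenTurnIds.has(turnId)) return { persist: false, reason: "turn-id-seen" };
  return { persist: true };
}

// Illustrative log format; deliberately not the existing gateway log line.
function formatDecision(sessionId: string, turnId: string, d: Decision): string {
  const outcome = d.persist ? "persist" : `skip reason=${d.reason}`;
  return `[dedup-sketch] ${outcome} (sessionId=${sessionId}, turnId=${turnId})`;
}
```

Keeping the decision separate from the write makes it straightforward to assert on skip reasons in the unit test proposed above, and a post-/reset run would then show directly which of the four hypotheses is responsible.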

Out of scope

  • T70 / T72 / T73 — resolveDkgHome resolver. Confirmed unrelated; reads correctly find what's persisted.
  • T74 — caller-tag log. Made this bug visible (the caller=tool, limit=20, raw_hits=5 log entry showed the tool was firing correctly, isolating the issue to write-side).

Severity

Medium-high. Functionally: users who issue /reset mid-conversation lose recall of all subsequent turns. Recall queries return only pre-/reset content; the agent appears to forget anything said after a reset. In practice this looks like a memory bug from the user's perspective ("why doesn't the agent remember the conversation we just had?").

Context

Surfaced during PR #264 live-test on 2026-04-29. Filed as post-merge follow-up.
