Skip to content

fix(chat): persist aborted partial history#418

Merged
penso merged 6 commits intomainfrom
issue-186
Mar 13, 2026
Merged

fix(chat): persist aborted partial history#418
penso merged 6 commits intomainfrom
issue-186

Conversation

@penso
Copy link
Copy Markdown
Collaborator

@penso penso commented Mar 11, 2026

Summary

  • persist partial streamed assistant output on chat.abort so interrupted runs keep the visible assistant text and reasoning in session history
  • persist assistant tool-call frames alongside tool results so tool-heavy interrupted runs round-trip back into prompt history
  • cache aborted partial assistant messages in the web UI session-history store and add Rust plus Playwright regressions for abort recovery

Fixes #186.

Validation

Completed

  • cargo test -p moltis-chat abort_persists_partial_stream_and_followup_reuses_it -- --nocapture
  • cargo test -p moltis-chat send_sync_persists_tool_call_assistant_frames_for_history_replay -- --nocapture
  • biome check --write crates/web/src/assets/js/websocket.js crates/web/ui/e2e/specs/chat-abort.spec.js
  • cd crates/web/ui && ./build.sh
  • cargo +nightly-2025-11-30 fmt --all -- --check
  • cargo +nightly-2025-11-30 build --bin moltis
  • cd crates/web/ui && MOLTIS_BINARY=/Users/penso/.superset/worktrees/moltis/issue-186/target/debug/moltis npx playwright test e2e/specs/chat-abort.spec.js
  • cargo +nightly-2025-11-30 clippy -p moltis-chat --all-features --tests -- -D warnings

Remaining

  • cargo +nightly-2025-11-30 clippy -Z unstable-options --workspace --all-features --all-targets --timings -- -D warnings
    blocked locally because llama-cpp-sys-2 configures GGML_CUDA=ON and this host does not have a CUDA toolkit available

Manual QA

  1. Start a long-running chat response that emits text and at least one tool call.
  2. Stop it before completion.
  3. Send continue.
  4. Confirm the aborted partial answer is still visible and the follow-up turn includes that partial context.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Mar 11, 2026

Greptile Summary

This PR implements abort-recovery persistence for chat sessions: streamed assistant text and reasoning are buffered in a new ActiveAssistantDraft per-session map so that when chat.abort fires, the partially visible output is saved to the session store and broadcast back to the client. It also adds an Assistant { tool_calls } frame before every ToolResult entry so that values_to_chat_messages can correctly reconstruct prompt history for tool-heavy turns. The web UI's handleChatAborted handler is extended to cache the partial message and re-render it in the DOM, and bumpSessionCount for tool-call-end events is updated from 1 → 2 to account for the new frame.

Key points:

  • The abort flow correctly resolves the session key before calling abort_run_handle, ensuring active_runs_by_session is still populated at lookup time.
  • Cancelled Tokio tasks are dropped at their current .await point, so the normal-completion cleanup code does not race with persist_partial_assistant_on_abort — the design is sound.
  • In both run_with_tools and run_explicit_shell_command, if appending the assistant tool-call frame fails, the code continues and appends the ToolResult anyway, creating an orphaned record that will break history replay on the next turn.
  • handleChatAborted now has enough branches to trigger a cognitive-complexity lint, which is suppressed rather than addressed through helper extraction.
  • The hasNonWhitespaceContent(partial?.audio || "") branch in hasVisiblePartial is dead code — the server never includes audio in partial messages.

Confidence Score: 4/5

  • Safe to merge with one logic gap addressed: orphaned tool results when the assistant frame persist fails.
  • The core abort-persist mechanism is architecturally correct and well-tested by both Rust integration tests and a new Playwright regression. The one functional concern — persisting a ToolResult even when the preceding assistant frame failed to write — is an edge case (storage failure) but would silently corrupt history replay. The JS changes are isolated to the abort handler and tool-call-end counter. Style issues (complexity suppression, dead audio check) are minor.
  • Pay close attention to the tool-call frame + result persistence paths in crates/chat/src/lib.rs (both run_with_tools and run_explicit_shell_command).

Important Files Changed

Filename Overview
crates/chat/src/lib.rs Core implementation of abort-persist feature: adds ActiveAssistantDraft buffer, persist_partial_assistant_on_abort, tool call frame persistence, and two new integration tests. Logic is sound but has a gap: orphaned tool results can be written if the assistant-frame append fails.
crates/web/src/assets/js/websocket.js handleChatAborted expanded with partial-message recovery logic; bumpSessionCount changed from 1→2 for tool-call ends. Function complexity lint suppressed; audio check in hasVisiblePartial is dead code; function mixes several concerns that could be split.
crates/web/ui/e2e/specs/chat-abort.spec.js New Playwright test validates that an aborted broadcast keeps the partial assistant output in the UI and in the session-history cache. Test coverage is solid for the happy path.
crates/chat/Cargo.toml async-stream added as a dev-dependency only, keeping it out of the production binary. No issues.

Sequence Diagram

sequenceDiagram
    participant C as Client JS
    participant S as LiveChatService
    participant DB as SessionStore

    C->>S: chat.send { text }
    S->>S: Insert ActiveAssistantDraft into active_partial_assistant
    S->>DB: append user_message
    loop Streaming
        S-->>C: chat delta events
        S->>S: draft.append_text / set_reasoning
    end
    S->>DB: append assistant_tool_call_frame
    S->>DB: append tool_result

    C->>S: chat.abort { sessionKey }
    S->>S: resolve_session_key_for_run
    S->>S: abort_run_handle — AbortHandle.abort, task dropped at current await
    S->>S: persist_partial_assistant_on_abort
    alt Draft has visible content
        S->>DB: append partial_assistant_message
        S->>DB: count session
        S-->>C: broadcast state=aborted with partialMessage and messageIndex
    else No visible content
        S-->>C: broadcast state=aborted no partial
    end
    S->>S: Remove active_thinking_text, active_tool_calls, active_reply_medium

    C->>C: handleChatAborted
    C->>C: cacheSessionHistoryMessage
    C->>C: bumpSessionCount if new messageIndex
    C->>C: Update DOM with partial content
Loading

Comments Outside Diff (1)

  1. crates/chat/src/lib.rs, line 5459-5467 (link)

    Orphaned tool result if frame persist fails

    If store_clone.append(&sk_persist, &assistant_tool_call_msg.to_value()) fails (the warn-and-continue path), the code immediately tries to persist tool_result_msg anyway. A ToolResult stored in the history without a preceding Assistant { tool_calls: [...] } frame is an orphan — values_to_chat_messages will skip or mis-classify it, meaning the tool execution is silently dropped from the prompt history on the next turn.

    The same pattern appears in the error branch of run_explicit_shell_command (~line 5346). Since the whole point of this PR is to make tool-call/result pairs round-trip correctly, it would be safer to skip the tool-result append when the frame append fails:

    let frame_ok = store_clone
        .append(&sk_persist, &assistant_tool_call_msg.to_value())
        .await
        .is_ok();
    if !frame_ok {
        warn!("failed to persist assistant tool call frame; skipping tool result to avoid orphan");
    } else if let Err(e) = store_clone
        .append(&sk_persist, &tool_result_msg.to_value())
        .await
    {
        warn!("failed to persist tool result: {e}");
    }

Last reviewed commit: 1846476

Comment thread crates/web/src/assets/js/websocket.js Outdated
Comment thread crates/web/src/assets/js/websocket.js Outdated
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1846476007

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread crates/chat/src/lib.rs
Address PR #418 review feedback for abort recovery and tool history persistence.
@codspeed-hq
Copy link
Copy Markdown
Contributor

codspeed-hq bot commented Mar 13, 2026

Merging this PR will not alter performance

✅ 39 untouched benchmarks
⏩ 5 skipped benchmarks1


Comparing issue-186 (4bb869a) with main (aff4b62)

Open in CodSpeed

Footnotes

  1. 5 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@codecov
Copy link
Copy Markdown

codecov bot commented Mar 13, 2026

Codecov Report

❌ Patch coverage is 88.76755% with 72 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
crates/chat/src/lib.rs 88.76% 72 Missing ⚠️

📢 Thoughts on this report? Let us know!

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: bbea499657

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread crates/chat/src/lib.rs
Comment thread crates/chat/src/lib.rs
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4602943de1

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread crates/chat/src/lib.rs Outdated
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 01bb8a0a2e

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread crates/chat/src/lib.rs
@penso penso merged commit 9f732b6 into main Mar 13, 2026
51 of 54 checks passed
@penso penso deleted the issue-186 branch March 13, 2026 23:48
penso added a commit that referenced this pull request Mar 15, 2026
The switchInProgress guard added in #418 drops delta and tool_call_end
events during session initialization.  This test was not updated to call
waitForChatSessionReady(), so the delta event was silently dropped and
the assistant message never rendered.

Entire-Checkpoint: 093cf8daea5f
penso added a commit that referenced this pull request Mar 23, 2026
* feat(skills): support safe agent-written sidecar files

* test(sandbox): make off-mode test deterministic

* fix(tools): address PR #413 review feedback for skill sidecar files

- Anchor confinement check to canonical skills root, not the skill dir
  itself, preventing symlinked skill roots from bypassing path checks
- Validate no symlinks in ancestry before create_dir_all to prevent
  out-of-tree directory creation via symlinked intermediates
- Add rollback of previously written files when a batch fails mid-way
- Surface audit log I/O failures via tracing::warn instead of silently
  swallowing them
- Add minItems/maxItems to the JSON schema so the LLM sees file count
  limits upfront
- Add tests for symlinked skill root rejection and batch rollback

Entire-Checkpoint: 916a7838ac8e

* test(web): wait for session ready before injecting tool events in e2e

The switchInProgress guard added in #418 drops delta and tool_call_end
events during session initialization.  This test was not updated to call
waitForChatSessionReady(), so the delta event was silently dropped and
the assistant message never rendered.

Entire-Checkpoint: 093cf8daea5f
penso added a commit that referenced this pull request Mar 23, 2026
* fix(chat): persist aborted partial history

* test(web): wait for chat session state before websocket injections

* fix(chat): close abort history review gaps

* test(web): make voice fallback rpc failure deterministic

* fix(chat): wait for pending tool history on abort
penso added a commit that referenced this pull request Mar 23, 2026
* feat(skills): support safe agent-written sidecar files

* test(sandbox): make off-mode test deterministic

* fix(tools): address PR #413 review feedback for skill sidecar files

- Anchor confinement check to canonical skills root, not the skill dir
  itself, preventing symlinked skill roots from bypassing path checks
- Validate no symlinks in ancestry before create_dir_all to prevent
  out-of-tree directory creation via symlinked intermediates
- Add rollback of previously written files when a batch fails mid-way
- Surface audit log I/O failures via tracing::warn instead of silently
  swallowing them
- Add minItems/maxItems to the JSON schema so the LLM sees file count
  limits upfront
- Add tests for symlinked skill root rejection and batch rollback

Entire-Checkpoint: 916a7838ac8e

* test(web): wait for session ready before injecting tool events in e2e

The switchInProgress guard added in #418 drops delta and tool_call_end
events during session initialization.  This test was not updated to call
waitForChatSessionReady(), so the delta event was silently dropped and
the assistant message never rendered.

Entire-Checkpoint: 093cf8daea5f
jmikedupont2 pushed a commit to meta-introspector/moltis that referenced this pull request Mar 23, 2026
* fix(chat): persist aborted partial history

* test(web): wait for chat session state before websocket injections

* fix(chat): close abort history review gaps

* test(web): make voice fallback rpc failure deterministic

* fix(chat): wait for pending tool history on abort
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: When stopping the agent before it's finished, none of what it did/said is remembered

1 participant