
fix(core): add in-memory cache to ChatRecordingService to prevent OOM #21502

Merged
SandyTao520 merged 5 commits into main from st/fix/chat-recording-oom-memory-cache on Mar 7, 2026

Conversation

@SandyTao520
Contributor

Summary

Fix OOM crash in long-running sessions (~53 min) caused by ChatRecordingService performing expensive disk I/O and double JSON.stringify on every streaming chunk via recordMessageTokens.

Details

The updateConversation hot path executed on every call (including per-chunk during streaming):

  1. fs.readFileSync — full conversation JSON from disk
  2. JSON.parse — parse entire string into objects
  3. JSON.stringify #1 — serialize for comparison check
  4. JSON.stringify #2 — serialize again for writing
  5. fs.writeFileSync — write to disk

In long sessions with large tool outputs (file reads, grep results, command outputs), the conversation object grows to hundreds of MBs. The repeated full serialization exhausts the 4GB V8 heap.
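
For illustration, a minimal TypeScript sketch of the pre-fix pattern described above (hypothetical names and shapes, not the actual gemini-cli source):

  // Sketch of the old hot path: every update pays a full disk read,
  // a full parse, and two complete serializations of the conversation.
  import * as fs from 'node:fs';

  function updateConversation(path: string, mutate: (c: object) => void): void {
    const raw = fs.readFileSync(path, 'utf8');            // 1. read full JSON from disk
    const conversation = JSON.parse(raw);                 // 2. parse the entire string
    mutate(conversation);
    const check = JSON.stringify(conversation, null, 2);  // 3. serialize for comparison
    if (check === raw) return;
    const out = JSON.stringify(conversation, null, 2);    // 4. serialize again for writing
    fs.writeFileSync(path, out);                          // 5. write to disk
  }

On a conversation of hundreds of MBs, steps 2 through 4 each allocate strings and objects of that full size on every streaming chunk, which is what drives the heap toward the 4GB limit.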

Changes

  • In-memory cache (cachedConversation): Eliminates redundant fs.readFileSync + JSON.parse on every update. After the first read, all subsequent operations use the cached object (sketched after this list).
  • Single serialize: Removed the comparison JSON.stringify in writeConversation. Previously it serialized once for comparison, then again for writing. Now only serializes once.
  • Skip disk I/O in recordMessageTokens queuing path: When the last message already has tokens (the common case during streaming), tokens are queued in memory with zero serialization and zero disk I/O.
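
A rough sketch of the cached shape those bullets describe (hypothetical field and method names; see the diff for the real implementation):

  // Sketch: cache the parsed conversation so only the first read touches
  // disk, and serialize exactly once per write.
  import * as fs from 'node:fs';

  interface ConversationRecord {
    lastUpdated: string;
    messages: unknown[];
  }

  class ChatRecordingSketch {
    private cachedConversation: ConversationRecord | null = null;

    constructor(private readonly path: string) {}

    private readConversation(): ConversationRecord {
      if (this.cachedConversation) {
        return this.cachedConversation; // cache hit: no disk I/O, no parse
      }
      const parsed = JSON.parse(fs.readFileSync(this.path, 'utf8'));
      this.cachedConversation = parsed;
      return parsed;
    }

    private writeConversation(conversation: ConversationRecord): void {
      this.cachedConversation = conversation;
      conversation.lastUpdated = new Date().toISOString();
      // Single JSON.stringify; the old comparison serialization is gone.
      fs.writeFileSync(this.path, JSON.stringify(conversation, null, 2));
    }
  }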

Related Issues

Related to #18007

How to Validate

  1. Run the updated test suite: npm test -w @google/gemini-cli-core -- src/services/chatRecordingService.test.ts
  2. All 28 tests pass, including 4 new tests (one is sketched after this list) that verify:
    • No disk writes when queuing tokens (no gemini message)
    • No disk writes when queuing tokens (message already has tokens)
    • In-memory cache eliminates disk reads on subsequent operations
  3. Start a long session with heavy tool use and verify no OOM crash
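
A hedged vitest-style sketch of the kind of assertion the new tests make (the helper and fixture names here are hypothetical; the real tests live in chatRecordingService.test.ts):

  // Sketch: the token-queuing path must perform zero disk writes.
  import * as fs from 'node:fs';
  import { expect, it, vi } from 'vitest';

  it('queues tokens in memory without touching disk', () => {
    const writeSpy = vi.spyOn(fs, 'writeFileSync');
    const service = makeServiceWithTokenizedLastMessage(); // hypothetical helper
    service.recordMessageTokens(usageMetadata);            // hypothetical fixture
    expect(writeSpy).not.toHaveBeenCalled();               // no serialization, no I/O
  });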

Pre-Merge Checklist

  • Updated relevant documentation and README (if needed)
  • Added/updated tests (if needed)
  • Noted breaking changes (if any)
  • Validated on required platforms/methods:
    • macOS
      • npm run
      • npx
      • Docker
      • Podman
      • Seatbelt
    • Windows
      • npm run
      • npx
      • Docker
    • Linux
      • npm run
      • npx
      • Docker

The updateConversation hot path performed disk read + JSON.parse +
double JSON.stringify on every call, including during streaming chunk
processing via recordMessageTokens. In long sessions (~53 min) with
large conversation histories, this caused heap exhaustion at ~4GB.

Changes:
- Cache ConversationRecord in memory, eliminating redundant disk reads
- Remove comparison JSON.stringify in writeConversation (single serialize)
- Skip all disk I/O in recordMessageTokens when just queuing tokens
@SandyTao520 SandyTao520 requested a review from a team as a code owner March 7, 2026 00:36
@gemini-cli
Contributor

gemini-cli bot commented Mar 7, 2026

Hi @SandyTao520, thank you so much for your contribution to Gemini CLI! We really appreciate the time and effort you've put into this.

We're making some updates to our contribution process to improve how we track and review changes. Please take a moment to review our recent discussion post: Improving Our Contribution Process & Introducing New Guidelines.

Key Update: Starting January 26, 2026, the Gemini CLI project will require all pull requests to be associated with an existing issue. Any pull requests not linked to an issue by that date will be automatically closed.

Thank you for your understanding and for being a part of our community!

@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses an Out-Of-Memory (OOM) crash occurring in long-running chat sessions due to inefficient disk I/O and excessive JSON serialization within the ChatRecordingService. By implementing an in-memory conversation cache, optimizing JSON serialization to a single step, and intelligently skipping disk writes for token updates, the changes drastically reduce resource consumption and improve performance, particularly for large conversation objects.

Highlights

  • In-memory Caching: Introduced a cachedConversation to eliminate redundant fs.readFileSync and JSON.parse operations, significantly reducing disk I/O on subsequent conversation updates.
  • Optimized Serialization: Refactored writeConversation to perform JSON.stringify only once, removing a duplicate serialization step that was previously used for comparison.
  • Reduced Disk Writes for Tokens: Modified recordMessageTokens to queue tokens in memory without triggering disk writes when the last message already contains token information, further reducing I/O.
Changelog
  • packages/core/src/services/chatRecordingService.test.ts
    • Added tests to verify that recordMessageTokens does not write to disk when no Gemini message is present or when the last message already has tokens.
    • Added a test to confirm that subsequent operations use the in-memory cache, avoiding redundant disk reads.
  • packages/core/src/services/chatRecordingService.ts
    • Introduced cachedConversation property to store the conversation object in memory.
    • Modified initialize and error handling to clear cachedConversation when necessary.
    • Refactored recordMessageTokens to directly read and write the conversation, and to only queue tokens in memory if the last message already has token data, avoiding disk writes.
    • Updated readConversation to return the cachedConversation if available, otherwise read from disk and populate the cache.
    • Revised writeConversation to always update the cachedConversation and remove the conditional disk write based on cachedLastConvData comparison, ensuring a single serialization.
Activity
  • No human activity (comments, reviews, etc.) has occurred on this pull request yet.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request effectively addresses the Out-of-Memory issue in ChatRecordingService by introducing an in-memory cache for the conversation record. The changes correctly eliminate redundant disk reads and expensive JSON serializations, especially during token streaming, which was the primary cause of the problem. The new tests you've added provide good coverage for the caching logic and performance improvements.

I've found one critical edge case where a corrupted session file could lead to a runtime crash. My review includes a specific comment with a suggestion to handle this gracefully.

@github-actions

github-actions bot commented Mar 7, 2026

Size Change: +1.1 kB (0%)

Total Size: 26 MB

Filename Size Change
./bundle/gemini.js 25.5 MB +1.1 kB (0%)
Unchanged
Filename Size
./bundle/node_modules/@google/gemini-cli-devtools/dist/client/main.js 221 kB
./bundle/node_modules/@google/gemini-cli-devtools/dist/src/_client-assets.js 227 kB
./bundle/node_modules/@google/gemini-cli-devtools/dist/src/index.js 11.5 kB
./bundle/node_modules/@google/gemini-cli-devtools/dist/src/types.js 132 B
./bundle/sandbox-macos-permissive-open.sb 890 B
./bundle/sandbox-macos-permissive-proxied.sb 1.31 kB
./bundle/sandbox-macos-restrictive-open.sb 3.36 kB
./bundle/sandbox-macos-restrictive-proxied.sb 3.56 kB
./bundle/sandbox-macos-strict-open.sb 4.82 kB
./bundle/sandbox-macos-strict-proxied.sb 5.02 kB

compressed-size-action

@gemini-cli gemini-cli bot added the priority/p1 Important and should be addressed in the near term. label Mar 7, 2026

In packages/core/src/services/chatRecordingService.ts:

  this.cachedConversation = conversation;
  conversation.lastUpdated = new Date().toISOString();
  const newContent = JSON.stringify(conversation, null, 2);
  this.cachedLastConvData = newContent;

Contributor

Wouldn't we still want to compare whether they are equal, so we could skip writing if they are the same?

Contributor Author

writeConversation is now only called by code paths that have already mutated the conversation object (e.g. recordMessage, recordToolCalls, recordMessageTokens when it actually sets tokens on the last message). The queuing-only path in recordMessageTokens no longer calls writeConversation at all. So the comparison would always find a diff and never skip.

Add null check after JSON.parse to gracefully handle corrupted session
files (e.g. containing "null"), falling back to an empty conversation
instead of crashing via non-null assertion.
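
A minimal sketch of the guard that commit describes (assumed shape; note that JSON.parse('null') yields null without throwing):

  // Sketch: fall back to an empty record when the session file parses to null.
  const parsed = JSON.parse(fs.readFileSync(path, 'utf8')) as ConversationRecord | null;
  const conversation = parsed ?? newEmptyConversation(); // newEmptyConversation is hypothetical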

Contributor

@sehoon38 sehoon38 left a comment

LGTM

Restore the comparison check to skip disk writes when the conversation
was not actually mutated (e.g. updateMessagesFromHistory with no
matching tool calls). Compare the serialized content against the cached
string before updating lastUpdated to prevent false diffs from
timestamp changes.
- Add doc comment on readConversation noting the returned object is
  the live cache reference and mutations affect future reads.
- Move cachedConversation assignment after the comparison check in
  writeConversation to avoid unnecessary reference swap on no-op path.
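
Taken together with the earlier commits, a hedged sketch of the final write path (hypothetical names; a method on the ChatRecordingSketch class sketched earlier, with a cachedLastConvData string field added):

  // Sketch: serialize before bumping lastUpdated. Because the cached object
  // is mutated in place, an unmodified conversation reproduces the string
  // from the previous write exactly, so the comparison can skip the write
  // without the timestamp itself creating a false diff.
  private writeConversation(conversation: ConversationRecord): void {
    const candidate = JSON.stringify(conversation, null, 2);
    if (candidate === this.cachedLastConvData) {
      return; // no-op path: skip the write and the cache reference swap
    }
    this.cachedConversation = conversation;
    conversation.lastUpdated = new Date().toISOString();
    this.cachedLastConvData = JSON.stringify(conversation, null, 2);
    fs.writeFileSync(this.path, this.cachedLastConvData);
  }
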
@SandyTao520 SandyTao520 enabled auto-merge March 7, 2026 03:34
@SandyTao520 SandyTao520 added this pull request to the merge queue Mar 7, 2026
Merged via the queue into main with commit 9455ecd Mar 7, 2026
27 checks passed
@SandyTao520 SandyTao520 deleted the st/fix/chat-recording-oom-memory-cache branch March 7, 2026 03:57
