perf(core): fix OOM crash in long-running sessions #19608

Merged
jacob314 merged 5 commits into google-gemini:main from WizardsForgeIo:perf/fix-oom-long-running-sessions
Mar 5, 2026

Conversation

@WizardsForgeGames

Summary

Fixes three compounding memory issues in packages/core that cause V8 heap exhaustion (FATAL ERROR: Reached heap limit) in long-running agentic sessions.

Fixes #19607

Changes

1. Replace structuredClone() with shallow array copy in getHistory()

GeminiChat.getHistory() was deep-cloning the entire conversation history on every call via structuredClone(). This is called on the hot path of every turn (building the API request), plus by loop detection, error reporting, compression, and several other callers.

In a long session with large tool outputs, this means deep-cloning 50-100MB+ of data multiple times per turn, creating massive transient allocations that compound into OOM.

Replaced with [...history] (shallow array copy). Audited all 8 production callers:

  • 6 are purely read-only — safe with shared Content references
  • ToolOutputMaskingService.mask() already creates its own shallow copies internally — safe
  • nextSpeakerChecker had a dead mutation (see below) — fixed
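A minimal sketch of the difference between the two getters, assuming simplified `Content` and `Part` shapes. `ChatSketch` and both method names are hypothetical stand-ins for illustration, not the real `GeminiChat` API:

```typescript
// Simplified stand-ins for the real Content/Part types (assumed shapes).
interface Part {
  text: string;
}
interface Content {
  role: 'user' | 'model';
  parts: Part[];
}

// Hypothetical class used only to contrast the two copy strategies.
class ChatSketch {
  private history: Content[] = [];

  addHistory(content: Content): void {
    this.history.push(content);
  }

  // Old behavior: copies every Content and every Part on each call.
  getHistoryDeep(): Content[] {
    return structuredClone(this.history);
  }

  // New behavior: copies only the outer array; the elements are shared.
  getHistoryShallow(): Content[] {
    return [...this.history];
  }
}

const chat = new ChatSketch();
chat.addHistory({ role: 'user', parts: [{ text: 'hello' }] });

const deep = chat.getHistoryDeep();
const shallow = chat.getHistoryShallow();
const shallowAgain = chat.getHistoryShallow();

// Shallow copies point at the same Content objects; deep copies do not.
const sharesElements = shallow[0] === shallowAgain[0];
const deepIsDistinct = deep[0] !== shallow[0];

// Changing the returned array itself (not its elements) leaves the chat intact.
shallow.pop();
const internalLength = chat.getHistoryShallow().length;
```

The shallow copy still protects the internal array from `push`/`pop` by callers, which is why read-only callers are safe. It does not protect the `Content` objects themselves, which is why each caller had to be audited.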

2. Remove dead Content mutation in nextSpeakerChecker.ts

Line 90 did lastComprehensiveMessage.parts.push({ text: '' }) on the getHistory() result, then immediately returned. With the old structuredClone, this pushed to a clone that was instantly discarded — a complete no-op. Removed the dead push so it can't accidentally mutate the real history now that getHistory() returns shared references.
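To illustrate why the push was harmless before and risky after, here is a small sketch assuming simplified `Content` and `Part` shapes. The variable names are hypothetical and the code is not taken from `nextSpeakerChecker.ts`:

```typescript
// Simplified stand-ins for the real Content/Part types (assumed shapes).
interface Part {
  text: string;
}
interface Content {
  role: 'user' | 'model';
  parts: Part[];
}

const realHistory: Content[] = [{ role: 'model', parts: [] }];

// Before: the getter returned a deep clone, so the push hit a throwaway copy.
const cloned = structuredClone(realHistory);
const lastCloned = cloned[cloned.length - 1];
if (lastCloned) {
  lastCloned.parts.push({ text: '' });
}
const partsAfterClonePush = realHistory[0]?.parts.length;

// After: the getter returns shared references, so the same push would
// silently modify the real history.
const shared = [...realHistory];
const lastShared = shared[shared.length - 1];
if (lastShared) {
  lastShared.parts.push({ text: '' });
}
const partsAfterSharedPush = realHistory[0]?.parts.length;
```

In this sketch `partsAfterClonePush` stays at 0 while `partsAfterSharedPush` becomes 1, which is the leak that removing the dead push avoids.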

3. Replace O(n) shift() with O(1) head pointer in event backlog

CoreEventEmitter._emitOrQueue() used Array.shift() to evict old entries when the 10K backlog was full. shift() on a 10K array is O(n) — copies ~10K elements on every eviction. Replaced with a head pointer that advances O(1), with amortized compaction via slice() when dead entries exceed half capacity. Same FIFO semantics, same 10K cap.
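The head-pointer idea can be sketched as a small bounded FIFO. This is an illustration under assumptions, not the code from `events.ts`: the class name `BoundedBacklog`, its methods, and the exact compaction threshold are hypothetical.

```typescript
// Hypothetical bounded FIFO that evicts via a head index instead of shift().
class BoundedBacklog<T> {
  private items: (T | undefined)[] = [];
  private head = 0; // index of the oldest live entry
  private readonly maxSize: number;

  constructor(maxSize: number) {
    this.maxSize = maxSize;
  }

  // Number of live entries (slots before `head` are dead).
  get size(): number {
    return this.items.length - this.head;
  }

  push(item: T): void {
    if (this.size >= this.maxSize) {
      // Evict the oldest entry in O(1): drop the reference, advance the head.
      this.items[this.head] = undefined;
      this.head++;
      // Amortized compaction: reclaim dead slots once they exceed half capacity.
      if (this.head > this.maxSize / 2) {
        this.items = this.items.slice(this.head);
        this.head = 0;
      }
    }
    this.items.push(item);
  }

  // Return live entries in FIFO order and reset the backlog.
  drain(): T[] {
    const live = this.items.slice(this.head) as T[];
    this.items = [];
    this.head = 0;
    return live;
  }
}

const backlog = new BoundedBacklog<number>(4);
for (let i = 1; i <= 7; i++) {
  backlog.push(i);
}
const sizeBeforeDrain = backlog.size;
const drained = backlog.drain(); // [4, 5, 6, 7]
```

Each eviction touches one slot, and the `slice()` cost is paid only once per roughly `maxSize / 2` evictions, so the cost per push is O(1) amortized. The trade-off is that the underlying array can temporarily hold up to about 1.5x the cap in slots.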

4. Cache Turn.getResponseText() result

getResponseText() recomputed .map().filter().join() over all streaming chunks on every call, but is invoked 2x per turn with identical input. Added a cache field so the second call is free.
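A sketch of the caching pattern, assuming chunks are simplified to objects with an optional `text` field. `TurnSketch`, `addChunk`, `computeCount`, the empty-string join separator, and the invalidate-on-new-chunk step are assumptions made for illustration; only the `cachedResponseText` field name comes from this PR.

```typescript
// Simplified chunk shape (assumption; real chunks are full response objects).
interface Chunk {
  text?: string;
}

// Hypothetical class showing a memoized getResponseText().
class TurnSketch {
  private chunks: Chunk[] = [];
  private cachedResponseText: string | undefined = undefined;
  computeCount = 0; // instrumentation for the example only

  addChunk(chunk: Chunk): void {
    this.chunks.push(chunk);
    // Assumption: new input invalidates the cache so results stay correct.
    this.cachedResponseText = undefined;
  }

  getResponseText(): string {
    if (this.cachedResponseText === undefined) {
      this.computeCount++;
      this.cachedResponseText = this.chunks
        .map((chunk) => chunk.text)
        .filter((text): text is string => text !== undefined)
        .join('');
    }
    return this.cachedResponseText;
  }
}

const turn = new TurnSketch();
turn.addChunk({ text: 'a' });
turn.addChunk({});
turn.addChunk({ text: 'b' });

const firstCall = turn.getResponseText();
const secondCall = turn.getResponseText(); // served from the cache
const computesAfterTwoCalls = turn.computeCount;
```

The second call with identical input skips the `.map().filter().join()` pass entirely, so `computesAfterTwoCalls` is 1 in this sketch.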

Testing

  • All 208 tests across the 7 affected test files pass (geminiChat, turn, events, nextSpeakerChecker, client, toolOutputMaskingService, chatCompressionService)
  • TypeScript type-check passes (tsc --noEmit)
  • ESLint passes with 0 warnings on all changed files
  • Prettier formatting verified
  • npm run preflight lint step fails on win32/x64 due to platform-unsupported binary in scripts/lint.js (pre-existing, unrelated)

@WizardsForgeGames WizardsForgeGames requested a review from a team as a code owner February 20, 2026 04:18
@google-cla
google-cla bot commented Feb 20, 2026

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@gemini-code-assist

Summary of Changes

Hello @WizardsForgeGames, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses several compounding memory and performance issues within the packages/core module that were leading to Out-Of-Memory (OOM) crashes in long-running agentic sessions. The changes focus on reducing unnecessary deep copies of large data structures, eliminating dead code that could become problematic with memory optimizations, and improving the efficiency of array operations in critical paths. The overall impact is a more stable and performant application, particularly during extended usage.

Highlights

  • Memory Optimization in GeminiChat.getHistory(): Replaced structuredClone() with a shallow array copy ([...history]) in GeminiChat.getHistory() to prevent deep-cloning the entire conversation history on every call, significantly reducing transient memory allocations and preventing OOM crashes.
  • Removed Dead Mutation in nextSpeakerChecker.ts: Eliminated a no-op mutation (lastComprehensiveMessage.parts.push({ text: '' })) in nextSpeakerChecker.ts that previously operated on a discarded deep copy, ensuring correctness with the new shallow history copy.
  • Optimized Event Backlog Eviction: Replaced the O(n) Array.shift() operation with an O(1) head pointer mechanism in CoreEventEmitter._emitOrQueue() for evicting old entries from the event backlog, improving performance for long-running event queues.
  • Cached Turn.getResponseText() Result: Introduced caching for the getResponseText() method in the Turn class to avoid redundant computations, as this method was called multiple times per turn with identical input.

Changelog

  • packages/core/src/core/geminiChat.ts
    • Replaced structuredClone(history) with [...history] in getHistory() to perform a shallow array copy instead of a deep clone, reducing memory overhead.
  • packages/core/src/core/turn.ts
    • Added a private cachedResponseText field to store the result of getResponseText().
    • Modified getResponseText() to cache its computed value, returning the cached result on subsequent calls within the same turn.
  • packages/core/src/utils/events.ts
    • Added a private _backlogHead property to track the head of the event backlog.
    • Refactored _emitOrQueue() to use the _backlogHead for O(1) eviction of old entries, avoiding Array.shift().
    • Implemented amortized compaction via slice() when dead entries in the backlog exceed half capacity.
    • Updated drainBacklogs() to iterate from the _backlogHead and reset the backlog and head pointer.
  • packages/core/src/utils/nextSpeakerChecker.ts
    • Removed the line lastComprehensiveMessage.parts.push({ text: '' }) as it was a dead mutation on a discarded deep copy and could cause unintended side effects with shared references.

Activity

  • All 208 tests across 7 affected test files (geminiChat, turn, events, nextSpeakerChecker, client, toolOutputMaskingService, chatCompressionService) passed.
  • TypeScript type-check passed (tsc --noEmit).
  • ESLint passed with 0 warnings on all changed files.
  • Prettier formatting was verified.
  • The npm run preflight lint step failed on win32/x64 due to a pre-existing, unrelated issue with a platform-unsupported binary in scripts/lint.js.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces several significant performance improvements to address out-of-memory issues in long-running sessions. The changes, including replacing structuredClone with a shallow copy, optimizing the event backlog with a head pointer, and removing a dead mutation, are well-thought-out and correctly implemented. Overall, this is an excellent set of optimizations.

Replace structuredClone() in getHistory() with a shallow array copy.
The deep clone ran on every turn's hot path, duplicating the entire
conversation history (50-100MB+ in long sessions) multiple times per
turn. Verified all 8 production callers — only one mutated Content
objects (nextSpeakerChecker), and that mutation was dead code (pushed
to a clone that was immediately discarded).

Replace Array.shift() in the event backlog with a head pointer and
amortized compaction. shift() on a 10K-element array is O(n); the
new approach is O(1) amortized with identical FIFO semantics.

Cache Turn.getResponseText() result to avoid redundant recomputation
across its two call sites per turn.

Fixes google-gemini#19607
@WizardsForgeGames WizardsForgeGames force-pushed the perf/fix-oom-long-running-sessions branch from 1002b9f to 65cd7df Compare February 20, 2026 04:24
@gemini-cli gemini-cli bot added the area/core label Feb 20, 2026
@gemini-cli gemini-cli bot added the priority/p1 and help wanted labels Mar 5, 2026

@jacob314 jacob314 left a comment

Looks great! Solid optimizations to reduce V8 heap pressure. Replaced the unknown[] cast with strict typing for the _eventBacklog array to follow strict development rules.

@jacob314

jacob314 commented Mar 5, 2026

I'm stripping out changes that I don't think were needed from this PR and landing as #21207.

@jacob314 jacob314 closed this Mar 5, 2026
@jacob314 jacob314 reopened this Mar 5, 2026
  if (this.listenerCount(event) === 0) {
-   if (this._eventBacklog.length >= CoreEventEmitter.MAX_BACKLOG_SIZE) {
-     this._eventBacklog.shift();
+   const backlogSize = this._eventBacklog.length - this._backlogHead;

This is fine, but not really needed, as there should not be an enormous backlog in typical cases.

@jacob314 jacob314 left a comment

lgtm

@jacob314 jacob314 added this pull request to the merge queue Mar 5, 2026
Merged via the queue into google-gemini:main with commit 3d4956a Mar 5, 2026
51 checks passed
@neeraj-par neeraj-par mentioned this pull request Mar 5, 2026
theerud pushed a commit to theerud/gemini-cli that referenced this pull request Mar 6, 2026
Co-authored-by: Jacob Richman <jacob314@gmail.com>
struckoff pushed a commit to struckoff/gemini-cli that referenced this pull request Mar 6, 2026
Co-authored-by: Jacob Richman <jacob314@gmail.com>
kunal-10-cloud pushed a commit to kunal-10-cloud/gemini-cli that referenced this pull request Mar 12, 2026
Co-authored-by: Jacob Richman <jacob314@gmail.com>
liamhelmer pushed a commit to badal-io/gemini-cli that referenced this pull request Mar 12, 2026
Co-authored-by: Jacob Richman <jacob314@gmail.com>
yashodipmore pushed a commit to yashodipmore/geemi-cli that referenced this pull request Mar 21, 2026
Co-authored-by: Jacob Richman <jacob314@gmail.com>

Labels

  • area/core: Issues related to User Interface, OS Support, Core Functionality
  • help wanted: We will accept PRs from all issues marked as "help wanted". Thanks for your support!
  • priority/p1: Important and should be addressed in the near term.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Long-running sessions crash with OOM due to structuredClone and O(n) event backlog eviction

2 participants