perf(core): fix OOM crash in long-running sessions#19608
jacob314 merged 5 commits into google-gemini:main
Conversation
Code Review
This pull request introduces several significant performance improvements to address out-of-memory issues in long-running sessions. The changes, including replacing `structuredClone` with a shallow copy, optimizing the event backlog with a head pointer, and removing a dead mutation, are well thought out and correctly implemented. Overall, this is an excellent set of optimizations.
Replace `structuredClone()` in `getHistory()` with a shallow array copy. The deep clone ran on every turn's hot path, duplicating the entire conversation history (50-100MB+ in long sessions) multiple times per turn. Verified all 8 production callers: only one mutated `Content` objects (`nextSpeakerChecker`), and that mutation was dead code (it pushed to a clone that was immediately discarded).

Replace `Array.shift()` in the event backlog with a head pointer and amortized compaction. `shift()` on a 10K-element array is O(n); the new approach is O(1) amortized with identical FIFO semantics.

Cache `Turn.getResponseText()` result to avoid redundant recomputation across its two call sites per turn.

Fixes google-gemini#19607
Force-pushed from 1002b9f to 65cd7df
jacob314 left a comment
Looks great! Solid optimizations to reduce V8 heap pressure. Replaced the `unknown[]` cast with strict typing for the `_eventBacklog` array to follow strict development rules.
I'm stripping out changes that I don't think were needed from this PR and landing as
```diff
 if (this.listenerCount(event) === 0) {
   if (this._eventBacklog.length >= CoreEventEmitter.MAX_BACKLOG_SIZE) {
-    this._eventBacklog.shift();
+    const backlogSize = this._eventBacklog.length - this._backlogHead;
```
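The head-pointer scheme under discussion here can be sketched as a standalone class. This is an illustration, not the real `CoreEventEmitter` code: the class name `Backlog`, the `drain()` method, and the half-capacity compaction threshold are assumptions based on the PR description.

```typescript
// Standalone sketch of the head-pointer backlog (hypothetical Backlog
// class, not the real CoreEventEmitter). Eviction advances a head index
// in O(1); the array is compacted with slice() only when dead slots
// exceed half the cap, so cost is O(1) amortized with FIFO semantics.
class Backlog<T> {
  private buf: T[] = [];
  private head = 0; // index of the oldest live entry

  constructor(private readonly maxSize = 10_000) {}

  push(item: T): void {
    if (this.buf.length - this.head >= this.maxSize) {
      this.head++; // O(1) eviction (the old code used shift(), O(n))
    }
    this.buf.push(item);
    if (this.head > this.maxSize / 2) {
      this.buf = this.buf.slice(this.head); // amortized compaction
      this.head = 0;
    }
  }

  // Return all live entries, oldest first, and reset.
  drain(): T[] {
    const out = this.buf.slice(this.head);
    this.buf = [];
    this.head = 0;
    return out;
  }

  get length(): number {
    return this.buf.length - this.head;
  }
}
```

Compacting only once dead slots exceed half the cap bounds the underlying array at roughly 1.5x the cap while keeping each eviction O(1) amortized.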
This is fine, but not really needed, as there should not be an enormous backlog in typical cases.
Co-authored-by: Jacob Richman <jacob314@gmail.com>

## Summary
Fixes three compounding memory issues in `packages/core` that cause V8 heap exhaustion (`FATAL ERROR: Reached heap limit`) in long-running agentic sessions.

Fixes #19607
## Changes
### 1. Replace `structuredClone()` with shallow array copy in `getHistory()`

`GeminiChat.getHistory()` was deep-cloning the entire conversation history on every call via `structuredClone()`. This is called on the hot path of every turn (building the API request), plus by loop detection, error reporting, compression, and several other callers. In a long session with large tool outputs, this means deep-cloning 50-100MB+ of data multiple times per turn, creating massive transient allocations that compound into OOM.
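The cost difference can be sketched with a simplified `Content` shape. This is illustrative only, not the actual gemini-cli source; the two `getHistory*` functions are stand-ins for the before and after behavior.

```typescript
// Illustrative sketch (not the actual gemini-cli code): a simplified
// Content shape and the two copy strategies being compared.
interface Part { text: string; }
interface Content { role: string; parts: Part[]; }

// Old behavior: structuredClone deep-copies every Content and Part,
// allocating a full duplicate of the history on each call.
function getHistoryDeep(history: Content[]): Content[] {
  return structuredClone(history);
}

// New behavior: [...history] copies only the array of references.
// Callers get a snapshot of the list (push/splice on the copy is safe),
// but the Content objects themselves are shared and must not be mutated.
function getHistoryShallow(history: Content[]): Content[] {
  return [...history];
}
```

The shallow copy still isolates callers from list-level mutation (appending to or splicing the returned array does not touch the real history); only element-level mutation is unsafe, which is why the callers had to be audited.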
Replaced with `[...history]` (shallow array copy). Audited all 8 production callers:

- `ToolOutputMaskingService.mask()` already creates its own shallow copies internally (safe)
- `nextSpeakerChecker` had a dead mutation (see below; fixed)

### 2. Remove dead `Content` mutation in `nextSpeakerChecker.ts`

Line 90 did `lastComprehensiveMessage.parts.push({ text: '' })` on the `getHistory()` result, then immediately returned. With the old `structuredClone`, this pushed to a clone that was instantly discarded, making it a complete no-op. Removed the dead push so it can't accidentally mutate the real history now that `getHistory()` returns shared references.

### 3. Replace O(n) `shift()` with O(1) head pointer in event backlog

`CoreEventEmitter._emitOrQueue()` used `Array.shift()` to evict old entries when the 10K backlog was full. `shift()` on a 10K array is O(n), copying ~10K elements on every eviction. Replaced with a head pointer that advances in O(1), with amortized compaction via `slice()` when dead entries exceed half capacity. Same FIFO semantics, same 10K cap.

### 4. Cache `Turn.getResponseText()` result

`getResponseText()` recomputed `.map().filter().join()` over all streaming chunks on every call, but is invoked 2x per turn with identical input. Added a cache field so the second call is free.

## Testing
- Type check (`tsc --noEmit`)
- The `npm run preflight` lint step fails on win32/x64 due to a platform-unsupported binary in `scripts/lint.js` (pre-existing, unrelated)
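The caching in change 4 amounts to plain memoization of an idempotent getter. A minimal sketch, using a hypothetical `ChunkTurn` class (the real `Turn` in `packages/core` differs):

```typescript
// Minimal memoization sketch (hypothetical ChunkTurn; the real Turn
// class differs). The joined text is computed once, and the cached
// string is returned on every later call.
interface Chunk { text?: string; }

class ChunkTurn {
  private cachedResponseText: string | null = null;

  constructor(private readonly chunks: Chunk[]) {}

  getResponseText(): string {
    if (this.cachedResponseText === null) {
      // Previously recomputed on every call: map/filter/join over all chunks.
      this.cachedResponseText = this.chunks
        .map((c) => c.text)
        .filter((t): t is string => t !== undefined)
        .join('');
    }
    return this.cachedResponseText;
  }
}
```

Caching like this is only safe because a turn's chunks are immutable once streaming completes; if chunks could still arrive after the first call, the cache would need invalidation.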