fix(code-assist): filter thought parts before sending to Code Assist API#23048
fix(code-assist): filter thought parts before sending to Code Assist API#23048seheepeak wants to merge 1 commit intogoogle-gemini:mainfrom
Conversation
When resuming a session, convertSessionToClientHistory() reconstructs
model history including {thought: true, text: '...'} parts from saved
msg.thoughts summaries. These parts then pass through toPart() in
converter.ts, which converts them to {text: '...\n[Thought: true]'}
by stripping the thought flag and appending a literal marker string.
This corrupts the in-context history sent to the Code Assist API:
the model receives '[Thought: true]' as regular model text and learns
to emit the same marker in its own responses. The contamination then
self-propagates - subsequent responses also lack the thought:true flag
and get stored as regular content, so every session resume makes it
worse.
Fix by filtering thought parts before toPart() processes them in
toContent(). The comment already described this intent ('handle thought
filtering') but the filter predicate was missing.
Note: thought context cannot be meaningfully restored for Code Assist
(Vertex) on session resume regardless - thoughtSignature on live
response parts is the only valid carrier of thinking context for this
API path, and is not stored in session files.
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request addresses a critical bug in the Code Assist integration where resuming a session would lead to the model's responses being polluted with Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request aims to fix an issue where thought parts from resumed sessions are incorrectly processed by the Code Assist API, leading to model response contamination. The proposed change filters out these thought parts. However, the fix is incomplete as it only covers one of several code paths where thought parts are handled. This leaves the application still vulnerable to the original bug. Furthermore, the pull request is missing necessary updates to the test suite, which will fail with the current changes. I've left a critical comment detailing the missing pieces for a complete fix.
Note: Security Review did not run due to the size of the PR.
| ...content, | ||
| parts: content.parts | ||
| ? toParts(content.parts.filter((p) => p != null)) | ||
| ? toParts(content.parts.filter((p) => p != null && !p.thought)) |
There was a problem hiding this comment.
While this change correctly filters out thought parts for Content objects, the fix is incomplete. Other paths within the toContent function can still process thought parts, leading to the same contamination issue.
Specifically:
- When
toContentreceives aPartUnion[](theArray.isArray(content)case on line 207),toPartsis called without filtering, allowingthoughtparts to be converted to text bytoPart. - When
toContentreceives a singlePartwith a thought (the final case starting on line 230),toPartis called directly, again causing the unwanted conversion.
To fully resolve the issue, the thought filtering logic needs to be applied to all paths that process parts within this file. A more robust solution might be to centralize the filtering logic.
Additionally, this change will cause existing tests that check for thought-to-text conversion (e.g., should convert thought parts to text parts for API compatibility) to fail. The tests should be updated to reflect the new behavior of filtering out thoughts.
Root Cause AnalysisI use gemini-cli in ACP mode frequently and almost always resume previous sessions, so this is a particularly painful bug for me — every resumed session starts with the model leaking thought text into its responses. After digging into the gemini-code-assist bot's critique, I realized the issue goes deeper than a 2–3 line filter. I'm leaving my findings here in hopes it helps maintainers address this the right way. The actual originThe immediate trigger is // Added in e5d58c2b5
modelParts.push({
text: thoughtText,
thought: true,
} as Part);Before this commit, The contamination bug is therefore exclusive to session resume — live sessions are unaffected. This addition was likely unintentional
Given that the PR was focused on thinking UI rendering, if this reconstruction in convertSessionToClientHistory() was accidental, simply removing those lines would fix the problem for all auth paths. Worth noting: this also affects Direct Gemini API (API key) users on session resume — the official docs only mention thoughtSignature for multi-turn context continuity, not thought: true parts, so this is effectively an undocumented behavior. Two possible fixes
Would appreciate maintainer input on whether the |
Summary
Fixes #23046
When resuming a session,
convertSessionToClientHistory()reconstructs model history including{thought: true, text: '...'}parts from savedmsg.thoughtssummaries. For Code Assist (OAuth/Vertex) users, these parts pass throughtoPart()inconverter.ts, which converts{thought: true, text: 'desc'}to{text: 'desc\n[Thought: true]'}— stripping thethoughtflag and appending a literal marker string. The model then learns via in-context learning to emit[Thought: true]in its own responses, and the contamination self-propagates across turns.Details
toContent()already had a comment// handle thought filteringbut the filter predicate forthought: trueparts was missing. One-line fix:Thought context cannot be meaningfully restored for Code Assist on session resume regardless —
thoughtSignatureon live response parts is the only valid carrier of thinking context for this API path, and is not stored in session files. Filtering is therefore correct behavior, not just a workaround.API key users (Direct Gemini) are not affected as they bypass
converter.tsentirely.Related Issues
Fixes #23046
How to Validate
gemini-2.5-pro)gemini --resume)Before: Model responses contain
[Thought: true]literal text, worsening each turn.After: Model responds normally.
Pre-Merge Checklist