feat(core): implement stale output elision for history pruning #21998
kunal-10-cloud wants to merge 5 commits into google-gemini:main
Summary of Changes
Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request enhances chat history management by implementing stale output elision. The mechanism prunes the conversation history, specifically targeting tool outputs that become irrelevant after subsequent file modifications. Replacing these outdated entries with concise markers keeps the context window small, so the model works from current information and uses fewer tokens.
Code Review
This pull request introduces a stale output elision service to prune chat history, optimizing context window usage by replacing original read output with a compact marker when a file is read and modified. While the service is well-integrated into the GeminiClient with thorough unit tests and telemetry, a security vulnerability was identified where the file path included in the elision marker is not sanitized, potentially leading to prompt injection if a malicious path is provided. Additionally, a high-severity issue related to path normalization was found, which could cause elision to fail when symbolic links are used. Both issues align with existing repository rules regarding prompt injection prevention and consistent path resolution/symlink handling.
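The two review findings can be illustrated with a minimal sketch. It is not the PR's code: `canonicalPath`, `sanitizePathForMarker`, and `buildMarker` are hypothetical helper names, and the choice of which characters to replace is an assumption. The idea is to resolve symlinks before comparing read and write paths, and to strip characters that could break out of the marker before embedding a path in it.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical helper: resolve symlinks so that a read and a write of the
// same file compare equal even when one of them went through a symlink.
export function canonicalPath(p: string): string {
  const absolute = path.resolve(p);
  try {
    return fs.realpathSync(absolute);
  } catch {
    // The file may not exist (yet, or anymore); fall back to the absolute path.
    return absolute;
  }
}

// Hypothetical helper: make a path safe to embed in the marker text.
export function sanitizePathForMarker(p: string, maxLength = 256): string {
  const cleaned = p
    .replace(/[\u0000-\u001f\u007f]/g, " ") // control characters, including newlines
    .replace(/[<>'"`]/g, "_"); // characters that could close the tag or the quoted path
  return cleaned.length > maxLength ? cleaned.slice(0, maxLength) + "..." : cleaned;
}

// Marker format taken from the commit message; the tool name is assumed trusted here.
export function buildMarker(filePath: string, tool: string): string {
  const safePath = sanitizePathForMarker(filePath);
  return `<stale_output_elided>[Content elided – file '${safePath}' was subsequently modified by ${tool}]</stale_output_elided>`;
}
```

With these helpers, a path such as `a.ts'</stale_output_elided>` followed by injected instructions cannot terminate the marker early, because the quote and angle brackets are replaced before interpolation.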
2d5b2d8 to e1212d7
Hi @SandyTao520, @abhipatel12, @jerop, @sehoon38, @gsquared94, @cynthialong0-0, @jacob314, please review this PR and let me know if any changes are required.
…scovery, vision, and tasks
Introduces StaleOutputElisionService, which retroactively collapses stale tool read outputs from the chat history when the agent later modifies the same file. This saves context window tokens and prevents the model from being confused by outdated file content.

- New StaleOutputElisionService with a two-pass O(n) algorithm:
  - Pass 1: builds a write-log from functionCall parts in model turns
  - Pass 2: identifies read outputs whose files were later written
  - Apply phase: immutable-update of stale parts with elision marker
- Supports read_file, read_many_files (read side) and edit, write_file (write side)
- Elision marker format: <stale_output_elided>[Content elided – file '...' was subsequently modified by <tool>]</stale_output_elided>
- Guards: skips error responses, already-elided parts, outputs below MIN_TOKENS_TO_ELIDE (100 tokens)
- Integrated in GeminiClient.processTurn() before tryMaskToolOutputs
- New StaleOutputElisionEvent (BaseTelemetryEvent) in types.ts
- New EventMetadataKey constants (IDs 172–173) in event-metadata-key.ts
- STALE_OUTPUT_ELISION added to EventNames enum in clearcut-logger.ts
- logStaleOutputElisionEvent() added to ClearcutLogger
- logStaleOutputElision() function added to loggers.ts
- 14 unit tests covering: basic elision (read_file/edit/write_file), read_many_files, no-op cases (wrong path, reversed order, no writes, empty history), token threshold guard, error skip, idempotency, immutability, multiple pairs, marker content

Fixes #<session-continuity-epic-sub-issue-4>
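The two-pass scan described in this commit message might look roughly like the sketch below. It is an illustration under stated assumptions, not the service's actual code: the `Turn` and `Part` shapes are simplified, the `id` field linking a call to its response and the `file_path` argument name are assumptions, only `read_file` is handled on the read side, and the guards and path sanitization are left out.

```typescript
// Simplified, hypothetical history shapes; the real types are richer.
interface FunctionCallPart {
  functionCall: { id: string; name: string; args: Record<string, unknown> };
}
interface FunctionResponsePart {
  functionResponse: { id: string; name: string; response: Record<string, unknown> };
}
type Part = FunctionCallPart | FunctionResponsePart | { text: string };
interface Turn {
  role: "user" | "model";
  parts: Part[];
}

const WRITE_TOOLS = new Set(["edit", "write_file"]);
const READ_TOOLS = new Set(["read_file"]);

function marker(filePath: string, tool: string): string {
  return `<stale_output_elided>[Content elided – file '${filePath}' was subsequently modified by ${tool}]</stale_output_elided>`;
}

function elideStaleReads(history: readonly Turn[]): Turn[] {
  // Pass 1: record the last write per file, and which file each read call targeted.
  const lastWrite = new Map<string, { turn: number; tool: string }>();
  const readPathById = new Map<string, string>();
  history.forEach((turn, i) => {
    if (turn.role !== "model") return;
    for (const part of turn.parts) {
      if (!("functionCall" in part)) continue;
      const { id, name, args } = part.functionCall;
      const filePath = args["file_path"]; // argument name is an assumption
      if (typeof filePath !== "string") continue;
      if (WRITE_TOOLS.has(name)) lastWrite.set(filePath, { turn: i, tool: name });
      else if (READ_TOOLS.has(name)) readPathById.set(id, filePath);
    }
  });

  // Pass 2 and apply phase: replace read outputs whose file was written in a later turn.
  return history.map((turn, i) => {
    const parts = turn.parts.map((part): Part => {
      if (!("functionResponse" in part)) return part;
      const { id, name } = part.functionResponse;
      if (!READ_TOOLS.has(name)) return part;
      const filePath = readPathById.get(id);
      if (filePath === undefined) return part;
      const write = lastWrite.get(filePath);
      if (write === undefined || write.turn <= i) return part;
      return { functionResponse: { id, name, response: { output: marker(filePath, write.tool) } } };
    });
    // Reuse the original turn object when nothing changed (immutable update).
    const changed = parts.some((p, j) => p !== turn.parts[j]);
    return changed ? { ...turn, parts } : turn;
  });
}
```

Each pass visits every part once, so the scan is linear in the history size. Because the input is never mutated, a caller can check whether anything was elided by comparing turn references.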
7d52d90 to 64c3c75
@gsquared94, the conflicts on this branch are now resolved. Could you please review this PR when you are available? It has been open for quite a while.
Summary
Adds stale output elision to the history pipeline. When the agent reads a file and later modifies it, the original read output in history is automatically replaced with a compact marker, saving context window tokens and preventing the model from reasoning over outdated content.
Details
Uses a two-pass scan over chat history — one pass to identify write operations, a second to find read outputs that are now stale. Elision is immutable, idempotent, skips error responses, and only applies when the replacement is smaller than the original. Telemetry is included, mirroring the existing tool output masking pattern.
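As a rough illustration of the guards listed above, the check for a single read output could be written as follows. The `ToolResponse` shape, the `shouldElide` and `estimateTokens` names, and the four-characters-per-token estimate are assumptions made for this sketch; only the `MIN_TOKENS_TO_ELIDE` threshold of 100 tokens and the marker tag come from the commit message.

```typescript
// Hypothetical response shape: either an output string or an error string.
interface ToolResponse {
  output?: string;
  error?: string;
}

const MARKER_OPEN = "<stale_output_elided>";
const MIN_TOKENS_TO_ELIDE = 100; // threshold named in the commit message

// Assumption: a crude estimate of about four characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

function shouldElide(response: ToolResponse, replacement: string): boolean {
  if (response.error !== undefined) return false; // skip error responses
  const output = response.output ?? "";
  if (output.startsWith(MARKER_OPEN)) return false; // already elided, keeps the pass idempotent
  if (estimateTokens(output) < MIN_TOKENS_TO_ELIDE) return false; // too small to be worth it
  return replacement.length < output.length; // only elide when it actually saves space
}
```

The already-elided check is what makes repeated runs over the same history safe: a second pass sees the marker and leaves the part alone.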
Related Issues
Closes #21891
How to Validate
npm test -w @google/gemini-cli-core -- src/services/staleOutputElisionService.test.ts
All 14 tests should pass.
Pre-Merge Checklist