feat(voice): implement speech-friendly response formatter #20989
spencer426 merged 18 commits into google-gemini:main
Adds packages/core/src/voice/responseFormatter.ts with a
formatForSpeech() function that converts markdown/ANSI-formatted tool
output and LLM responses into speech-clean plain text suitable for
TTS playback in voice mode.
Transformations applied (in order):
1. Strip ANSI escape codes (color, bold, cursor movement)
2. Unwrap fenced code blocks; summarise large JSON content as
"(JSON object with N keys)" / "(JSON array with N items)"
3. Collapse Node.js stack traces to first frame + "(and N more frames)"
4. Strip markdown syntax: bold, italic, inline code, blockquotes,
headings, links, unordered/ordered list markers
5. Abbreviate deep absolute Unix and Windows paths to last pathDepth
segments prefixed with "…"; convert ":142" suffixes to "line 142"
6. Normalise whitespace (collapse excess blank lines, trim)
7. Truncate to maxLength with "… (N chars total)" suffix
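A few of the steps above can be sketched in isolation. This is a minimal illustration, not the code from responseFormatter.ts: the helper names, the regular expression, and the choice to measure "large" JSON by character count are all assumptions made for the example.

```typescript
// Hypothetical helpers illustrating steps 1, 2, 6 and 7; not the PR's implementation.

// Assumed pattern: CSI sequences such as "\x1b[31m" (color) or "\x1b[2K" (erase line).
const ANSI_RE = /\x1b\[[0-9;?]*[A-Za-z]/g;

function stripAnsi(text: string): string {
  return text.replace(ANSI_RE, '');
}

// Assumption: a JSON body counts as "large" when it exceeds `threshold` characters.
function summariseJson(body: string, threshold: number): string {
  if (body.length <= threshold) return body;
  try {
    const parsed: unknown = JSON.parse(body);
    if (Array.isArray(parsed)) {
      return `(JSON array with ${parsed.length} items)`;
    }
    if (parsed !== null && typeof parsed === 'object') {
      return `(JSON object with ${Object.keys(parsed).length} keys)`;
    }
  } catch {
    // Not valid JSON: fall through and leave the body unchanged.
  }
  return body;
}

// Collapse runs of three or more newlines to one blank line, then trim.
function normaliseWhitespace(text: string): string {
  return text.replace(/\n{3,}/g, '\n\n').trim();
}

// Cut at maxLength and append the total-length suffix described in step 7.
function truncate(text: string, maxLength: number): string {
  if (text.length <= maxLength) return text;
  return `${text.slice(0, maxLength)}… (${text.length} chars total)`;
}
```

Keeping each step as a small pure function makes the ordering explicit: ANSI codes are removed first so that later patterns never see escape bytes, and truncation runs last so the reported length reflects the cleaned text.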
Public API:
formatForSpeech(text, options?) → string
options: { maxLength?, pathDepth?, jsonThreshold? }
All defaults chosen to produce natural-sounding output from typical
tool results without requiring any new runtime dependencies.
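The options object could be typed and resolved roughly as follows. The interface name FormatForSpeechOptions appears later in this thread; the default values and the resolveOptions helper are placeholders invented for this sketch, since the actual defaults are not stated here.

```typescript
// Option names come from the commit message; everything else is an assumption.
interface FormatForSpeechOptions {
  maxLength?: number;
  pathDepth?: number;
  jsonThreshold?: number;
}

// Placeholder defaults for illustration only; the PR's real values are not shown.
const ASSUMED_DEFAULTS: Required<FormatForSpeechOptions> = {
  maxLength: 500,
  pathDepth: 3,
  jsonThreshold: 80,
};

// Hypothetical helper: fill in any option the caller left out.
// Using ?? (rather than object spread) keeps an explicit `undefined` from
// overriding a default.
function resolveOptions(
  options: FormatForSpeechOptions = {},
): Required<FormatForSpeechOptions> {
  return {
    maxLength: options.maxLength ?? ASSUMED_DEFAULTS.maxLength,
    pathDepth: options.pathDepth ?? ASSUMED_DEFAULTS.pathDepth,
    jsonThreshold: options.jsonThreshold ?? ASSUMED_DEFAULTS.jsonThreshold,
  };
}
```

A caller would then pass only what it wants to change, for example `resolveOptions({ pathDepth: 2 })`.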
Closes google-gemini#20985
Related: google-gemini#20779 (voice mode skeleton), google-gemini#20456 (RFC)
Code Review
This pull request introduces a valuable utility for converting formatted text into a speech-friendly format. While no security vulnerabilities were found, critical bugs were identified in the implementation related to Windows path handling and stack trace collapsing, which can lead to incorrect or malformed output. A less severe issue with markdown parsing for bold/italic text was also noted. Detailed suggestions and code snippets are provided to address these, along with recommendations for additional test cases to prevent future regressions.
The WIN_PATH_RE replacement callback previously hardcoded 'C:\' as the path prefix, so paths on any other drive (D:\, E:\, etc.) were silently reconstructed as C: paths and abbreviated incorrectly.
Fix: capture the full path (drive letter included) as a single regex group, matching the pattern already used by UNIX_PATH_RE, and pass it directly to abbreviatePath() without any manual prefix concatenation.
Also adds two missing test cases identified in review:
- Windows path abbreviation on a non-C drive (D:\...)
- Stack trace collapsing preserves surrounding text before and after the trace frames
Addresses review feedback on google-gemini#20989
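The shape of that fix can be sketched as below. The names WIN_PATH_RE and abbreviatePath() come from the commit message, but the pattern body, the helper's signature, and the wrapper function are assumptions made for this example.

```typescript
// Assumed pattern: the drive letter is part of the single capture group, so the
// callback never has to rebuild a prefix such as 'C:\' by hand.
const WIN_PATH_RE = /\b([A-Za-z]:(?:\\[\w.@-]+)+)/g;

// Hypothetical signature: keep the last `depth` segments, prefixed with "…".
function abbreviatePath(path: string, depth: number): string {
  const sep = path.includes('\\') ? '\\' : '/';
  const segments = path.split(sep).filter((s) => s.length > 0);
  if (segments.length <= depth) return path; // already short enough
  return `…${sep}${segments.slice(-depth).join(sep)}`;
}

// Hypothetical wrapper: pass the captured path straight to abbreviatePath().
function abbreviateWindowsPaths(text: string, depth = 3): string {
  return text.replace(WIN_PATH_RE, (_match, fullPath: string) =>
    abbreviatePath(fullPath, depth),
  );
}
```

Because the drive letter travels with the rest of the match, a path on D: is treated exactly like one on C:, which is the case the new test is meant to cover.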
Scoped package paths like @google/gemini-cli-core were only matched up to the @ character, producing broken TTS output. Adding @ to the character class fixes the match and a test case is added to cover it.
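The effect of the character-class change can be seen by comparing two path-segment patterns side by side. Both patterns and the sample path are illustrative assumptions, not the expressions used in the PR.

```typescript
// Assumed "before" pattern: segments may contain word characters, dots and hyphens.
const SEGMENTS_WITHOUT_AT = /(?:\/[\w.-]+)+/;
// Assumed "after" pattern: "@" is added so scoped package directories match too.
const SEGMENTS_WITH_AT = /(?:\/[\w.@-]+)+/;

// Hypothetical helper returning the first matched path, if any.
function firstPathMatch(re: RegExp, text: string): string | undefined {
  return text.match(re)?.[0];
}

// Made-up sample path containing a scoped package directory.
const scopedSample = '/repo/node_modules/@google/gemini-cli-core/dist/index.js';

// Stops at "/repo/node_modules" because "@" cannot start a segment.
const matchedWithoutAt = firstPathMatch(SEGMENTS_WITHOUT_AT, scopedSample);
// Matches the whole path once "@" is allowed.
const matchedWithAt = firstPathMatch(SEGMENTS_WITH_AT, scopedSample);
```

With the narrower class the match ends just before the scope, so the remainder of the path would be left behind as stray text for the TTS engine to read.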
spencer426 left a comment
Automated Code Review
spencer426 left a comment
Here are a few specific issues with the implementation that need to be addressed.
- Update copyright year to 2026 in responseFormatter.ts and test file
- Fix BOLD_ITALIC_RE to exclude newlines, preventing cross-line matches that could consume list markers before they are stripped
- Fix stack trace collapsing to replace frames in-place (STACK_BLOCK_RE) instead of stripping all frames and appending summary at end, which was mangling text that followed the trace
- Export formatForSpeech and FormatForSpeechOptions from packages/core index
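The two regex fixes can be illustrated with a small sketch. The names BOLD_ITALIC_RE and STACK_BLOCK_RE come from the commit message; the pattern bodies, the helper functions, and the choice to put the summary on the same line as the first frame are assumptions for this example.

```typescript
// Assumed pattern: 1 to 3 asterisks, content with no "*" and no newline, then the
// same run of asterisks. Excluding "\n" stops a match from spanning two lines.
const BOLD_ITALIC_RE = /(\*{1,3})([^*\n]+)\1/g;

function stripEmphasis(text: string): string {
  return text.replace(BOLD_ITALIC_RE, '$2');
}

// Assumed pattern: one indented "at ..." line followed by one or more further
// "at ..." lines. The whole block is matched and replaced where it stands.
const STACK_BLOCK_RE = /(^[ \t]+at .+$)((?:\n[ \t]+at .+$)+)/gm;

function collapseStackTraces(text: string): string {
  return text.replace(STACK_BLOCK_RE, (_match, first: string, rest: string) => {
    // `rest` starts with "\n", so split yields one leading empty entry.
    const extra = rest.split('\n').length - 1;
    return `${first} (and ${extra} more frames)`;
  });
}
```

In this sketch, two list items such as "* item one" and "* item two" on consecutive lines are left alone by stripEmphasis, whereas a content class of `[^*]+` would pair the two markers across the line break. Replacing the frame block in place also means any text after the trace keeps its position.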
spencer426 left a comment
Please address the "Polynomial regular expression used on uncontrolled data" issue.
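The thread does not say which pattern was flagged or how it was fixed. One general way to avoid this class of alert is to replace a backtracking-prone multi-line regex with a single pass over the lines. The sketch below applies that idea to stack-frame collapsing; the function names and the frame heuristic (leading whitespace followed by "at ") are assumptions.

```typescript
// Hypothetical frame check without a regex: skip leading spaces and tabs, then
// require the literal "at ". Each character is inspected at most once.
function isStackFrame(line: string): boolean {
  let i = 0;
  while (i < line.length && (line.charAt(i) === ' ' || line.charAt(i) === '\t')) {
    i++;
  }
  return i > 0 && line.startsWith('at ', i);
}

// Single pass over the lines: keep the first frame of each run and count the rest.
function collapseFramesLinear(text: string): string {
  const out: string[] = [];
  let firstFrame: string | null = null;
  let extra = 0;

  // Emit the pending run of frames, if any, and reset the counters.
  const flush = (): void => {
    if (firstFrame === null) return;
    out.push(extra > 0 ? `${firstFrame} (and ${extra} more frames)` : firstFrame);
    firstFrame = null;
    extra = 0;
  };

  for (const line of text.split('\n')) {
    if (isStackFrame(line)) {
      if (firstFrame === null) firstFrame = line;
      else extra++;
    } else {
      flush();
      out.push(line);
    }
  }
  flush();
  return out.join('\n');
}
```

The running time here is linear in the input length regardless of content, which is the property a polynomial-regex alert asks for when the input is not under the program's control.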
…ini#20989) Co-authored-by: Spencer <spencertang@google.com>
Summary
Closes #20985
Related: #20779 (voice mode skeleton), #20456 (RFC)
Adds packages/core/src/voice/responseFormatter.ts with formatForSpeech(), a pure-TypeScript utility that converts markdown/ANSI-formatted output into speech-clean plain text for TTS playback in voice mode.
Transformations (applied in order)
- ANSI escape codes: \x1b[31mError\x1b[0m → Error
- Fenced JSON: ```json\n{...}\n``` → (JSON object with N keys)
- Stack traces: collapsed to the first frame plus (and N more frames)
- Markdown syntax stripped: **bold**, *italic*, `code`, > blockquote, # Heading, [link](url)
- Paths: /home/user/project/src/tools/file.ts:142 → …/src/tools/file.ts line 142
- Truncation: … (N chars total)
Test plan
npx vitest run packages/core/src/voice/responseFormatter.test.ts # ✓ 33 tests passed