feat(voice): add animated waveform visualizer for voice mode state feedback #21115

Closed
ayush31010 wants to merge 31 commits into google-gemini:main from ayush31010:feat/voice-waveform-visualizer

Conversation

@ayush31010 ayush31010 commented Mar 4, 2026

Summary

Implements the <AudioWaveform> Ink component proposed in #21109, providing visual feedback for the voice session lifecycle.

  • idle → renders nothing (null)
  • listening → green rippling sine-wave bars
  • processing → yellow pulsing bars (breathing effect)
  • speaking → cyan rippling sine-wave bars
  • error → red flat low bars (static)
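The state table above could be captured as a small lookup, roughly as follows. This is only a sketch: `Animation`, `StateStyle`, `STATE_STYLE`, and `styleFor` are hypothetical names chosen for illustration, and the exact shape of `VoiceState` is assumed. Only the states, colors, and animation styles are taken from the list.

```typescript
// Assumed shape of the voice state union; the PR's actual type may differ.
type VoiceState = 'idle' | 'listening' | 'processing' | 'speaking' | 'error';

// Hypothetical animation categories matching the list above.
type Animation = 'ripple' | 'pulse' | 'static';

interface StateStyle {
  color: string;
  animation: Animation;
}

// Every state except idle has a visual style; idle renders nothing.
const STATE_STYLE: Record<Exclude<VoiceState, 'idle'>, StateStyle> = {
  listening: { color: 'green', animation: 'ripple' },
  processing: { color: 'yellow', animation: 'pulse' },
  speaking: { color: 'cyan', animation: 'ripple' },
  error: { color: 'red', animation: 'static' },
};

// Returns null for idle so the caller can skip rendering entirely.
function styleFor(state: VoiceState): StateStyle | null {
  return state === 'idle' ? null : STATE_STYLE[state];
}
```

Keeping idle out of the record means a missing style shows up as a type error rather than as an empty render at runtime.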

Unicode block characters (▁▂▃▄▅▆▇█) are mapped from amplitude values in [0, 1]. When no amplitudes are provided, the component self-animates via setInterval (80 ms/tick) using a synthetic sine wave, so it works before the audio pipeline is wired up.
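A minimal sketch of the amplitude mapping and the synthetic frames, assuming a linear mapping onto the eight block characters. The function names (`amplitudeToBar`, `rippleFrame`, `pulseFrame`) and the phase constants are illustrative assumptions, not values taken from the PR. A component would advance `tick` once per 80 ms interval and re-render.

```typescript
// The eight block characters, lowest to highest.
const BARS = '▁▂▃▄▅▆▇█';

// Maps an amplitude in [0, 1] to one block character.
// Out-of-range and non-finite inputs are clamped (assumed behavior).
function amplitudeToBar(amplitude: number): string {
  const safe = Number.isFinite(amplitude) ? amplitude : 0;
  const clamped = Math.min(1, Math.max(0, safe));
  const index = Math.round(clamped * (BARS.length - 1));
  return BARS.charAt(index);
}

// Rippling sine wave: each column is phase-shifted so the wave
// appears to travel as tick advances. 0.4 and 0.6 are made-up constants.
function rippleFrame(tick: number, width: number): number[] {
  return Array.from(
    { length: width },
    (_, i) => 0.5 + 0.5 * Math.sin(tick * 0.4 + i * 0.6),
  );
}

// Pulsing ("breathing") bars: all columns share one level per tick.
function pulseFrame(tick: number, width: number): number[] {
  const level = 0.5 + 0.5 * Math.sin(tick * 0.3);
  return new Array<number>(width).fill(level);
}

// Renders one frame of amplitudes as a string of bars.
function renderBars(amplitudes: readonly number[]): string {
  return amplitudes.map(amplitudeToBar).join('');
}
```

For example, `renderBars(rippleFrame(tick, 12))` yields a 12-column string that changes from tick to tick, while `renderBars(pulseFrame(tick, 12))` yields 12 identical bars whose height varies over time.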

Caller-supplied amplitude arrays are resampled to fit width columns, keeping the total output (bars + label) within the requested terminal width.
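One way to do this resampling is bucket averaging, sketched below. The averaging strategy, the single-space separator between bars and label, and the names `resample` and `barCount` are all assumptions for illustration; the PR may use a different scheme such as nearest-neighbor sampling.

```typescript
// Resamples amplitudes to exactly `columns` values.
// Downsampling averages each bucket; upsampling repeats the nearest source value.
function resample(amplitudes: readonly number[], columns: number): number[] {
  if (columns <= 0 || amplitudes.length === 0) return [];
  const n = amplitudes.length;
  return Array.from({ length: columns }, (_, i) => {
    const start = Math.floor((i * n) / columns);
    // Guarantee at least one source sample per output column.
    const end = Math.max(start + 1, Math.floor(((i + 1) * n) / columns));
    let sum = 0;
    for (let j = start; j < end; j++) sum += amplitudes[j] ?? 0;
    return sum / (end - start);
  });
}

// Number of bar columns that fit once the label and one separating
// space are reserved. Assumes the label is plain ASCII, so its string
// length equals its terminal width.
function barCount(width: number, label: string): number {
  return Math.max(0, width - label.length - 1);
}
```

With these helpers, `resample(amplitudes, barCount(width, label))` never produces more bars than fit next to the label, which is the property the paragraph above describes.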

Test plan

  • renders nothing in idle state
  • renders a non-empty waveform in listening / processing / speaking / error state (4 parameterised)
  • shows "<state>" label in <state> state (4 parameterised)
  • renders unicode block characters for non-zero amplitudes
  • uses all-low bars for all-zero amplitudes
  • uses all-full bars for all-one amplitudes
  • generates synthetic animation in listening state (frames differ over time)
  • error state is static (no animation)
  • respects a narrow width

All 15 tests pass.

Related

Ayush Debnath added 2 commits March 4, 2026 20:55
…edback

Introduces AudioWaveform, an Ink/React component that renders a real-time
Unicode block-character bar chart alongside a text label to reflect the
current voice session state:

- idle       → renders nothing (null)
- listening  → green rippling sine-wave bars
- processing → yellow pulsing bars (breathing effect)
- speaking   → cyan rippling sine-wave bars
- error      → red flat low bars (static)

When no amplitudes are supplied the component generates a synthetic
animation via setInterval (80 ms/tick). Caller-supplied amplitude arrays
([0, 1]) are resampled to the available bar count so the component works
at any terminal width.

Adds 15 unit tests covering: idle rendering, active-state output, state
labels, block-character mapping, all-zero/all-one amplitude edge cases,
animation diff over time, error-state staticness, and custom width.

Closes google-gemini#21109
@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a new UI component designed to enhance the user experience for voice mode by providing clear, animated visual feedback on the current voice session state. It integrates a terminal-based audio waveform visualizer that dynamically adapts its appearance and animation based on whether the system is listening, processing, speaking, or encountering an error, making the voice interaction more intuitive.

Highlights

  • New Component: Introduced an <AudioWaveform> Ink component to provide visual feedback for voice mode states.
  • State-based Visuals: Implemented distinct visual representations for idle, listening, processing, speaking, and error voice states, including specific colors and animation patterns.
  • Unicode Graphics: Utilized Unicode block characters (▁▂▃▄▅▆▇█) to render dynamic waveform bars based on amplitude values.
  • Synthetic Animation: Added functionality to generate synthetic animated waveforms (rippling or pulsing) when no amplitude data is provided, ensuring continuous visual feedback.
  • Adaptive Display: Included logic to resample caller-supplied amplitude arrays to fit the specified terminal width, maintaining a consistent display.

Changelog

  • packages/cli/src/ui/components/AudioWaveform.test.tsx
    • Added a comprehensive test suite for the AudioWaveform component.
    • Tests cover rendering for all voice states, label display, Unicode character mapping, amplitude handling, and animation behavior.
    • Includes tests for synthetic animation and static error states.
    • Verifies width responsiveness and correct rendering of amplitude extremes (zero and one).
  • packages/cli/src/ui/components/AudioWaveform.tsx
    • Added the AudioWaveform React component for Ink.
    • Defined VoiceState types and AudioWaveformProps interface.
    • Implemented state-dependent colors and labels.
    • Included logic for generating synthetic animations (rippling sine wave for listening/speaking, pulsing for processing).
    • Provided a utility function to map amplitude values to Unicode block characters.
    • Handled amplitude resampling to fit the specified terminal width.
    • Renders nothing in the 'idle' state and a dynamic waveform with a label in other states.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a new AudioWaveform component to provide visual feedback for voice session states, along with corresponding unit tests. The implementation is well-structured and the tests are comprehensive. I've found one high-severity issue where the component might render wider than its specified width, potentially breaking UI layouts. A code suggestion is provided to fix this.

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
@ayush31010

This comment was marked as spam.

@gemini-cli gemini-cli bot added the area/core and help wanted labels Mar 4, 2026
@ayush31010

This comment was marked as spam.

@ayush31010

This comment was marked as spam.

@spencer426 spencer426 enabled auto-merge March 11, 2026 04:43
@M-DEV-1
M-DEV-1 commented Mar 11, 2026

@Solventerritory, it would be great if you could also attach a mini clip of the waveform in action across multiple modes.

auto-merge was automatically disabled March 11, 2026 12:56

Head branch was pushed to by a user without write access

@ayush31010

This comment was marked as spam.

@ayush31010

This comment was marked as spam.

@ayush31010

This comment was marked as spam.

@ayush31010

This comment was marked as spam.

@spencer426 spencer426 self-requested a review March 14, 2026 00:04
Contributor

@spencer426 spencer426 left a comment

Waiting for final UX approval, but the code looks good. Thanks for the PR!

@clocky clocky self-assigned this Mar 14, 2026
@spencer426 spencer426 self-requested a review March 14, 2026 00:23
Contributor

@spencer426 spencer426 left a comment

Holding off merge for UX review from @clocky.

@ayush31010

This comment was marked as spam.

@ayush31010

This comment was marked as spam.

@ayush31010

This comment was marked as spam.

@M-DEV-1
M-DEV-1 commented Mar 15, 2026

@Solventerritory, the maintainers also have their own internal workflows and implementation work, which can delay reviews and merges. Plus, it's the weekend.

There's a huge list of PRs waiting for reviews and merges. It's best that we remain patient :)

@ayush31010

This comment was marked as spam.

@jackwotherspoon
Collaborator

@Solventerritory While we appreciate the PR, we also appreciate contributors being respectful of our team members.

Please do not ping and tag folks repeatedly, everyone is working hard on this project and trying to manage the growth of pull requests and issues that come in, while also working on their own tasks and features.

It is also the weekend, someone will get to this soon during work hours.

@clocky
Contributor

clocky commented Mar 15, 2026

@Solventerritory — Please keep all project communication strictly on GitHub. Reaching out via my personal website or external contact methods is inappropriate and will not expedite the review process.

This PR is currently in the queue for UX review. As @spencer426 and @jackwotherspoon noted, we will provide feedback here during business hours as we align on the broader voice mode implementation and roadmap.

Regarding GSoC 2026: I cannot provide individual assistance or review proposals. Please use official GSoC channels to ensure a fair process for all applicants.

@clocky
Contributor

clocky commented Mar 17, 2026

@Solventerritory - I am closing this PR and will not be providing further review.

Despite my request (on a weekend) to keep all project-related communication within this repository, I have received over 30 direct emails to my personal address. This is an unacceptable violation of professional boundaries.

I will not be engaging with any further work or communication from you on this project. Please do not contact me via personal channels again.

@clocky clocky closed this Mar 17, 2026
@google-gemini google-gemini locked as spam and limited conversation to collaborators Mar 17, 2026

Labels

area/core: Issues related to User Interface, OS Support, Core Functionality
help wanted: We will accept PRs from all issues marked as "help wanted". Thanks for your support!

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat(voice): add animated waveform visualizer for voice mode state feedback

5 participants