feat(voice): add animated waveform visualizer for voice mode state feedback #21115
ayush31010 wants to merge 31 commits into google-gemini:main
…edback

Introduces AudioWaveform, an Ink/React component that renders a real-time Unicode block-character bar chart alongside a text label to reflect the current voice session state:

- idle → renders nothing (null)
- listening → green rippling sine-wave bars
- processing → yellow pulsing bars (breathing effect)
- speaking → cyan rippling sine-wave bars
- error → red flat low bars (static)

When no amplitudes are supplied, the component generates a synthetic animation via setInterval (80 ms/tick). Caller-supplied amplitude arrays (values in [0, 1]) are resampled to the available bar count so the component works at any terminal width.

Adds 15 unit tests covering: idle rendering, active-state output, state labels, block-character mapping, all-zero/all-one amplitude edge cases, animation diff over time, error-state staticness, and custom width.

Closes google-gemini#21109
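As a rough illustration of the per-state synthetic animation described in the commit message, here is a minimal TypeScript sketch. The names `VoiceState`, `STATE_COLOR`, and `syntheticAmplitudes`, along with the phase and speed constants, are assumptions made for this example and are not taken from the PR's source. Only the state names, the colors, and the 80 ms tick come from the description above.

```typescript
// Hypothetical sketch: names and numeric constants are assumptions, not the PR's code.
type VoiceState = "idle" | "listening" | "processing" | "speaking" | "error";

// Colors per active state, as listed in the commit message.
const STATE_COLOR: Record<Exclude<VoiceState, "idle">, string> = {
  listening: "green",
  processing: "yellow",
  speaking: "cyan",
  error: "red",
};

/**
 * Produce one frame of synthetic amplitudes in [0, 1].
 * `tick` is a frame counter that a caller would advance from a
 * setInterval callback (every 80 ms in the PR description).
 */
function syntheticAmplitudes(
  state: VoiceState,
  tick: number,
  barCount: number,
): number[] {
  const bars = Math.max(0, Math.floor(barCount));
  switch (state) {
    case "idle":
      // Nothing is rendered in the idle state.
      return [];
    case "error":
      // Flat, low bars that do not depend on the tick (static).
      return new Array<number>(bars).fill(0.1);
    case "processing": {
      // Every bar shares one level that rises and falls (breathing effect).
      const level = 0.5 + 0.4 * Math.sin(tick * 0.3);
      return new Array<number>(bars).fill(level);
    }
    case "listening":
    case "speaking":
      // A sine wave whose phase shifts per bar, so the ripple travels sideways.
      return Array.from(
        { length: bars },
        (_, i) => 0.5 + 0.5 * Math.sin(tick * 0.4 + i * 0.6),
      );
  }
}
```

Keeping the frame generator a pure function of `(state, tick, barCount)` is one way to make the "frames differ over time" and "error state is static" tests deterministic, since no real timer is needed to compare two frames.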
Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a new UI component designed to enhance the user experience for voice mode by providing clear, animated visual feedback on the current voice session state. It integrates a terminal-based audio waveform visualizer that dynamically adapts its appearance and animation based on whether the system is listening, processing, speaking, or encountering an error, making the voice interaction more intuitive.
Code Review
This pull request introduces a new AudioWaveform component to provide visual feedback for voice session states, along with corresponding unit tests. The implementation is well-structured and the tests are comprehensive. I've found one high-severity issue where the component might render wider than its specified width, potentially breaking UI layouts. A code suggestion is provided to fix this.
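The suggested fix is not reproduced in this conversation, so the following is only a sketch of one way to keep bars plus label inside a width budget. The helper names `barCountForWidth` and `fitLine`, the single-space separator, and the assumption that every character occupies exactly one terminal column are all illustrative choices, not the PR's or the reviewer's code.

```typescript
// Hypothetical sketch: assumes one column per character (true for the block
// characters and ASCII labels used here; wide glyphs would need a width library).

/** Number of bar columns left after reserving room for " <label>". */
function barCountForWidth(width: number, label: string): number {
  const total = Math.max(0, Math.floor(width));
  // Reserve the label plus one separating space, if there is a label at all.
  const reserved = label.length > 0 ? label.length + 1 : 0;
  return Math.max(0, total - reserved);
}

/** Join bars and label so the result never exceeds `width` columns. */
function fitLine(bars: string, label: string, width: number): string {
  const total = Math.max(0, Math.floor(width));
  const shownBars = bars.slice(0, barCountForWidth(total, label));
  // Drop empty parts so no stray separator is emitted.
  const parts = [shownBars, label].filter((part) => part.length > 0);
  // Final slice is a safety net for widths narrower than the label itself.
  return parts.join(" ").slice(0, total);
}
```

For example, `fitLine("████████████", "listening", 15)` keeps five bars, one space, and the nine-character label, for exactly 15 columns.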
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
This comment was marked as spam.
@Solventerritory, it would be great if you could also attach a mini clip of the waveform in action across multiple modes.
Head branch was pushed to by a user without write access
spencer426 left a comment
Holding off merge for UX review from @clocky.
@Solventerritory, the maintainers also have their own internal workflows and implementations to attend to, so there might be delays in reviews and merges. Plus, it's the weekend. There's a huge list of PRs waiting for reviews and merges. It's best that we remain patient :)
@Solventerritory While we appreciate the PR, we also appreciate contributors being respectful of our team members. Please do not ping and tag folks repeatedly. Everyone is working hard on this project and trying to manage the growth of pull requests and issues that come in, while also working on their own tasks and features. It is also the weekend; someone will get to this soon during work hours.
@Solventerritory, please keep all project communication strictly on GitHub. Reaching out via my personal website or external contact methods is inappropriate and will not expedite the review process. This PR is currently in the queue for UX review. As @spencer426 and @jackwotherspoon noted, we will provide feedback here during business hours as we align on the broader voice mode implementation and roadmap. Regarding GSoC 2026: I cannot provide individual assistance or review proposals. Please use official GSoC channels to ensure a fair process for all applicants.
@Solventerritory - I am closing this PR and will not be providing further review. Despite my request (on a weekend) to keep all project-related communication within this repository, I have received over 30 direct emails to my personal address. This is an unacceptable violation of professional boundaries. I will not be engaging with any further work or communication from you on this project. Please do not contact me via personal channels again.
Summary

Implements the `<AudioWaveform>` Ink component proposed in #21109, providing visual feedback for the voice session lifecycle.

Unicode block characters (`▁▂▃▄▅▆▇█`) are mapped from amplitude values in [0, 1]. When no amplitudes are provided, the component self-animates via `setInterval` (80 ms/tick) using a synthetic sine wave, so it works before the audio pipeline is wired up.

Caller-supplied amplitude arrays are resampled to fit `width` columns, keeping the total output (bars + label) within the requested terminal width.

Test plan

- renders nothing in idle state
- renders a non-empty waveform in listening / processing / speaking / error state (4 parameterised)
- shows "<state>" label in <state> state (4 parameterised)
- renders unicode block characters for non-zero amplitudes
- uses all-low bars for all-zero amplitudes
- uses all-full bars for all-one amplitudes
- generates synthetic animation in listening state (frames differ over time)
- error state is static (no animation)
- respects a narrow width

All 15 tests pass.
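To make the amplitude-to-block mapping and the resampling step concrete, here is a minimal sketch. The helper names (`clamp01`, `amplitudeToBlock`, `resample`, `renderBars`), the nearest-level rounding, and the choice to average each source slice are assumptions for illustration; the PR's actual resampling strategy is not shown in this description.

```typescript
// Hypothetical helpers: names and rounding choices are assumptions, not the PR's code.
const BLOCKS = "▁▂▃▄▅▆▇█"; // eight levels, lowest to highest

/** Clamp to [0, 1], treating NaN as silence. */
function clamp01(x: number): number {
  if (Number.isNaN(x)) return 0;
  return Math.min(1, Math.max(0, x));
}

/** Map one amplitude in [0, 1] to the nearest of the eight block characters. */
function amplitudeToBlock(amplitude: number): string {
  const index = Math.round(clamp01(amplitude) * (BLOCKS.length - 1));
  return BLOCKS.charAt(index);
}

/**
 * Resample to exactly `count` values. Each output bar averages the slice of
 * the input it covers; when upsampling, neighbouring bars reuse one sample.
 */
function resample(amplitudes: readonly number[], count: number): number[] {
  if (count <= 0 || amplitudes.length === 0) return [];
  const out: number[] = [];
  for (let i = 0; i < count; i++) {
    const start = Math.floor((i * amplitudes.length) / count);
    const end = Math.max(
      start + 1,
      Math.floor(((i + 1) * amplitudes.length) / count),
    );
    let sum = 0;
    for (let j = start; j < end; j++) sum += amplitudes[j] ?? 0;
    out.push(sum / (end - start));
  }
  return out;
}

/** Render `count` bar characters from an arbitrary-length amplitude array. */
function renderBars(amplitudes: readonly number[], count: number): string {
  return resample(amplitudes, count).map(amplitudeToBlock).join("");
}
```

With these helpers, an all-zero input renders only `▁` and an all-one input renders only `█`, which matches the two edge cases named in the test plan.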
Related