feat(voice): add animated waveform visualizer for voice mode state feedback #21109

@ayush31010

Description

What would you like to be added?

PR #20779 introduced the voice mode architecture skeleton and PR #20923 added the Live API proof-of-concept. The GSoC Project 11 problem statement explicitly lists:

Visual feedback: animated waveform visualizer showing listening/speaking/processing states

No issue or PR currently tracks this component. This issue proposes implementing it as a standalone, testable Ink/React component that can be wired into the voice mode UI independently of the audio pipeline.


Proposed feature

Add an <AudioWaveform> Ink component that renders a real-time ASCII/block-character waveform and switches visual styles based on the current voice session state.

States to represent

| State | Description | Example render |
| --- | --- | --- |
| idle | Voice mode inactive | (hidden or static flat line) |
| listening | Mic open, capturing user speech | Animated bars reacting to mic amplitude |
| processing | Audio sent, waiting for model | Pulsing spinner or static compressed bars |
| speaking | Model audio playing back | Animated bars reacting to playback amplitude |
| error | Session error | Static red indicator |
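
To make the state list concrete, here is a minimal sketch of how each state could map to rendering behavior. The `StateStyle` shape, the `STATE_STYLES` table, and the `styleFor` helper are hypothetical names used for illustration only. Apart from the red error indicator, which comes from the list above, no colors are assumed, and treating `processing` as animated is one of the two options the list allows.

```typescript
type VoiceState = 'idle' | 'listening' | 'processing' | 'speaking' | 'error';

// Hypothetical per-state rendering hints; not part of the proposed API.
interface StateStyle {
  /** Whether the render changes from frame to frame. */
  animated: boolean;
  /** Which signal drives bar heights, if any. */
  amplitudeSource: 'mic' | 'playback' | 'none';
  /** Only the error color is specified in the issue. */
  color?: string;
}

const STATE_STYLES: Record<VoiceState, StateStyle> = {
  idle: { animated: false, amplitudeSource: 'none' },
  listening: { animated: true, amplitudeSource: 'mic' },
  // Assumption: the pulsing variant is chosen over static compressed bars.
  processing: { animated: true, amplitudeSource: 'none' },
  speaking: { animated: true, amplitudeSource: 'playback' },
  error: { animated: false, amplitudeSource: 'none', color: 'red' },
};

function styleFor(state: VoiceState): StateStyle {
  return STATE_STYLES[state];
}
```

Typing the table as `Record<VoiceState, StateStyle>` means the compiler rejects it if a state is later added to the union without a matching entry.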

Component API (proposed)

// packages/cli/src/ui/components/AudioWaveform.tsx

export type VoiceState = 'idle' | 'listening' | 'processing' | 'speaking' | 'error';

interface AudioWaveformProps {
  state: VoiceState;
  /** Amplitude samples in [0, 1]; the array length determines the bar count. Default: 20 bars. */
  amplitudes?: number[];
  /** Width in terminal columns. Default: 40 */
  width?: number;
}
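
As a rough illustration of how `amplitudes` and `width` might combine, the sketch below turns samples into a single line of block characters. The helper names `resample` and `renderBars` are hypothetical. The choice to stretch each bar across `width / amplitudes.length` columns by nearest-neighbour resampling is an assumption, since the proposal does not say how the two props interact. The sketch is plain TypeScript with no Ink dependency, so the resulting string could be passed to whatever text element the component ends up using.

```typescript
// Eight block characters from lowest to highest bar (Unicode Block Elements).
const BLOCKS = ['▁', '▂', '▃', '▄', '▅', '▆', '▇', '█'];

// Nearest-neighbour resampling: stretch or shrink `samples` to `count` entries.
function resample(samples: number[], count: number): number[] {
  if (count <= 0) return [];
  if (samples.length === 0) return new Array<number>(count).fill(0);
  const out: number[] = [];
  for (let i = 0; i < count; i++) {
    const idx = Math.min(
      samples.length - 1,
      Math.floor((i * samples.length) / count),
    );
    out.push(samples[idx]);
  }
  return out;
}

// Render one line of bars, one character per terminal column.
function renderBars(amplitudes: number[], width = 40): string {
  return resample(amplitudes, width)
    .map((a) => {
      // Clamp to [0, 1]; treat NaN or infinite values as silence.
      const clamped = Math.min(1, Math.max(0, Number.isFinite(a) ? a : 0));
      const level = Math.min(
        BLOCKS.length - 1,
        Math.floor(clamped * BLOCKS.length),
      );
      return BLOCKS[level];
    })
    .join('');
}
```

With the proposed defaults (20 samples, 40 columns), each sample occupies two adjacent columns. An empty `amplitudes` array yields a flat line of the lowest block, which could double as the static idle render.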

Why is this needed?

  • No blocking dependencies: the component is pure UI and can be built and reviewed before the audio streaming pipeline is merged

  • Testable in isolation: ink-testing-library renders Ink components in a test environment without a real terminal

  • High visibility: it is the first thing users see after typing /voice, so it gives a clear signal that the session is alive

Additional context

#20456 — RFC: Hands-Free Multimodal Voice Mode (architecture discussion)
#20779 — voice mode architecture skeleton (PR)
#20923 — voice mode Live API proof-of-concept (PR)
#20985 / #20989 — speech-friendly response formatter

Labels

area/core: Issues related to User Interface, OS Support, Core Functionality
🔒 maintainer only: ⛔ Do not contribute. Internal roadmap item.