What would you like to be added?
PR #20779 introduced the voice mode architecture skeleton and PR #20923 added the Live API proof-of-concept. The GSoC Project 11 problem statement explicitly lists:
Visual feedback: animated waveform visualizer showing listening/speaking/processing states
No issue or PR currently tracks this component. This issue proposes implementing it as a standalone, testable Ink/React component that can be wired into the voice mode UI independently of the audio pipeline.
Proposed feature
Add an `<AudioWaveform>` Ink component that renders a real-time ASCII/block-character waveform and switches visual styles based on the current voice session state.
States to represent
| State | Description | Example render |
| --- | --- | --- |
| `idle` | Voice mode inactive | Hidden or static flat line |
| `listening` | Mic open, capturing user speech | Animated bars reacting to mic amplitude |
| `processing` | Audio sent, waiting for model | Pulsing spinner or static compressed bars |
| `speaking` | Model audio playing back | Animated bars reacting to playback amplitude |
| `error` | Session error | Static red indicator |
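The state table above could translate into a small, pure lookup that the component consults when picking colors and animation behavior. A minimal sketch, assuming Ink's standard `<Text>` color names; `styleForState` and `StateStyle` are illustrative names, not part of the proposed API:

```typescript
export type VoiceState = 'idle' | 'listening' | 'processing' | 'speaking' | 'error';

// Hypothetical display attributes the component could derive from its state.
export interface StateStyle {
  color: string;     // Ink <Text> color name
  animated: boolean; // whether the bars should update each frame
}

export function styleForState(state: VoiceState): StateStyle {
  switch (state) {
    case 'listening':
      return { color: 'green', animated: true };
    case 'speaking':
      return { color: 'cyan', animated: true };
    case 'processing':
      return { color: 'yellow', animated: true };
    case 'error':
      // Static red indicator, matching the table above.
      return { color: 'red', animated: false };
    case 'idle':
    default:
      return { color: 'gray', animated: false };
  }
}
```

Keeping this mapping pure (no hooks, no timers) means it can be unit-tested without rendering anything.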
Component API (proposed)
```ts
// packages/cli/src/ui/components/AudioWaveform.tsx
export type VoiceState = 'idle' | 'listening' | 'processing' | 'speaking' | 'error';

interface AudioWaveformProps {
  state: VoiceState;
  /** Amplitude samples in [0, 1]; length determines bar count. Default: 20 bars */
  amplitudes?: number[];
  /** Width in terminal columns. Default: 40 */
  width?: number;
}
```
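The rendering core behind this API could be a pure function that maps the `amplitudes` samples onto Unicode block characters, which the Ink component then wraps in a `<Text>` element. A sketch under that assumption; `renderBars` and `BLOCKS` are hypothetical names, not existing code:

```typescript
// Block characters from empty to full, giving 9 amplitude levels per column.
const BLOCKS = [' ', '▁', '▂', '▃', '▄', '▅', '▆', '▇', '█'];

// Hypothetical rendering core: maps samples in [0, 1] to one text row of bars.
export function renderBars(amplitudes: number[], width = 40): string {
  const bars = amplitudes.map((a) => {
    // Clamp out-of-range samples so bad input cannot index past BLOCKS.
    const clamped = Math.min(1, Math.max(0, a));
    return BLOCKS[Math.round(clamped * (BLOCKS.length - 1))];
  });
  // Truncate to the requested width, padding any shortfall with a flat line.
  return bars.slice(0, width).join('').padEnd(width, '▁');
}
```

Because the function is pure, it can be exercised directly in unit tests, while the component itself only needs snapshot-style checks of the assembled frame.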
Why is this needed?
- **No blocking dependencies**: the component is pure UI and can be built and reviewed before the audio streaming pipeline is merged.
- **Testable in isolation**: `ink-testing-library` renders Ink components in a test environment without a real terminal.
- **High visibility**: it is the first thing users see when `/voice` is typed, a good UX signal that the session is alive.
Additional context
- #20456 — RFC: Hands-Free Multimodal Voice Mode (architecture discussion)
- #20779 — voice mode architecture skeleton (PR)
- #20923 — voice mode Live API proof-of-concept (PR)
- #20985 / #20989 — speech-friendly response formatter