i2s_audio speaker: Add idle_mode: silence to keep I2S bus active and prevent pop/click on audio start #3604
nullable-eth asked this question in Component enhancements
Component name
i2s_audio
Link to component documentation on our website
https://esphome.io/components/i2s_audio/
Describe the enhancement
Feature Request: I2S Speaker idle_mode / Keep-Alive to Prevent Pop on Audio Start
The Problem
Every time the i2s_audio speaker component starts playing audio, an audible pop/click is heard through the speaker. This happens because the I2S peripheral is fully torn down when idle and reinitialized on every audio event. The sudden transition from no clocks to active clocks causes a transient on the DAC output that produces the pop.
This affects any board using an external codec + amplifier (ES8311 + NS4150B in my case, but also reported with UDA1334 and on the ESP32-S3-BOX-3). Related: esphome/issues#5739, esphome/issues#6676.
Hardware
Diagnostic Evidence
1. The pop correlates exactly with I2S start/stop
Log output shows the pop occurs at the moment of i2s_audio.speaker: Starting.
2. The codec is NOT being powered down
I added an interval timer that reads back ES8311 registers every 5 seconds — during idle, during playback, and after playback stops. The registers never change:
- 0x0D = power management (analog circuits, bias, DAC ref all enabled)
- 0x00 = chip state machine (CSM_ON=1, running)
- 0x37 = DAC ramp rate (fade in/out enabled, EQ bypassed)
The codec stays fully powered the entire time. The pop is not a codec power-down issue.
3. The I2S bus idle timeout is ~4 seconds
Playing two tones back-to-back with varying gaps confirmed the speaker driver keeps the I2S bus alive for approximately 4 seconds after audio stops. Tones played within 4 seconds of each other produce no pop on the second tone. After 5+ seconds, the I2S bus has been torn down and the next audio event pops.
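The timing observation above can be written down as a tiny host-side model. This is only a sketch for reasoning about the measurements, not ESPHome code: the function name start_will_pop and its parameters are made up for illustration, and the behavior at exactly the timeout boundary is an assumption, since the tests only distinguished gaps under 4 seconds from gaps of 5 seconds or more.

```cpp
#include <chrono>

// Hypothetical model (not part of ESPHome): predicts whether the next audio
// start follows a bus teardown and is therefore expected to pop.
// idle_timeout is how long the driver keeps the I2S bus alive after audio stops.
bool start_will_pop(std::chrono::milliseconds gap_since_last_audio,
                    std::chrono::milliseconds idle_timeout) {
  // Assumption: the bus is torn down once the gap reaches the timeout, so the
  // next start has to re-enable the clocks, which produces the transient.
  return gap_since_last_audio >= idle_timeout;
}
```

With the roughly 4 second timeout measured here, a 3 second gap is predicted to be silent and a 5 second gap is predicted to pop, which matches what was heard.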
4. What was ruled out
- timeout: 0s on speaker platform
Root Cause
The i2s_audio speaker component calls i2s_channel_enable() / i2s_channel_disable() (or the legacy equivalents) each time audio starts and stops. When the I2S peripheral enables, it begins driving BCLK, LRCK, and MCLK. The ES8311 (and other codecs) see this sudden clock transition and the DAC output jumps from its idle DC level to its active level, producing the pop through the amplifier.
The Espressif factory firmware on the ESP32-S3-BOX-3 does not exhibit this pop because ESP-ADF keeps the I2S pipeline alive and uses proper mute sequencing around start/stop.
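To make the contrast concrete, here is a minimal host-side sketch that counts clock start transitions under the two strategies. Everything in it is hypothetical: MockI2sChannel, IdlePolicy, and play_clips are illustration names, not ESPHome or ESP-IDF APIs, and the mock only mimics the enable/disable/write shape of a real I2S channel. It assumes every idle gap is long enough for the teardown path to run.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an I2S TX channel. It only counts how often the
// clocks go from stopped to running, the event blamed for the pop.
class MockI2sChannel {
 public:
  void enable() {
    if (!running_) {
      running_ = true;
      ++clock_starts_;
    }
  }
  void disable() { running_ = false; }
  void write(const std::vector<std::int16_t>& samples) {
    if (running_) samples_written_ += samples.size();
  }
  int clock_starts() const { return clock_starts_; }
  std::size_t samples_written() const { return samples_written_; }

 private:
  bool running_ = false;
  int clock_starts_ = 0;
  std::size_t samples_written_ = 0;
};

enum class IdlePolicy { kTearDown, kKeepAliveSilence };

// Plays `clips` short clips separated by long idle gaps and returns how many
// times the clocks were started from a stopped state.
int play_clips(int clips, IdlePolicy policy) {
  MockI2sChannel channel;
  const std::vector<std::int16_t> tone(256, 1000);
  const std::vector<std::int16_t> silence(256, 0);

  // Keep-alive: enable once up front and never disable.
  if (policy == IdlePolicy::kKeepAliveSilence) channel.enable();

  for (int i = 0; i < clips; ++i) {
    if (policy == IdlePolicy::kTearDown) channel.enable();  // clocks restart
    channel.write(tone);
    if (policy == IdlePolicy::kTearDown) {
      channel.disable();       // idle: clocks stop
    } else {
      channel.write(silence);  // idle: clocks keep running, zeros go out
    }
  }
  return channel.clock_starts();
}
```

In this model, three clips under the tear-down policy give three clock starts (three chances to pop), while the keep-alive policy gives a single start at boot regardless of how many clips follow.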
Proposed Solution
Add a configuration option to the i2s_audio speaker (and/or audio_dac platform) to keep the I2S peripheral running when idle, outputting digital silence instead of tearing down the I2S driver.
Something like:
Or alternatively, a timeout parameter that controls how long the I2S bus stays active after audio stops (with 0s or never meaning indefinitely):
Implementation Notes
- When idle_mode: silence is set, instead of calling i2s_channel_disable() after audio finishes, continue writing zero-filled buffers to the I2S DMA
- The I2S channel is initialized in setup() and never torn down
- A timeout: <duration> option could offer a middle ground: keep clocks running for N seconds after audio stops, then tear down (accepting the pop if audio restarts after the timeout)
Current YAML Config (for reference)
Use cases
Any voice assistant satellite using an external codec + Class D amp (ES8311, ES7210, UDA1334, etc.) gets an audible pop/click every time audio starts — on wake word acknowledgment, TTS responses, media playback, and error sounds. For a voice assistant that triggers dozens of times a day, this makes the device sound broken and cheap despite quality hardware. The pop is especially jarring on the first interaction after idle since the I2S bus has been fully torn down. Keeping the I2S clocks running with digital silence eliminates this entirely, matching the behavior of Espressif's own factory firmware on the ESP32-S3-BOX-3 which does not exhibit the pop.
Anything else?
No response