Proactive silence tracking for stalled CATap/mic streams #8

@tenequm

Summary

When macOS stalls output device routing mid-call (e.g. a headphone unplug coinciding with call end), the CATap IO proc stops delivering samples for many seconds. During the stall the system track's nextExpected stays stuck at the last real sample, while the mic track keeps flowing. At stop(), padTailIfNeeded must fill the entire stall as tail silence in one burst; AVAssetWriter back-pressure trips the D8 "break immediately" behavior mid-fill, and the silence that was not yet written is lost. The tracks end up desynced by roughly the unwritten part of the stall (about 6s in the recording below).

Evidence

Recording 2026-04-17-172342-6C98 (v0.7.0 CATap migration build, refactor/catap-migration).

Log excerpt:

16:27:33 output device changed, rebuilding aggregate device
16:27:33 aggregate device rebuilt after output device change (new output: BuiltInSpeakerDevice)
16:27:33 call state changed: activeCallers=0, running=false  (Chrome call ended same second)
16:27:37 mic capture recovered, filled 3.978s gap
16:27:37 drift: sys_age=4134.3ms mic_age=196.1ms mic-sys=+3938.2ms
16:27:42 drift: sys_age=9136.3ms  (no system samples for 9s)
16:27:47 drift: sys_age=14133.6ms  (14s)
16:27:51 stop() for Google Chrome
16:27:51 silence fill interrupted by back-pressure at 537/821 chunks
16:27:51 ERROR system: tail padding short by 290834 samples (6.059s)

Final file: system track 242.42s, mic track 248.30s, delta 5.88s.
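
The figures in the log can be cross-checked with a little arithmetic. The sketch below assumes a 48 kHz sample rate and a 1024-sample chunk; neither is stated in this issue, but both are what the logged numbers imply.

```swift
// Cross-check of the numbers in the log excerpt above.
// Assumption: 48 kHz sample rate (inferred from the log, not stated in it).
let sampleRate = 48_000.0

// "tail padding short by 290834 samples (6.059s)"
let shortfallSamples = 290_834.0
let shortfallSeconds = shortfallSamples / sampleRate

// "silence fill interrupted by back-pressure at 537/821 chunks"
let unwrittenChunks = 821 - 537
let samplesPerChunk = shortfallSamples / Double(unwrittenChunks)  // close to 1024

// "system track 242.42s, mic track 248.30s"
let trackDelta = 248.30 - 242.42

print(shortfallSeconds, unwrittenChunks, samplesPerChunk, trackDelta)
```

The tail-padding shortfall and the final track delta agree to within about 0.2s, which is consistent with the lost silence being the source of the desync.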

Root cause (suspected)

macOS output routing stall (not our bug) combined with reactive-only gap fill: the pipeline advances nextExpected only when a real sample arrives with a later PTS. If no sample ever arrives, the gap between nextExpected and the current time is discovered only at stop() finalize, which is too late for the writer to absorb 14+ seconds of silence in one burst.
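
A simplified model of the reactive-only behavior, to make the failure mode concrete. This is not the project's code: `ReactiveTrack`, `append`, and `tailGap` are illustrative names, timestamps are plain sample counts, and the 48 kHz rate is an assumption.

```swift
// Toy model of reactive-only gap fill (illustrative, not the real pipeline).
struct ReactiveTrack {
    /// Next presentation timestamp (in samples) the track expects to write.
    var nextExpected = 0
    /// Total silence (in samples) inserted by gap fill so far.
    var silenceWritten = 0

    /// Runs only when a real buffer arrives: drops stale buffers, fills any
    /// gap before the buffer with silence, then advances the timeline.
    mutating func append(pts: Int, frameCount: Int) {
        if pts < nextExpected { return }        // stale sample: drop
        silenceWritten += pts - nextExpected    // reactive gap fill
        nextExpected = pts + frameCount
    }

    /// Silence that stop() would have to pad in a single burst.
    func tailGap(endOfRecording: Int) -> Int {
        return max(0, endOfRecording - nextExpected)
    }
}

let sampleRate = 48_000                          // assumed
var system = ReactiveTrack()
system.append(pts: 0, frameCount: 1_024)         // last real buffer before the stall
// 14 s pass with no IO proc callbacks, so append(pts:frameCount:) never runs
// and nextExpected never moves.
let gap = system.tailGap(endOfRecording: 14 * sampleRate)
print("tail gap at stop: \(gap) samples")
```

Because nothing calls `append` during the stall, the whole gap lands on the tail-padding path at stop.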

Proposed fix (v0.7.1)

Proactive silence tracking during live capture:

  1. Heartbeat task on audioQueue (Task.sleep based), cadence ~500ms, running while recording is active.
  2. Each tick: compute now - lastSystemHostTime and now - lastMicHostTime (state already exists in AudioRecorder).
  3. If either exceeds a threshold (~1s), call a new RecordingPipeline.fillSilenceUpTo(track:, hostTime:) that appends silence up to now - grace and advances that track's nextExpected.
  4. When real samples resume, existing stale-sample drop logic (PTS < nextExpected) handles the race cleanly.
  5. padTailIfNeeded at stop becomes nearly a no-op because the timeline is already current.
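
The steps above could look roughly like the sketch below. It is a sketch under stated assumptions, not the planned implementation: only `fillSilenceUpTo` and `nextExpected` come from this issue, while `TrackTimeline`, `appendReal`, `heartbeatTick`, the 48 kHz rate, and the 250ms grace value are hypothetical. Host times are plain seconds, and the tick is a synchronous function so it can be driven by a fake clock; in the app it would presumably be called from the Task.sleep loop on audioQueue.

```swift
// Hypothetical model of proactive silence tracking for one track.
struct TrackTimeline {
    let sampleRate: Double
    let anchorHostTime: Double        // host time (s) that maps to sample 0
    var nextExpected = 0              // timeline position, in samples
    var lastRealHostTime: Double      // host time (s) of the last real buffer

    init(sampleRate: Double, anchorHostTime: Double) {
        self.sampleRate = sampleRate
        self.anchorHostTime = anchorHostTime
        self.lastRealHostTime = anchorHostTime
    }

    /// A real buffer arrived. Returns false if it was dropped as stale.
    /// (Reactive fill of a gap before `pts` is omitted from this sketch.)
    @discardableResult
    mutating func appendReal(pts: Int, frameCount: Int, hostTime: Double) -> Bool {
        lastRealHostTime = hostTime
        guard pts >= nextExpected else { return false }
        nextExpected = pts + frameCount
        return true
    }

    /// Advances the timeline with silence up to `hostTime`.
    /// Returns the number of silent samples added.
    @discardableResult
    mutating func fillSilenceUpTo(hostTime: Double) -> Int {
        let target = Int((hostTime - anchorHostTime) * sampleRate)
        let added = max(0, target - nextExpected)
        nextExpected += added
        return added
    }
}

/// One heartbeat tick: fill only if the track has been quiet for longer than
/// the stall threshold, and stop `grace` seconds short of now.
@discardableResult
func heartbeatTick(_ track: inout TrackTimeline, now: Double,
                   stallThreshold: Double = 1.0, grace: Double = 0.25) -> Int {
    guard now - track.lastRealHostTime > stallThreshold else { return 0 }
    return track.fillSilenceUpTo(hostTime: now - grace)
}

// Simulate a 14 s stall with a 500 ms heartbeat.
var system = TrackTimeline(sampleRate: 48_000, anchorHostTime: 0)
system.appendReal(pts: 0, frameCount: 1_024, hostTime: 0)
for tick in 1...28 {
    heartbeatTick(&system, now: Double(tick) * 0.5)
}
let tailAtStop = 14 * 48_000 - system.nextExpected   // left for padTailIfNeeded

// Samples resume: one that overlaps the filled silence, one that does not.
let staleAccepted = system.appendReal(pts: 650_000, frameCount: 1_024, hostTime: 14.0)
let freshAccepted = system.appendReal(pts: 660_000, frameCount: 1_024, hostTime: 14.0)
print(tailAtStop, staleAccepted, freshAccepted)
```

In this model the tail left at stop equals the grace period, so the grace value and the heartbeat cadence together bound how much silence padTailIfNeeded still has to write.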

Scope and constraints

  • Correctness-critical state: interacts with session start (D8 .waitForAllTracks), leading-silence fill, tail padding, stale-sample drop.
  • Plan document first. Write rationale for deviating from reactive-only fill; agree on heartbeat cadence, stall threshold, grace period, interaction with session anchoring, before implementation.
  • Target: unit test for fillSilenceUpTo, integration test simulating system-track stall, assert tail residual < 100ms.
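
A rough shape for the stall simulation, using a toy writer in place of AVAssetWriter. Everything here is an assumption made for illustration: `ToyWriter`, `lostSeconds`, the 48 kHz rate, the 1024-sample chunk, the 250ms grace, and especially the idea that back-pressure behaves like a fixed per-burst capacity of 537 chunks (a figure borrowed from the log line above). The real writer signals readiness through AVAssetWriterInput.isReadyForMoreMediaData, which is not that simple.

```swift
// Toy back-pressure simulation (illustrative, not the real writer).
let sampleRate = 48_000.0   // assumed
let chunkSize = 1_024       // assumed, samples per silence chunk

/// Accepts at most `burstCapacity` chunks per call and refuses the rest,
/// mimicking a fill loop that breaks as soon as back-pressure appears.
struct ToyWriter {
    let burstCapacity: Int
    func accept(chunks: Int) -> Int {
        return min(max(0, chunks), burstCapacity)
    }
}

/// Seconds of silence lost for a stall of `stallChunks` chunks.
func lostSeconds(stallChunks: Int, proactive: Bool) -> Double {
    let writer = ToyWriter(burstCapacity: 537)
    var filled = 0
    if proactive {
        // Heartbeat every 500 ms; 1 s stall threshold; 250 ms grace.
        let stallSeconds = Double(stallChunks * chunkSize) / sampleRate
        var now = 0.5
        while now < stallSeconds {
            if now > 1.0 {
                let target = Int((now - 0.25) * sampleRate) / chunkSize
                filled += writer.accept(chunks: target - filled)
            }
            now += 0.5
        }
    }
    // Tail padding at stop(): one burst for whatever is still missing.
    filled += writer.accept(chunks: stallChunks - filled)
    return Double((stallChunks - filled) * chunkSize) / sampleRate
}

let reactiveLoss = lostSeconds(stallChunks: 821, proactive: false)
let proactiveLoss = lostSeconds(stallChunks: 821, proactive: true)
print(reactiveLoss, proactiveLoss)
```

An integration test along these lines would assert that the proactive residual stays under the 100ms target, while the reactive run reproduces a loss of about 6s.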

Out of scope

  • The macOS output routing stall itself (rare, not our bug).
  • Changing D8 semantics for leading silence during live capture.

Target release: v0.7.1

Labels: bug
