Add voice calling support#1932

Open
visigoth wants to merge 5 commits into AsamK:master from visigoth:signal-call-tunnel

Conversation

@visigoth

Voice Call Support via RingRTC Tunnel

Adds 1:1 voice calling to signal-cli using a Rust subprocess (signal-call-tunnel) that wraps Signal's RingRTC library for ICE negotiation, SRTP encryption, and media transport.

Architecture

Calls are handled by spawning a short-lived Rust process per call that manages the WebRTC stack. The Java side handles Signal protocol signaling (offers, answers, ICE candidates) and relays them to the tunnel over a Unix domain socket control channel. Audio flows through platform virtual audio devices (PulseAudio null sinks on Linux, BlackHole drivers on macOS), so any external process can send/receive call audio using standard audio APIs.

                    commands: startCall, acceptCall, ...
    Client  ──────────────────────────────►  signal-cli (Java)
   process  ◄──────────────────────────────     │
      │         callEvent notifications         │
      │                                    spawn per call
      │                                         │
      │                                         │ ctrl.sock
      │                                         │ (signaling relay)
      │                                         │
      │                              signal-call-tunnel (Rust/RingRTC)  ◄──Signal WebRTC──►  Remote phone
      │                                         ▲
      │                                         │ audio I/O
      │                                         ▼
      │   ┌───────────────────────────────────────────────────────────┐
      └──►│      virtual audio devices (PulseAudio / BlackHole)       │
          └───────────────────────────────────────────────────────────┘

What's included

signal-call-tunnel (Rust binary)

  • Wraps RingRTC's native API for call state management, ICE, and SRTP
  • Authenticated JSON-lines control socket for signaling relay
  • Virtual audio device integration via VirtualAudioDevicePair
  • Supports --host-audio flag for direct system audio or socket audio mode

Call signaling state machine (Java)

  • CallManager orchestrates call lifecycle: subprocess spawning, control socket communication, TURN/STUN server provisioning, ring timeouts (60s), and cleanup. TURN/STUN credentials are fetched from Signal's calling API on the Java side and passed to the tunnel via the proceed control message, where RingRTC uses them for ICE negotiation.
  • CallSignalingHelper for x25519 key exchange and HKDF-based SRTP key derivation
  • Call message routing wired into IncomingMessageHandler for offers, answers, ICE candidates, hangups, and busy signals
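As a rough illustration of the key exchange and derivation step mentioned above, the following sketch runs an X25519 agreement with the JDK's built-in provider and feeds the shared secret through HKDF (RFC 5869) built on HMAC-SHA256. It is not the PR's CallSignalingHelper: the class name, the empty salt, the info label, and the 32-byte output length are all assumptions chosen for the example.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.util.Arrays;
import javax.crypto.KeyAgreement;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical sketch; names and parameters are illustrative, not taken from the PR.
final class CallKeySketch {

    /** Generates an ephemeral X25519 key pair (available since JDK 11). */
    static KeyPair generateKeyPair() throws GeneralSecurityException {
        return KeyPairGenerator.getInstance("X25519").generateKeyPair();
    }

    /** Runs X25519 Diffie-Hellman and returns the raw shared secret. */
    static byte[] sharedSecret(PrivateKey ours, PublicKey theirs) throws GeneralSecurityException {
        var agreement = KeyAgreement.getInstance("X25519");
        agreement.init(ours);
        agreement.doPhase(theirs, true);
        return agreement.generateSecret();
    }

    /** HKDF with HMAC-SHA256: extract a PRK, then expand it to `length` bytes (length <= 255 * 32). */
    static byte[] hkdf(byte[] ikm, byte[] salt, byte[] info, int length) throws GeneralSecurityException {
        var mac = Mac.getInstance("HmacSHA256");
        // Extract: PRK = HMAC(salt, IKM); an empty salt is replaced by 32 zero bytes.
        mac.init(new SecretKeySpec(salt.length == 0 ? new byte[32] : salt, "HmacSHA256"));
        byte[] prk = mac.doFinal(ikm);
        // Expand: T(i) = HMAC(PRK, T(i-1) || info || i), concatenated until `length` bytes exist.
        mac.init(new SecretKeySpec(prk, "HmacSHA256"));
        byte[] okm = new byte[length];
        byte[] previous = new byte[0];
        int written = 0;
        for (int counter = 1; written < length; counter++) {
            mac.update(previous);
            mac.update(info);
            mac.update((byte) counter);
            previous = mac.doFinal();
            int n = Math.min(previous.length, length - written);
            System.arraycopy(previous, 0, okm, written, n);
            written += n;
        }
        return okm;
    }

    /** Derives a key on both sides of the exchange and reports whether the results match. */
    static boolean bothSidesAgree() {
        try {
            KeyPair alice = generateKeyPair();
            KeyPair bob = generateKeyPair();
            byte[] info = "example-call-key".getBytes(StandardCharsets.UTF_8); // assumed label
            byte[] a = hkdf(sharedSecret(alice.getPrivate(), bob.getPublic()), new byte[0], info, 32);
            byte[] b = hkdf(sharedSecret(bob.getPrivate(), alice.getPublic()), new byte[0], info, 32);
            return a.length == 32 && Arrays.equals(a, b);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Each side only ever sends its public key; the derived bytes match because both sides compute the same shared secret and use the same salt and info inputs.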

JSON-RPC daemon commands

  • startCall, acceptCall, hangupCall, rejectCall, listCalls
  • Real-time callEvent notifications pushed to connected clients on every state transition (RINGING_INCOMING, RINGING_OUTGOING, CONNECTING, CONNECTED, RECONNECTING, ENDED)

E2E test harness

  • Automated test framework using an Android emulator running Signal
  • Five scenarios: outgoing call lifecycle, incoming call lifecycle, incoming rejection, ring timeout, and bidirectional audio verification
  • Audio verification via Goertzel frequency detection through the emulator's gRPC audio HAL
  • Shell orchestrator for build, daemon lifecycle, and cleanup; Python test runner with fail-fast and screen recording support
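For readers unfamiliar with Goertzel detection, here is a minimal sketch of the algorithm. The harness itself is described as a Python test runner, so this Java version is purely illustrative; the class name, the 8 kHz sample rate, and the 1 kHz test tone are assumptions for the example.

```java
// Hypothetical sketch of Goertzel tone detection; not the harness's own code.
final class GoertzelSketch {

    /** Returns the squared DFT magnitude of the bin nearest to `targetHz`. */
    static double power(double[] samples, double sampleRate, double targetHz) {
        int n = samples.length;
        long k = Math.round(n * targetHz / sampleRate); // nearest DFT bin
        double coeff = 2.0 * Math.cos(2.0 * Math.PI * k / n);
        double s1 = 0.0;
        double s2 = 0.0;
        // Second-order recurrence: s[i] = x[i] + coeff * s[i-1] - s[i-2]
        for (double x : samples) {
            double s0 = x + coeff * s1 - s2;
            s2 = s1;
            s1 = s0;
        }
        return s1 * s1 + s2 * s2 - coeff * s1 * s2;
    }

    /** Generates `n` samples of a unit-amplitude sine wave. */
    static double[] sine(double hz, double sampleRate, int n) {
        double[] out = new double[n];
        for (int i = 0; i < n; i++) {
            out[i] = Math.sin(2.0 * Math.PI * hz * i / sampleRate);
        }
        return out;
    }
}
```

A test can then compare the power at the expected tone frequency against the power at an unrelated frequency: a large ratio suggests the tone made it through the call, without needing a full FFT.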

Documentation

  • docs/CALL_TUNNEL.md: architecture, control protocol reference, call flows, virtual audio setup, encryption details, and implementation notes
  • docs/TEST_HARNESS.md: test harness architecture, component descriptions, design decisions, and usage
  • voice-test/README.md: describes the tests being performed as well as test harness prerequisites and usage

RingRTC build-time patch (macOS)

On macOS, RingRTC's cubeb audio streams default to StreamPrefs::VOICE, which activates CoreAudio's VoiceProcessingIO (VPIO) aggregate device for echo cancellation and noise suppression. VPIO hangs when the input or output device is a BlackHole virtual driver because it tries to create an aggregate device that BlackHole doesn't support.

The patch (signal-call-tunnel/patches/ringrtc-disable-vpio.patch) adds a stream_prefs() helper to RingRTC's audio_device_module.rs that checks the RINGRTC_NO_VOICE_PROCESSING environment variable. When set, both the playout and recording streams use StreamPrefs::NONE instead, bypassing VPIO entirely. Voice processing is unnecessary for virtual audio anyway. The tunnel binary sets this variable at startup before any audio subsystem initialization.

The patch is applied automatically during cargo build via signal-call-tunnel/build.rs, which runs git apply against the third-party/ringrtc submodule. It is idempotent: build.rs checks whether the marker (RINGRTC_NO_VOICE_PROCESSING) is already present before applying, and cargo re-runs the script when either the patch file or the target source file changes.

Platforms

  • Linux: PulseAudio virtual devices created/destroyed automatically per call
  • macOS: Requires one-time BlackHole driver installation (sudo bin/virtual_audio.sh --setup)

@visigoth
Author

@AsamK any interest in reviewing this PR?

@AsamK
Owner

AsamK commented Feb 28, 2026

I'm interested, but it's a large PR and I currently don't have the capacity to review it fully.
Could you maybe split it up and put the Python/Rust parts into a separate repo? That way I only need to review, and then maintain, the parts that affect signal-cli directly.

@visigoth
Author

visigoth commented Mar 2, 2026

I had a feeling that's where you'd want to go with this. I don't mind, but it will take me some time.

But really, only the first commit goes away; the others all affect signal-cli. The call tunnel documentation can be split up to document each half separately, so there's that at least.

@visigoth visigoth force-pushed the signal-call-tunnel branch from df69b02 to ee99967 Compare March 5, 2026 22:34

Define call method interfaces in Manager, create API records (CallInfo,
CallOffer, TurnServer), and hand-coded protobuf parsers for RingRTC
signaling messages (ConnectionParametersV4, RtpDataMessage).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@visigoth visigoth force-pushed the signal-call-tunnel branch from ee99967 to 962cb6f Compare March 5, 2026 22:44

visigoth and others added 4 commits March 5, 2026 16:16

Add CallSignalingHelper for x25519 key generation and HKDF-based SRTP
key derivation. Add CallManager for tracking active calls, spawning
call tunnel subprocesses, and handling call lifecycle (offer, answer,
ICE candidates, hangup, busy). Wire call message routing in
IncomingMessageHandler and implement Manager call methods in ManagerImpl.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Implement CallEventListener callback pattern that fires on every call
state transition (RINGING_INCOMING, RINGING_OUTGOING, CONNECTING,
CONNECTED, ENDED). The JSON-RPC layer auto-subscribes and pushes
callEvent notifications alongside receive notifications.

Changes:
- Manager.java: Add CallEventListener interface and methods
- ManagerImpl.java: Implement add/removeCallEventListener with cleanup
- DbusManagerImpl.java: Add stub implementation (not supported over DBus)
- JsonCallEvent.java: JSON notification record for call events
- SignalJsonRpcDispatcherHandler.java: Auto-subscribe call event listeners

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Add startCall, acceptCall, hangupCall, rejectCall, and listCalls
commands for the JSON-RPC daemon interface. Register commands and
update GraalVM metadata for native image support.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add documentation about the architecture, protocol, and implementation of
signal-call-tunnel, the secure tunnel subprocess for voice calling.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@visigoth visigoth force-pushed the signal-call-tunnel branch from 962cb6f to 9493381 Compare March 6, 2026 00:16

@visigoth
Author

visigoth commented Mar 6, 2026

@AsamK I have split out the tunnel implementation and test harness into a separate repository (visigoth/signal-call-tunnel). I also refactored the documentation so that docs/CALL_TUNNEL.md only covers the tunnel from signal-cli's perspective. The rest is reasonably straightforward to review.

Owner

@AsamK AsamK left a comment

Thanks, I've done a first rough review.
Please take a look.

Comment on lines +38 to +42
    final var callIdNumber = ns.get("call-id");
    if (callIdNumber == null) {
        throw new UserErrorException("No call ID given");
    }
    final long callId = ((Number) callIdNumber).longValue();
Owner

You can use instanceof to get rid of the Number cast:

Suggested change
-    final var callIdNumber = ns.get("call-id");
-    if (callIdNumber == null) {
-        throw new UserErrorException("No call ID given");
-    }
-    final long callId = ((Number) callIdNumber).longValue();
+    final var callIdParam = ns.get("call-id");
+    if (!(callIdParam instanceof Number callIdNumber)) {
+        throw new UserErrorException("No call ID given");
+    }
+    final var callId = callIdNumber.longValue();

    m.rejectCall(callId);
    switch (outputWriter) {
        case PlainTextWriter writer -> writer.println("Call {} rejected.", callId);
        case JsonWriter writer -> writer.write(new JsonResult(callId, "rejected"));
Owner

Normally signal-cli commands don't print or return a result for successful command executions, unless there is additional information.
For this command you can remove the output. If the command succeeds, the call was rejected.


import java.io.IOException;

public class HangupCallCommand implements JsonRpcLocalCommand {
Owner

Same comments as in RejectCallCommand:

  • You can use instanceof instead of casting.
  • Don't return an explicit response if there is no additional information.

    if (callIdNumber == null) {
        throw new UserErrorException("No call ID given");
    }
    final long callId = ((Number) callIdNumber).longValue();
Owner

Same comment: you can use instanceof instead of casting.

-    useJUnitPlatform()
+    useJUnitPlatform {
+        if (!project.hasProperty("includeIntegration")) {
+            excludeTags("integration")
Owner

Is this still needed? I don't see the integration tag anywhere else.


// Send createOutgoingCall + proceed via control channel
var peerIdStr = recipientAddress.toString();
sendControlMessage(state, "{\"type\":\"createOutgoingCall\",\"callId\":" + callIdJson(callId)
Owner

Please always use Jackson JSON serialization instead of manually concatenating and escaping strings, which is quite error-prone.
If you're doing it to serialize large callIds correctly, can you just use BigInteger for the callId everywhere?
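To make the BigInteger idea concrete, here is a small sketch of converting between a raw 64-bit pattern and an unsigned BigInteger. It assumes call IDs are unsigned 64-bit values, which the thread implies ("large callIds") but does not state; the class and method names are hypothetical. Jackson writes BigInteger values as plain JSON numbers, so a field of that type should avoid the manual formatting.

```java
import java.math.BigInteger;

// Hypothetical helper; assumes call IDs are unsigned 64-bit values.
final class CallIdSketch {

    /** Interprets the 64 bits of a Java long as an unsigned value. */
    static BigInteger toUnsigned(long raw) {
        return new BigInteger(Long.toUnsignedString(raw));
    }

    /** Converts back to the raw 64-bit pattern; rejects values outside 0..2^64-1. */
    static long toRaw(BigInteger callId) {
        if (callId.signum() < 0 || callId.bitLength() > 64) {
            throw new IllegalArgumentException("call ID out of unsigned 64-bit range: " + callId);
        }
        return callId.longValue(); // keeps the low-order 64 bits
    }
}
```

With this, an ID whose top bit is set prints as a large positive number instead of a negative long, which is the case where naive long serialization would go wrong.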

|                                     |
|-- spawn (config on stdin) --------->|
|                                     |
|<======= ctrl.sock (JSON) ==========>|
Owner

If I understand this correctly, the additional socket is actually unnecessary.
You can just communicate with signal-call-tunnel via stdin/stdout instead.
That way you can also get rid of the additional authentication token.
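A minimal sketch of what the stdin/stdout variant could look like on the Java side, assuming a newline-delimited JSON protocol. The class name is hypothetical, `cat` stands in for the tunnel binary so the example runs on a Unix-like system, and the "ping" message is invented for illustration. Because only the parent holds the pipe ends, no separate authentication token is needed.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of a JSON-lines channel over a child's stdin/stdout.
final class StdioChannelSketch implements AutoCloseable {
    private final Process process;
    private final BufferedWriter toChild;
    private final BufferedReader fromChild;

    StdioChannelSketch(String... command) throws IOException {
        process = new ProcessBuilder(command)
                .redirectError(ProcessBuilder.Redirect.INHERIT) // keep child logs off the protocol stream
                .start();
        toChild = new BufferedWriter(
                new OutputStreamWriter(process.getOutputStream(), StandardCharsets.UTF_8));
        fromChild = new BufferedReader(
                new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8));
    }

    /** Writes one message followed by a newline and flushes it to the child's stdin. */
    void send(String jsonLine) throws IOException {
        toChild.write(jsonLine);
        toChild.write('\n');
        toChild.flush();
    }

    /** Blocks until the child prints one line on stdout; returns null at end of stream. */
    String receive() throws IOException {
        return fromChild.readLine();
    }

    @Override
    public void close() throws IOException {
        toChild.close(); // end of input on stdin signals the child to exit
        process.destroy();
    }

    /** Round-trips one message through `cat`, which echoes its input unchanged. */
    static String echoOnce(String jsonLine) {
        try (var channel = new StdioChannelSketch("cat")) {
            channel.send(jsonLine);
            return channel.receive();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In real use the line passed to `send` would come from a JSON serializer rather than a string literal, and a dedicated reader thread would consume stdout so that events from the child are not missed while the parent is busy.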

}

// Check relative to the signal-cli installation
var installDir = System.getProperty("signal.cli.install.dir");
Owner

Does this property actually exist?

}
}

// --- Incoming call message handling ---
Owner

All handleIncoming* methods should check whether there are actually call listeners before doing anything.
If the callEventListeners list is empty, they should just do nothing (unless the signal-call-tunnel was already spawned).
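The guard described here might look roughly like the sketch below. Everything except the callEventListeners name and the RINGING_INCOMING and ENDED states is hypothetical: the listener signature, the set of active tunnels, and the boolean return values are stand-ins chosen to keep the example self-contained.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical sketch of the listener guard; not the PR's CallManager.
final class CallListenerGuardSketch {

    interface CallEventListener {
        void onCallEvent(long callId, String state);
    }

    private final List<CallEventListener> callEventListeners = new CopyOnWriteArrayList<>();
    private final Set<Long> activeTunnels = ConcurrentHashMap.newKeySet(); // calls with a spawned tunnel

    void addCallEventListener(CallEventListener listener) {
        callEventListeners.add(listener);
    }

    void removeCallEventListener(CallEventListener listener) {
        callEventListeners.remove(listener);
    }

    /** A call message is worth handling only if someone listens or a tunnel already exists. */
    private boolean shouldHandle(long callId) {
        return !callEventListeners.isEmpty() || activeTunnels.contains(callId);
    }

    /** Returns true if the offer was processed, false if it was ignored. */
    boolean handleIncomingOffer(long callId) {
        if (!shouldHandle(callId)) {
            return false;
        }
        activeTunnels.add(callId); // stands in for spawning the tunnel subprocess
        callEventListeners.forEach(l -> l.onCallEvent(callId, "RINGING_INCOMING"));
        return true;
    }

    /** Returns true if the hangup was processed, false if it was ignored. */
    boolean handleIncomingHangup(long callId) {
        if (!shouldHandle(callId)) {
            return false;
        }
        activeTunnels.remove(callId); // stands in for tearing the tunnel down
        callEventListeners.forEach(l -> l.onCallEvent(callId, "ENDED"));
        return true;
    }
}
```

The second condition matters for cleanup: if the last listener disconnects mid-call, a later hangup still reaches the handler because the tunnel for that call is on record.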

c.addOnManagerRemovedHandler(this::unsubscribeReceive);
}

for (var m : c.getManagers()) {
Owner

I think most signal-cli users will not be interested in call events, so they shouldn't be subscribed to by default.
Can you change this to only subscribe to call events when the user calls a subscribeCallEvents command? (like it's done with subscribeReceive/unsubscribeReceive below)
