feat: Add Bedrock Nova Sonic realtime provider implementing IRealtimeClient/IRealtimeClientSession #4373
tarekgh wants to merge 10 commits into aws:development
…eClientSession
Implements the MEAI IRealtimeClient and IRealtimeClientSession abstractions for AWS Bedrock Nova Sonic, enabling real-time bidirectional audio conversations with tool calling support.
Key features:
- Full bidirectional audio streaming via Nova Sonic protocol
- VAD-driven speech detection with trailing silence for end-of-speech
- Tool calling with inline invocation and priority queue for tool results
- Convenience constructor with proper _ownsRuntime disposal
- Thread-safe session state management with semaphore synchronization
Reliability:
- Priority queue bypasses queued audio for time-sensitive tool results
- BodyPublisher while(true) loop with proper shutdown (no spin loops)
- IAsyncEnumerator disposal on all exit paths
- SendToolResultInline captures fields to local variables for thread safety
Tests: 56 unit tests covering session lifecycle, audio streaming, tool calling, protocol events, error handling, and edge cases.
Pull request overview
Adds an Amazon Bedrock Nova Sonic realtime provider that implements the Microsoft.Extensions.AI realtime abstractions (IRealtimeClient / IRealtimeClientSession) to enable bidirectional audio conversations over Bedrock’s bidirectional streaming API.
Changes:
- Introduces `BedrockNovaRealtimeClient` + `BedrockNovaRealtimeSession` implementations (net8-only via `#if NET8_0_OR_GREATER`).
- Adds an `AsIRealtimeClient()` extension on `IAmazonBedrockRuntime`.
- Updates `Microsoft.Extensions.AI.Abstractions` to `10.4.1` and adds a new net8 test project covering client/session behavior.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| extensions/src/AWSSDK.Extensions.Bedrock.MEAI/BedrockNovaRealtimeClient.cs | New IRealtimeClient implementation + convenience ctor for credentials/region. |
| extensions/src/AWSSDK.Extensions.Bedrock.MEAI/BedrockNovaRealtimeSession.cs | New IRealtimeClientSession implementation handling bidirectional streaming protocol, buffering, tool orchestration, and disposal. |
| extensions/src/AWSSDK.Extensions.Bedrock.MEAI/AmazonBedrockRuntimeExtensions.cs | Adds AsIRealtimeClient() extension (net8-only). |
| extensions/src/AWSSDK.Extensions.Bedrock.MEAI/AWSSDK.Extensions.Bedrock.MEAI.NetStandard.csproj | Bumps Microsoft.Extensions.AI.Abstractions to 10.4.1 and updates warning suppression. |
| extensions/test/BedrockMEAIRealtimeTests/BedrockMEAIRealtimeTests.csproj | New net8 test project for realtime functionality. |
| extensions/test/BedrockMEAIRealtimeTests/BedrockRealtimeClientTests.cs | New unit tests for realtime client construction, service resolution, model selection, and extension method. |
| extensions/test/BedrockMEAIRealtimeTests/BedrockRealtimeSessionTests.cs | New unit tests for session send behavior, protocol event formatting, and concurrency/disposal behavior. |
- Fix model ID in XML doc comments (nova-sonic-v1:0 -> nova-2-sonic-v1:0)
- Remove unused using directives (System.IO, Amazon.BedrockRuntime.Model)
- Add DevConfig file for release automation
Replace all 31 Task.Delay calls with SpinWait.SpinUntil-based polling helpers (WaitForEvents/WaitForEvent) for CI-resilient tests. Test duration dropped from ~6s to ~650ms.
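As a rough illustration of the polling approach described in this commit, the sketch below waits on a condition with `SpinWait.SpinUntil` instead of sleeping for a fixed `Task.Delay`. The PR's actual `WaitForEvents`/`WaitForEvent` signatures are not shown here, so the class name, the `ConcurrentQueue<T>` parameter, and the default timeout are assumptions.

```csharp
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical test helper; names and parameters are illustrative, not the PR's.
static class TestPolling
{
    // Returns true as soon as at least `expectedCount` events have been captured,
    // or false if the timeout elapses first. The test never sleeps longer than needed.
    public static bool WaitForEvents<T>(
        ConcurrentQueue<T> captured, int expectedCount, int timeoutMs = 5000)
        => SpinWait.SpinUntil(() => captured.Count >= expectedCount, timeoutMs);
}
```

A test would then assert on the return value, for example `Assert.True(TestPolling.WaitForEvents(captured, 3))`, so a slow CI machine gets the full timeout while a fast one returns almost immediately.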
Replace anonymous type serialization with concrete DTO classes and System.Text.Json source generation (NovaSonicJsonContext). This eliminates most IL2026 warnings without runtime reflection for protocol events.
- Add ~20 concrete DTO classes for Nova Sonic protocol messages
- Add NovaSonicJsonContext with [JsonSerializable] for compile-time codegen
- Remove IL2026 from project-wide NoWarn (only 7 targeted pragmas remain)
- Keep reflection path only for dynamic tool result serialization
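For readers unfamiliar with the technique, here is a minimal sketch of System.Text.Json source generation with one concrete DTO. `ContentEndDto`, its properties, and `SketchJsonContext` are hypothetical stand-ins; the PR's real DTOs and the shape of `NovaSonicJsonContext` are not reproduced here.

```csharp
using System.Text.Json;
using System.Text.Json.Serialization;

// Hypothetical DTO standing in for one protocol message.
public sealed class ContentEndDto
{
    [JsonPropertyName("promptName")]
    public string? PromptName { get; set; }

    [JsonPropertyName("contentName")]
    public string? ContentName { get; set; }
}

// The source generator fills in this partial class at compile time,
// so serialization needs no runtime reflection (and raises no IL2026).
[JsonSerializable(typeof(ContentEndDto))]
internal partial class SketchJsonContext : JsonSerializerContext
{
}

public static class ProtocolJson
{
    // Serializes through the generated type info rather than the reflection-based overload.
    public static string Serialize(ContentEndDto dto)
        => JsonSerializer.Serialize(dto, SketchJsonContext.Default.ContentEndDto);
}
```

The key difference from anonymous types is that every serialized shape is named up front, which is what lets the trimmer and AOT compiler see it.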
normj left a comment
Looks like a cool feature. I took a first pass. After you make the changes I'll try it out in action.
Thanks @normj for helping with the review. Just to let you know, I have a demo test app I am using to test with different providers: https://github.com/tarekgh/RealtimeProposalDemoApp
- Remove default model ID; require explicit model via constructor or session options
- Remove convenience constructor (accessKeyId/secretKey); follow BedrockChatClient pattern
- Throw InvalidOperationException if no model ID resolves in CreateSessionAsync
- Remove unnecessary dummy event handler registrations
- Use Dictionary<string, JsonElement> for tool arguments (AOT-safe)
- Rewrite SerializeToolResult without reflection (zero IL2026 pragmas remain)
- Add (Preview) prefix to DevConfig changelog message
… tool normalization
- Fix SendAsync error handling: rethrow ODE as named ObjectDisposedException, swallow ChannelClosedException/OCE only when disposed (not blanket catch)
- Add concurrent enumeration guard (_activeStreamingEnumeration) to GetStreamingResponseAsync to prevent multiple simultaneous readers
- Wrap DisposeAsync resources in individual try/catch with ExceptionDispatchInfo to prevent resource leaks on partial failure
- Replace shallow JsonElementToDictionary with deep NormalizeToolPayload, NormalizeToolArguments, ConvertJsonElementToToolPayload for tool results
- Use FunctionCallContent.CreateFromParsedArguments for tool call args (consistent with MEAI conventions, AOT-safe)
- Add MaxToolPayloadDepth (64) depth guard to prevent stack overflow
- Align disposed/cancellation check order with GenAI/VertexAI providers
- Replace ThrowIf(this) with manual if+nameof() for consistent ODE naming
- Add InternalsVisibleTo for test project access to normalization methods
- Add 8 regression tests for new behaviors
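The deep normalization with a depth guard mentioned in this commit could look roughly like the sketch below. It is not the PR's `NormalizeToolPayload`; the class name, method name, exception type, and number handling are assumptions chosen for illustration, and only the limit of 64 comes from the commit message.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical sketch of a recursive JsonElement conversion with a depth limit.
static class ToolPayloadSketch
{
    private const int MaxDepth = 64; // mirrors the MaxToolPayloadDepth value named in the commit

    public static object? ToPayload(JsonElement element, int depth = 0)
    {
        // Refuse to recurse past the limit instead of risking a stack overflow.
        if (depth > MaxDepth)
            throw new InvalidOperationException($"Tool payload exceeds maximum depth of {MaxDepth}.");

        switch (element.ValueKind)
        {
            case JsonValueKind.Object:
                var dict = new Dictionary<string, object?>();
                foreach (JsonProperty property in element.EnumerateObject())
                    dict[property.Name] = ToPayload(property.Value, depth + 1);
                return dict;
            case JsonValueKind.Array:
                var list = new List<object?>();
                foreach (JsonElement item in element.EnumerateArray())
                    list.Add(ToPayload(item, depth + 1));
                return list;
            case JsonValueKind.String:
                return element.GetString();
            case JsonValueKind.Number:
                // Prefer an integer when the value fits; otherwise fall back to double.
                if (element.TryGetInt64(out long integer)) return integer;
                return element.GetDouble();
            case JsonValueKind.True:
                return true;
            case JsonValueKind.False:
                return false;
            default:
                return null; // Null and Undefined
        }
    }
}
```

Unlike a shallow conversion, nested objects and arrays come back as plain dictionaries and lists all the way down, so no `JsonElement` leaks into the serialized tool result.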
Hi @normj, just wanted to see if you’ve had a chance to test out that provider yet? I’m looking to wrap up this review and would love to get your take.
@tarekgh it is on my list to hopefully get to soon.
1) Fix some formatting
2) Remove private method that wasn't being used
3) Add test project to solution file
I rebased the branch on the latest development changes and pushed a commit with a couple of very minor nits. PR looks good, and I did try out the sample app, which was cool.
@dscpinheiro Can you do a second pass on the PR?
We are also working through some infrastructure issues triggered by updating the version of Microsoft.Extensions.AI.Abstractions, which pulls in a newer version of System.Text.Json in some of our other processes. The change to update Microsoft.Extensions.AI.Abstractions is fine, but there are assumptions in other parts of our build system for packaging up the SDK for non-NuGet users that go awry. Bear with us: even when we approve the PR, it might be a bit before we can merge it while we sort out the issue.
Thanks @normj! Take your time. I want to ensure the changes are not going to cause any issues in general.
Summary
Adds a Bedrock Nova Sonic provider implementing the `Microsoft.Extensions.AI` Realtime abstractions (`IRealtimeClient`/`IRealtimeClientSession`), enabling real-time bidirectional audio conversations with AWS Bedrock Nova Sonic models through the standardized MEAI interface.
This PR updates the `Microsoft.Extensions.AI.Abstractions` dependency from `9.9.1` to `10.4.1` for the realtime types.
What's Included
New Files
- `BedrockNovaRealtimeClient.cs` -- `IRealtimeClient` implementation that wraps an `IAmazonBedrockRuntime` and creates realtime sessions via the Nova Sonic bidirectional streaming API. Includes a convenience constructor accepting access key, secret key, region, and optional model ID.
- `BedrockNovaRealtimeSession.cs` -- `IRealtimeClientSession` implementation that manages the bidirectional event stream, audio buffering, Nova Sonic protocol state machine, and function call orchestration (~1,540 lines).
- `BedrockRealtimeClientTests.cs` -- Client-level unit tests (construction, disposal, metadata, service resolution).
- `BedrockRealtimeSessionTests.cs` -- Session-level unit tests covering the full protocol surface area.
Modified Files
- `AmazonBedrockRuntimeExtensions.cs` -- Added `AsIRealtimeClient()` extension method.
- `AWSSDK.Extensions.Bedrock.MEAI.NetStandard.csproj` -- Updated `Microsoft.Extensions.AI.Abstractions` dependency from `9.9.1` to `10.4.1`.
Features
- `InvokeModelWithBidirectionalStreamAsync` with `BodyPublisher` pattern
- `CreateConversationItem` with automatic content block management
- … `textOutput` events to `InputAudioTranscriptionCompleted` and `OutputAudioTranscriptionDelta` messages
- `SemaphoreSlim` serializes all outbound writes; priority queue bypasses normal channel for time-sensitive tool results
- `BedrockNovaRealtimeClient(string accessKeyId, string secretAccessKey, string regionName, string? defaultModelId)` for simple setup
- `AmazonBedrockRuntimeClient`
- `[Experimental]` attribute
Usage Example
Key Design Decisions
BodyPublisher while(true) loop -- Nova Sonic uses a `BodyPublisher` delegate that returns events one at a time. The provider uses a persistent loop that waits on both a normal `Channel<T>` for audio events and a `ConcurrentQueue` for priority tool results. This avoids closing the outbound stream prematurely while ensuring tool results bypass queued audio.
Priority queue for tool results -- Tool results are sent via `WritePriorityEvent`, which enqueues to a `ConcurrentQueue` and signals the BodyPublisher via `TaskCompletionSource`. This ensures tool results reach Nova Sonic before it commits to a speculative fallback response, which is critical for reliable function calling.
Inline tool invocation -- When `FunctionInvokingRealtimeSession` middleware sends `CreateConversationItem` with tool results, the provider sends them inline via the priority queue with proper Nova Sonic protocol framing (contentStart -> toolResult -> contentEnd -> silence nudge).
Audio content stays open -- Following the official AWS Nova Sonic samples, the audio content block is never explicitly closed mid-conversation. Instead, trailing silence is sent on `InputAudioBufferCommit` to help VAD detect end-of-speech. This matches how the Nova Sonic service expects continuous audio streams.
Prompt lifecycle management -- The provider automatically manages promptStart/promptEnd lifecycle. After a tool result cycle, a new prompt is opened with a fresh `promptName` for the next turn, maintaining correct protocol state.
BodyPublisher shutdown safety -- The while(true) loop checks for cancellation token and channel completion to avoid CPU-burning spin loops during disposal. All exit paths dispose the enumerator.
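To make the interaction between the normal channel, the priority queue, and the shutdown checks more concrete, here is a simplified sketch of such a loop. It is not the PR's `BodyPublisher` code: `PriorityOutbox`, `WriteNormal`, `WritePriority`, and the use of `string` as the event type are hypothetical, and the real provider also handles protocol framing and enumerator disposal that are omitted here.

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;

// Hypothetical outbound queue: priority events always overtake queued normal events.
sealed class PriorityOutbox
{
    private readonly Channel<string> _normal = Channel.CreateUnbounded<string>();
    private readonly ConcurrentQueue<string> _priority = new();
    private TaskCompletionSource<bool> _signal = NewSignal();

    private static TaskCompletionSource<bool> NewSignal()
        => new(TaskCreationOptions.RunContinuationsAsynchronously);

    public bool WriteNormal(string evt) => _normal.Writer.TryWrite(evt);

    public void WritePriority(string evt)
    {
        _priority.Enqueue(evt);
        // Wake the reader if it is currently parked waiting for work.
        Volatile.Read(ref _signal).TrySetResult(true);
    }

    public void Complete() => _normal.Writer.TryComplete();

    public async IAsyncEnumerable<string> ReadAllAsync(
        [EnumeratorCancellation] CancellationToken ct = default)
    {
        while (true)
        {
            // Install a fresh signal BEFORE draining, so an enqueue that races with
            // the drain either is seen by the drain or completes this new signal.
            var signal = NewSignal();
            Interlocked.Exchange(ref _signal, signal);

            while (_priority.TryDequeue(out var urgent))
                yield return urgent;

            if (_normal.Reader.TryRead(out var evt))
            {
                yield return evt;
                continue; // re-check the priority queue before the next normal event
            }

            // Nothing ready: wait (without spinning) for either source.
            Task<bool> normalReady = _normal.Reader.WaitToReadAsync(ct).AsTask();
            await Task.WhenAny(normalReady, signal.Task).ConfigureAwait(false);

            // Awaiting normalReady rethrows cancellation; 'false' means the channel completed.
            if (normalReady.IsCompleted && !await normalReady.ConfigureAwait(false) && _priority.IsEmpty)
                yield break;
        }
    }
}
```

With two normal events queued and one priority event written afterwards, a reader that starts enumerating receives the priority event first, then the normal events in order, and the loop ends once the channel is completed and both sources are empty.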
Thread safety -- All outbound writes go through a `SemaphoreSlim`. Fields mutated under the semaphore (`_promptName`, `_audioContentName`) are captured to local variables when read outside the semaphore (e.g., in `SendToolResultInline`).
Test Coverage
54 unit tests covering: