
Commit c335f19

feat: [texttospeech] StreamingSynthesize now supports opus (#5887)
* feat: StreamingSynthesize now supports opus

PiperOrigin-RevId: 707168599

Source-Link: googleapis/googleapis@d985436

Source-Link: googleapis/googleapis-gen@3fcc3af
Copy-Tag: eyJwIjoicGFja2FnZXMvZ29vZ2xlLWNsb3VkLXRleHR0b3NwZWVjaC8uT3dsQm90LnlhbWwiLCJoIjoiM2ZjYzNhZmJmOGM5MjA4NGNjNGEzMDIzMmE3NmNhMjQ3NDg5YzNkMCJ9

* 🦉 Updates from OwlBot post-processor

See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md

---------

Co-authored-by: Owl Bot <gcf-owl-bot[bot]@users.noreply.github.com>
1 parent 7e82cf2 commit c335f19

4 files changed: +471 -3 lines changed


packages/google-cloud-texttospeech/protos/google/cloud/texttospeech/v1beta1/cloud_tts.proto

Lines changed: 20 additions & 0 deletions
@@ -115,6 +115,11 @@ enum AudioEncoding {
   // 8-bit samples that compand 14-bit audio samples using G.711 PCMU/A-law.
   // Audio content returned as ALAW also contains a WAV header.
   ALAW = 6;
+
+  // Uncompressed 16-bit signed little-endian samples (Linear PCM).
+  // Note that as opposed to LINEAR16, audio will not be wrapped in a WAV (or
+  // any other) header.
+  PCM = 7;
 }
 
 // The top-level message sent by the client for the `ListVoices` method.
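Because the new `PCM` encoding returns bare samples with no container, a consumer that wants a playable file has to add the header itself. The following is a minimal sketch of that step, not code from this commit or from the client library: `wrapPcmInWav` and `writeAscii` are hypothetical helpers, and the sketch assumes mono audio and a sample rate that the caller already knows. It writes a standard 44-byte RIFF/WAVE header in front of 16-bit little-endian samples.

```typescript
// Hypothetical helpers for illustration; not part of the client library.
const WAV_HEADER_BYTES = 44;
const BYTES_PER_SAMPLE = 2; // 16-bit samples, per the PCM enum comment

// Writes an ASCII tag such as 'RIFF' byte by byte at the given offset.
function writeAscii(view: DataView, offset: number, text: string): void {
  for (let i = 0; i < text.length; i++) {
    view.setUint8(offset + i, text.charCodeAt(i));
  }
}

// Prepends a canonical 44-byte WAV header to header-less 16-bit LE PCM.
// Assumption: the caller knows the sample rate and channel count.
function wrapPcmInWav(
  pcm: Uint8Array,
  sampleRateHertz: number,
  channels = 1
): Uint8Array {
  const out = new Uint8Array(WAV_HEADER_BYTES + pcm.length);
  const view = new DataView(out.buffer);
  const blockAlign = channels * BYTES_PER_SAMPLE;

  writeAscii(view, 0, 'RIFF');
  view.setUint32(4, 36 + pcm.length, true); // RIFF chunk size
  writeAscii(view, 8, 'WAVE');
  writeAscii(view, 12, 'fmt ');
  view.setUint32(16, 16, true); // size of the fmt chunk
  view.setUint16(20, 1, true); // format tag 1 = linear PCM
  view.setUint16(22, channels, true);
  view.setUint32(24, sampleRateHertz, true);
  view.setUint32(28, sampleRateHertz * blockAlign, true); // bytes per second
  view.setUint16(32, blockAlign, true);
  view.setUint16(34, 8 * BYTES_PER_SAMPLE, true); // bits per sample
  writeAscii(view, 36, 'data');
  view.setUint32(40, pcm.length, true); // size of the sample data
  out.set(pcm, WAV_HEADER_BYTES);
  return out;
}
```

For a streamed response, the chunks would need to be concatenated before calling the helper, since the header records the total data length.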
@@ -432,10 +437,25 @@ message Timepoint {
   double time_seconds = 3;
 }
 
+// Description of the desired output audio data.
+message StreamingAudioConfig {
+  // Required. The format of the audio byte stream.
+  // For now, streaming only supports PCM and OGG_OPUS. All other encodings
+  // will return an error.
+  AudioEncoding audio_encoding = 1 [(google.api.field_behavior) = REQUIRED];
+
+  // Optional. The synthesis sample rate (in hertz) for this audio.
+  int32 sample_rate_hertz = 2 [(google.api.field_behavior) = OPTIONAL];
+}
+
 // Provides configuration information for the StreamingSynthesize request.
 message StreamingSynthesizeConfig {
   // Required. The desired voice of the synthesized audio.
   VoiceSelectionParams voice = 1 [(google.api.field_behavior) = REQUIRED];
+
+  // Optional. The configuration of the synthesized audio.
+  StreamingAudioConfig streaming_audio_config = 4
+      [(google.api.field_behavior) = OPTIONAL];
 }
 
 // Input to be synthesized.
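The `StreamingAudioConfig` comments state that streaming accepts only `PCM` and `OGG_OPUS` and that other encodings return an error. A caller could mirror that rule locally to fail fast before opening a stream. This is a rough sketch under stated assumptions: `StreamingAudioConfigSketch`, `AudioEncodingName`, and `checkStreamingAudioConfig` are hypothetical local names rather than generated types, the camelCase field names assume the usual naming of generated JavaScript clients, and the positive-sample-rate check is an assumption that the proto does not state. The server remains the authority on what is accepted.

```typescript
// Subset of AudioEncoding names that appear in this diff; the real enum has more.
type AudioEncodingName = 'LINEAR16' | 'OGG_OPUS' | 'ALAW' | 'PCM';

// Hypothetical local mirror of the StreamingAudioConfig message.
interface StreamingAudioConfigSketch {
  audioEncoding: AudioEncodingName; // REQUIRED in the proto
  sampleRateHertz?: number; // OPTIONAL in the proto (int32)
}

// Encodings the proto comment lists as supported for streaming.
const STREAMING_ENCODINGS: ReadonlySet<AudioEncodingName> =
  new Set<AudioEncodingName>(['PCM', 'OGG_OPUS']);

// Returns null when the config looks acceptable, otherwise a reason string.
function checkStreamingAudioConfig(
  config: StreamingAudioConfigSketch
): string | null {
  if (!STREAMING_ENCODINGS.has(config.audioEncoding)) {
    return `unsupported streaming encoding: ${config.audioEncoding}`;
  }
  const rate = config.sampleRateHertz;
  // Assumption: a usable rate is a positive integer; the proto only says int32.
  if (rate !== undefined && (!Number.isInteger(rate) || rate <= 0)) {
    return `invalid sample rate: ${rate}`;
  }
  return null;
}
```

With this sketch, `checkStreamingAudioConfig({audioEncoding: 'OGG_OPUS'})` returns `null`, while a config using `LINEAR16` returns a reason string instead.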

packages/google-cloud-texttospeech/protos/protos.d.ts

Lines changed: 111 additions & 1 deletion
Some generated files are not rendered by default.
