Commit 99d5324

feat: add model and language_codes fields in RecognitionConfig message + enable default _ recognizer (#4395)
* feat: add `model` and `language_codes` fields in `RecognitionConfig` message + enable default `_` recognizer

Enables specifying `model` and `language_codes` in requests without having to specify them in the Recognizer (they can still be specified in the Recognizer in the `default_recognition_config` field). Also enables using the recognizer ID `_` to perform recognition without explicitly creating a Recognizer resource.

The top-level `model` and `language_codes` fields are deprecated in favor of the new fields added in the `RecognitionConfig` message. The old fields continue to work.

PiperOrigin-RevId: 545698919
Source-Link: googleapis/googleapis@e73fc8f
Source-Link: googleapis/googleapis-gen@b77dfdf
Copy-Tag: eyJwIjoicGFja2FnZXMvZ29vZ2xlLWNsb3VkLXNwZWVjaC8uT3dsQm90LnlhbWwiLCJoIjoiYjc3ZGZkZmUzOTI3ZTQwOTg3NWIyZDg5MTNmMjU3NGZhMDBhMDVhNSJ9

* 🦉 Updates from OwlBot post-processor

See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md

---------

Co-authored-by: Owl Bot <gcf-owl-bot[bot]@users.noreply.github.com>
Co-authored-by: Denis DelGrosso <85250797+ddelgrosso1@users.noreply.github.com>
1 parent 124da6b commit 99d5324
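
The request shape described in the commit message can be sketched roughly as follows. This is an illustration under stated assumptions, not code from the commit: `buildRecognizeRequest` is a hypothetical helper, the `'long'` model name and `'en-US'` language code are placeholder values, and the camelCase keys (`languageCodes`, `autoDecodingConfig`) assume the usual Node.js client mapping of the proto field names.

```javascript
// Hypothetical helper (not part of the library): builds a v2 RecognizeRequest
// payload that relies on the implicit `_` recognizer and sets `model` and
// `language_codes` inline in RecognitionConfig.
function buildRecognizeRequest(projectId, location, audioBytes) {
  return {
    // `_` in the last path segment selects an empty implicit Recognizer.
    recognizer: `projects/${projectId}/locations/${location}/recognizers/_`,
    config: {
      // Assumption: let the service detect the audio encoding.
      autoDecodingConfig: {},
      // RecognitionConfig fields added by this commit (placeholder values).
      model: 'long',
      languageCodes: ['en-US'],
    },
    content: audioBytes,
  };
}

const request = buildRecognizeRequest('my-project', 'global', Buffer.from([]));
console.log(request.recognizer);
```

The resulting object would presumably be passed to `recognize()` on a `v2.SpeechClient`; that call needs credentials and network access, so it is left out of the sketch.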

9 files changed, +166 -35 lines changed

packages/google-cloud-speech/protos/google/cloud/speech/v2/cloud_speech.proto

Lines changed: 53 additions & 21 deletions
@@ -585,7 +585,7 @@ message Recognizer {
   // characters or less.
   string display_name = 3;

-  // Required. Which model to use for recognition requests. Select the model
+  // Optional. Which model to use for recognition requests. Select the model
   // best suited to your domain to get best results.
   //
   // Guidance for choosing which model to use can be found in the [Transcription
@@ -594,9 +594,9 @@ message Recognizer {
   // and the models supported in each region can be found in the [Table Of
   // Supported
   // Models](https://cloud.google.com/speech-to-text/v2/docs/speech-to-text-supported-languages).
-  string model = 4 [(google.api.field_behavior) = REQUIRED];
+  string model = 4 [deprecated = true, (google.api.field_behavior) = OPTIONAL];

-  // Required. The language of the supplied audio as a
+  // Optional. The language of the supplied audio as a
   // [BCP-47](https://www.rfc-editor.org/rfc/bcp/bcp47.txt) language tag.
   //
   // Supported languages for each model are listed in the [Table of Supported
@@ -608,7 +608,8 @@ message Recognizer {
   // When you create or update a Recognizer, these values are
   // stored in normalized BCP-47 form. For example, "en-us" is stored as
   // "en-US".
-  repeated string language_codes = 17 [(google.api.field_behavior) = REQUIRED];
+  repeated string language_codes = 17
+      [deprecated = true, (google.api.field_behavior) = OPTIONAL];

   // Default configuration to use for requests with this Recognizer.
   // This can be overwritten by inline configuration in the
@@ -867,6 +868,30 @@ message RecognitionConfig {
     ExplicitDecodingConfig explicit_decoding_config = 8;
   }

+  // Optional. Which model to use for recognition requests. Select the model
+  // best suited to your domain to get best results.
+  //
+  // Guidance for choosing which model to use can be found in the [Transcription
+  // Models
+  // Documentation](https://cloud.google.com/speech-to-text/v2/docs/transcription-model)
+  // and the models supported in each region can be found in the [Table Of
+  // Supported
+  // Models](https://cloud.google.com/speech-to-text/v2/docs/speech-to-text-supported-languages).
+  string model = 9 [(google.api.field_behavior) = OPTIONAL];
+
+  // Optional. The language of the supplied audio as a
+  // [BCP-47](https://www.rfc-editor.org/rfc/bcp/bcp47.txt) language tag.
+  // Language tags are normalized to BCP-47 before they are used eg "en-us"
+  // becomes "en-US".
+  //
+  // Supported languages for each model are listed in the [Table of Supported
+  // Models](https://cloud.google.com/speech-to-text/v2/docs/speech-to-text-supported-languages).
+  //
+  // If additional languages are provided, recognition result will contain
+  // recognition in the most likely language detected. The recognition result
+  // will include the language tag of the language detected in the audio.
+  repeated string language_codes = 10 [(google.api.field_behavior) = OPTIONAL];
+
   // Speech recognition features to enable.
   RecognitionFeatures features = 2;

@@ -883,7 +908,8 @@ message RecognitionConfig {
 message RecognizeRequest {
   // Required. The name of the Recognizer to use during recognition. The
   // expected format is
-  // `projects/{project}/locations/{location}/recognizers/{recognizer}`.
+  // `projects/{project}/locations/{location}/recognizers/{recognizer}`. The
+  // {recognizer} segment may be set to `_` to use an empty implicit Recognizer.
   string recognizer = 3 [
     (google.api.field_behavior) = REQUIRED,
     (google.api.resource_reference) = {
@@ -1100,24 +1126,27 @@ message StreamingRecognitionConfig {
 // [StreamingRecognize][google.cloud.speech.v2.Speech.StreamingRecognize]
 // method. Multiple
 // [StreamingRecognizeRequest][google.cloud.speech.v2.StreamingRecognizeRequest]
-// messages are sent. The first message must contain a
+// messages are sent in one call.
+//
+// If the [Recognizer][google.cloud.speech.v2.Recognizer] referenced by
+// [recognizer][google.cloud.speech.v2.StreamingRecognizeRequest.recognizer]
+// contains a fully specified request configuration then the stream may only
+// contain messages with only
+// [audio][google.cloud.speech.v2.StreamingRecognizeRequest.audio] set.
+//
+// Otherwise the first message must contain a
 // [recognizer][google.cloud.speech.v2.StreamingRecognizeRequest.recognizer] and
-// optionally a
+// a
 // [streaming_config][google.cloud.speech.v2.StreamingRecognizeRequest.streaming_config]
-// message and must not contain
-// [audio][google.cloud.speech.v2.StreamingRecognizeRequest.audio]. All
-// subsequent messages must contain
-// [audio][google.cloud.speech.v2.StreamingRecognizeRequest.audio] and must not
-// contain a
-// [streaming_config][google.cloud.speech.v2.StreamingRecognizeRequest.streaming_config]
-// message.
+// message that together fully specify the request configuration and must not
+// contain [audio][google.cloud.speech.v2.StreamingRecognizeRequest.audio]. All
+// subsequent messages must only have
+// [audio][google.cloud.speech.v2.StreamingRecognizeRequest.audio] set.
 message StreamingRecognizeRequest {
-  // Required. Streaming recognition should start with an initial request having
-  // a `recognizer`. Subsequent requests carry the audio data to be recognized.
-  //
-  // The initial request with configuration can be omitted if the Recognizer
-  // being used has a
-  // [default_recognition_config][google.cloud.speech.v2.Recognizer.default_recognition_config].
+  // Required. The name of the Recognizer to use during recognition. The
+  // expected format is
+  // `projects/{project}/locations/{location}/recognizers/{recognizer}`. The
+  // {recognizer} segment may be set to `_` to use an empty implicit Recognizer.
   string recognizer = 3 [
     (google.api.field_behavior) = REQUIRED,
     (google.api.resource_reference) = {
@@ -1152,7 +1181,10 @@ message BatchRecognizeRequest {
     DYNAMIC_BATCHING = 1;
   }

-  // Required. Resource name of the recognizer to be used for ASR.
+  // Required. The name of the Recognizer to use during recognition. The
+  // expected format is
+  // `projects/{project}/locations/{location}/recognizers/{recognizer}`. The
+  // {recognizer} segment may be set to `_` to use an empty implicit Recognizer.
   string recognizer = 1 [
     (google.api.field_behavior) = REQUIRED,
     (google.api.resource_reference) = {
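
The message ordering that the updated `StreamingRecognizeRequest` comment describes, for a recognizer that does not already carry a full configuration, can be sketched as a generator. This is only an illustration: `buildStreamingRequests` is a hypothetical helper, the config values are placeholders, and the camelCase keys (`streamingConfig`, `languageCodes`, `autoDecodingConfig`) assume the usual Node.js mapping of the proto field names.

```javascript
// Hypothetical helper: yields StreamingRecognizeRequest messages in the order
// the updated proto comment describes when the recognizer (for example `_`)
// does not hold a fully specified configuration.
function* buildStreamingRequests(recognizer, recognitionConfig, audioChunks) {
  // First message: recognizer plus streaming_config, and no audio.
  yield {recognizer, streamingConfig: {config: recognitionConfig}};
  // All subsequent messages: only audio set.
  for (const chunk of audioChunks) {
    yield {audio: chunk};
  }
}

const messages = [
  ...buildStreamingRequests(
    'projects/my-project/locations/global/recognizers/_',
    {autoDecodingConfig: {}, model: 'long', languageCodes: ['en-US']},
    [Buffer.from([0, 1]), Buffer.from([2, 3])]
  ),
];
console.log(messages.length); // 3: one config message, then two audio messages
```

When the referenced Recognizer already contains a fully specified configuration, the comment says the stream may contain only audio messages, so the first `yield` would be dropped in that case.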

packages/google-cloud-speech/protos/protos.d.ts

Lines changed: 12 additions & 0 deletions
Some generated files are not rendered by default.

packages/google-cloud-speech/protos/protos.js

Lines changed: 63 additions & 0 deletions
Some generated files are not rendered by default.

packages/google-cloud-speech/protos/protos.json

Lines changed: 19 additions & 2 deletions
Some generated files are not rendered by default.

packages/google-cloud-speech/samples/generated/v2/snippet_metadata.google.cloud.speech.v2.json

Lines changed: 3 additions & 3 deletions
@@ -314,7 +314,7 @@
       "segments": [
         {
           "start": 25,
-          "end": 98,
+          "end": 99,
           "type": "FULL"
         }
       ],
@@ -370,7 +370,7 @@
       "segments": [
         {
           "start": 25,
-          "end": 72,
+          "end": 71,
           "type": "FULL"
         }
       ],
@@ -418,7 +418,7 @@
       "segments": [
         {
           "start": 25,
-          "end": 92,
+          "end": 95,
           "type": "FULL"
         }
       ],

packages/google-cloud-speech/samples/generated/v2/speech.batch_recognize.js

Lines changed: 4 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -29,7 +29,10 @@ function main(recognizer) {
2929
* TODO(developer): Uncomment these variables before running the sample.
3030
*/
3131
/**
32-
* Required. Resource name of the recognizer to be used for ASR.
32+
* Required. The name of the Recognizer to use during recognition. The
33+
* expected format is
34+
* `projects/{project}/locations/{location}/recognizers/{recognizer}`. The
35+
* {recognizer} segment may be set to `_` to use an empty implicit Recognizer.
3336
*/
3437
// const recognizer = 'abc123'
3538
/**

packages/google-cloud-speech/samples/generated/v2/speech.recognize.js

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -31,7 +31,8 @@ function main(recognizer) {
3131
/**
3232
* Required. The name of the Recognizer to use during recognition. The
3333
* expected format is
34-
* `projects/{project}/locations/{location}/recognizers/{recognizer}`.
34+
* `projects/{project}/locations/{location}/recognizers/{recognizer}`. The
35+
* {recognizer} segment may be set to `_` to use an empty implicit Recognizer.
3536
*/
3637
// const recognizer = 'abc123'
3738
/**
