Class: IBMWatson::TextToSpeechV1
- Inherits:
- IBMCloudSdkCore::BaseService
  - Object
  - IBMCloudSdkCore::BaseService
  - IBMWatson::TextToSpeechV1
- Includes:
- Concurrent::Async
- Defined in:
- lib/ibm_watson/text_to_speech_v1.rb
Overview
The Text to Speech V1 service.
Constant Summary
- DEFAULT_SERVICE_NAME = "text_to_speech"
- DEFAULT_SERVICE_URL = "https://api.us-south.text-to-speech.watson.cloud.ibm.com"
Instance Method Summary
-
#add_custom_prompt(customization_id: , prompt_id: , metadata: , file: ) ⇒ IBMCloudSdkCore::DetailedResponse
Add a custom prompt.
-
#add_word(customization_id: , word: , translation: , part_of_speech: nil) ⇒ nil
Add a custom word.
-
#add_words(customization_id: , words: ) ⇒ nil
Add custom words.
-
#create_custom_model(name: , language: nil, description: nil) ⇒ IBMCloudSdkCore::DetailedResponse
Create a custom model.
-
#create_speaker_model(speaker_name: , audio: ) ⇒ IBMCloudSdkCore::DetailedResponse
Create a speaker model.
-
#delete_custom_model(customization_id: ) ⇒ nil
Delete a custom model.
-
#delete_custom_prompt(customization_id: , prompt_id: ) ⇒ nil
Delete a custom prompt.
-
#delete_speaker_model(speaker_id: ) ⇒ nil
Delete a speaker model.
-
#delete_user_data(customer_id: ) ⇒ nil
Delete labeled data.
-
#delete_word(customization_id: , word: ) ⇒ nil
Delete a custom word.
-
#get_custom_model(customization_id: ) ⇒ IBMCloudSdkCore::DetailedResponse
Get a custom model.
-
#get_custom_prompt(customization_id: , prompt_id: ) ⇒ IBMCloudSdkCore::DetailedResponse
Get a custom prompt.
-
#get_pronunciation(text: , voice: nil, format: nil, customization_id: nil) ⇒ IBMCloudSdkCore::DetailedResponse
Get pronunciation.
-
#get_speaker_model(speaker_id: ) ⇒ IBMCloudSdkCore::DetailedResponse
Get a speaker model.
-
#get_voice(voice: , customization_id: nil) ⇒ IBMCloudSdkCore::DetailedResponse
Get a voice.
-
#get_word(customization_id: , word: ) ⇒ IBMCloudSdkCore::DetailedResponse
Get a custom word.
-
#initialize(args) ⇒ TextToSpeechV1
constructor
Construct a new client for the Text to Speech service.
-
#list_custom_models(language: nil) ⇒ IBMCloudSdkCore::DetailedResponse
List custom models.
-
#list_custom_prompts(customization_id: ) ⇒ IBMCloudSdkCore::DetailedResponse
List custom prompts.
-
#list_speaker_models ⇒ IBMCloudSdkCore::DetailedResponse
List speaker models.
-
#list_voices ⇒ IBMCloudSdkCore::DetailedResponse
List voices.
-
#list_words(customization_id: ) ⇒ IBMCloudSdkCore::DetailedResponse
List custom words.
-
#synthesize(text: , accept: nil, voice: nil, customization_id: nil) ⇒ IBMCloudSdkCore::DetailedResponse
Synthesize audio.
-
#update_custom_model(customization_id: , name: nil, description: nil, words: nil) ⇒ nil
Update a custom model.
Constructor Details
#initialize(args) ⇒ TextToSpeechV1
Construct a new client for the Text to Speech service.
# File 'lib/ibm_watson/text_to_speech_v1.rb', line 66
def initialize(args = {})
  @__async_initialized__ = false
  defaults = {}
  defaults[:service_url] = DEFAULT_SERVICE_URL
  defaults[:service_name] = DEFAULT_SERVICE_NAME
  defaults[:authenticator] = nil
  user_service_url = args[:service_url] unless args[:service_url].nil?
  args = defaults.merge(args)
  args[:authenticator] = IBMCloudSdkCore::ConfigBasedAuthenticatorFactory.new.get_authenticator(service_name: args[:service_name]) if args[:authenticator].nil?
  super
  @service_url = user_service_url unless user_service_url.nil?
end
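The option precedence in the constructor can be illustrated without loading the SDK. The following is a minimal sketch that mirrors only the `defaults.merge(args)` step; `effective_options` is a hypothetical helper introduced for illustration, not part of the gem, and the authenticator lookup is deliberately omitted.

```ruby
# Hypothetical stand-in for the constructor's option handling; not SDK code.
def effective_options(args = {})
  defaults = {
    service_url: "https://api.us-south.text-to-speech.watson.cloud.ibm.com",
    service_name: "text_to_speech",
    authenticator: nil
  }
  # Keys supplied by the caller take precedence over the defaults.
  defaults.merge(args)
end

puts effective_options({})[:service_url]
puts effective_options({ service_url: "https://example.test/tts" })[:service_url]
```

The second call shows that a caller-supplied `service_url` replaces the default, which matches the constructor restoring `user_service_url` after `super`.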
Instance Method Details
#add_custom_prompt(customization_id: , prompt_id: , metadata: , file: ) ⇒ IBMCloudSdkCore::DetailedResponse
Add a custom prompt. Adds a custom prompt to a custom model. A prompt is defined by the text that is to
be spoken, the audio for that text, a unique user-specified ID for the prompt, and
an optional speaker ID. The information is used to generate prosodic data that is
not visible to the user. This data is used by the service to produce the
synthesized audio upon request. You must use credentials for the instance of the
service that owns a custom model to add a prompt to it. You can add a maximum of
1000 custom prompts to a single custom model.
It is recommended that you assign meaningful values to prompt IDs. For example, use
`goodbye` to identify a prompt that speaks a farewell. Prompt IDs must be
unique within a given custom model. You cannot define two prompts with the same
name for the same custom model. If you provide the ID of an existing prompt, the
previously uploaded prompt is replaced by the new information. The existing prompt
is reprocessed by using the new text and audio and, if provided, new speaker
model, and the prosody data associated with the prompt is updated.
The quality of a prompt is undefined if the language of a prompt does not match
the language of its custom model. This is consistent with any text or SSML that is
specified for a speech synthesis request. The service makes a best-effort attempt
to render the specified text for the prompt; it does not validate that the
language of the text matches the language of the model.
Adding a prompt is an asynchronous operation. Although it accepts less audio than
speaker enrollment, the service must align the audio with the provided text. The
time that it takes to process a prompt depends on the prompt itself. The
processing time for a reasonably sized prompt generally matches the length of the
audio (for example, it takes 20 seconds to process a 20-second prompt).
For shorter prompts, you can wait for a reasonable amount of time and then check
the status of the prompt with the [Get a custom prompt](#getcustomprompt) method.
For longer prompts, consider using that method to poll the service every few
seconds to determine when the prompt becomes available. No prompt can be used for
speech synthesis if it is in the `processing` or `failed` state. Only prompts that
are in the `available` state can be used for speech synthesis.
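The polling approach described above could be sketched as follows. This is an illustration under stated assumptions, not SDK code: `wait_for_prompt` is a hypothetical helper, `client` is any object that responds to `get_custom_prompt`, and the response is assumed to expose a `result` hash with a `"status"` key. The interval and attempt limit are arbitrary choices.

```ruby
# Hypothetical helper: poll until the prompt leaves the `processing` state.
# Assumes the response object exposes `result` as a Hash with a "status" key.
def wait_for_prompt(client, customization_id:, prompt_id:, interval: 5, max_attempts: 24)
  max_attempts.times do
    status = client.get_custom_prompt(
      customization_id: customization_id,
      prompt_id: prompt_id
    ).result["status"]
    # Return as soon as the service reports `available` or `failed`.
    return status unless status == "processing"

    sleep(interval)
  end
  raise "prompt #{prompt_id} is still processing after #{max_attempts} checks"
end
```

A caller would typically proceed to synthesis only when the returned status is `available`, and rerecord or resubmit the prompt when it is `failed`.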
When it processes a request, the service attempts to align the text and the audio
that are provided for the prompt. The text that is passed with a prompt must match
the spoken audio as closely as possible. Optimally, the text and audio match
exactly. The service does its best to align the specified text with the audio, and
it can often compensate for mismatches between the two. But if the service cannot
effectively align the text and the audio, possibly because the magnitude of
mismatches between the two is too great, processing of the prompt fails.
### Evaluating a prompt
Always listen to and evaluate a prompt to determine its quality before using it
in production. To evaluate a prompt, include only the single prompt in a speech
synthesis request by using the following SSML extension, in this case for a prompt
whose ID is `goodbye`:
`<ibm:prompt id="goodbye"/>`
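As a small illustration, the SSML extension can be generated from a prompt ID. `prompt_ssml` is a hypothetical helper, not part of the SDK; escaping the ID with `CGI.escapeHTML` is a defensive assumption for IDs that contain markup characters.

```ruby
require "cgi"

# Hypothetical helper: build the SSML extension that plays one custom prompt.
def prompt_ssml(prompt_id)
  %(<ibm:prompt id="#{CGI.escapeHTML(prompt_id)}"/>)
end

puts prompt_ssml("goodbye")
```

The resulting string would presumably be passed as the `text:` argument of `#synthesize`, together with the `customization_id:` of the custom model that owns the prompt.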
In some cases, you might need to rerecord and resubmit a prompt as many as five
times to address the following possible problems:
* The service might fail to detect a mismatch between the prompt's text and audio.
The longer the prompt, the greater the chance for misalignment between its text
and audio. Therefore, multiple shorter prompts are preferable to a single long
prompt.
* The text of a prompt might include a word that the service does not recognize.
In this case, you can create a custom word and pronunciation pair to tell the
service how to pronounce the word. You must then re-create the prompt.
* The quality of the input audio might be insufficient or the service's processing
of the audio might fail to detect the intended prosody. Submitting new audio for
the prompt can correct these issues.
If a prompt that is created without a speaker ID does not adequately reflect the
intended prosody, enrolling the speaker and providing a speaker ID for the prompt
is one recommended means of potentially improving the quality of the prompt. This
is especially important for shorter prompts such as "good-bye" or "thank you,"
where less audio data makes it more difficult to match the prosody of the speaker.
Custom prompts are supported only for use with US English custom models and
voices.
**See also:**
* [Add a custom
prompt](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-tbe-create#tbe-create-add-prompt)
* [Evaluate a custom
prompt](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-tbe-create#tbe-create-evaluate-prompt)
* [Rules for creating custom
prompts](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-tbe-rules#tbe-rules-prompts).
# File 'lib/ibm_watson/text_to_speech_v1.rb', line 966
def add_custom_prompt(customization_id:, prompt_id:, metadata:, file:)
  raise ArgumentError.new("customization_id must be provided") if customization_id.nil?
  raise ArgumentError.new("prompt_id must be provided") if prompt_id.nil?
  raise ArgumentError.new("metadata must be provided") if metadata.nil?
  raise ArgumentError.new("file must be provided") if file.nil?

  headers = {}
  sdk_headers = Common.new.get_sdk_headers("text_to_speech", "V1", "add_custom_prompt")
  headers.merge!(sdk_headers)

  form_data = {}
  form_data[:metadata] = HTTP::FormData::Part.new(metadata.to_s, content_type: "application/json")

  unless file.instance_of?(StringIO) || file.instance_of?(File)
    file = file.respond_to?(:to_json) ? StringIO.new(file.to_json) : StringIO.new(file)
  end
  form_data[:file] = HTTP::FormData::File.new(file, content_type: "audio/wav", filename: file.respond_to?(:path) ? file.path : nil)

  method_url = "/v1/customizations/%s/prompts/%s" % [ERB::Util.url_encode(customization_id), ERB::Util.url_encode(prompt_id)]

  response = request(
    method: "POST",
    url: method_url,
    headers: headers,
    form: form_data,
    accept_json: true
  )
  response
end
#add_word(customization_id: , word: , translation: , part_of_speech: nil) ⇒ nil
Add a custom word. Adds a single word and its translation to the specified custom model. Adding a new
translation for a word that already exists in a custom model overwrites the word's
existing translation. A custom model can contain no more than 20,000 entries. You
must use credentials for the instance of the service that owns a model to add a
word to it.
You can define sounds-like or phonetic translations for words. A sounds-like
translation consists of one or more words that, when combined, sound like the
word. Phonetic translations are based on the SSML phoneme format for representing
a word. You can specify them in standard International Phonetic Alphabet (IPA)
representation
<code><phoneme alphabet="ipa"
ph="təmˈɑto"></phoneme></code>
or in the proprietary IBM Symbolic Phonetic Representation (SPR)
<code><phoneme alphabet="ibm"
ph="1gAstroEntxrYFXs"></phoneme></code>
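A phonetic translation string of the form shown above can be assembled programmatically. The sketch below is illustrative only: `phoneme_translation` is a hypothetical helper, and escaping the pronunciation with `CGI.escapeHTML` is an assumption made to keep the attribute value well formed.

```ruby
require "cgi"

# Hypothetical helper: wrap a pronunciation in the SSML phoneme element.
# `alphabet` is "ipa" for IPA or "ibm" for IBM SPR, as described above.
def phoneme_translation(pronunciation, alphabet: "ipa")
  raise ArgumentError, "alphabet must be ipa or ibm" unless %w[ipa ibm].include?(alphabet)

  %(<phoneme alphabet="#{alphabet}" ph="#{CGI.escapeHTML(pronunciation)}"></phoneme>)
end

puts phoneme_translation("təmˈɑto")
puts phoneme_translation("1gAstroEntxrYFXs", alphabet: "ibm")
```

The returned string would be supplied as the `translation:` argument; a sounds-like translation is passed as plain text instead.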
**See also:**
* [Adding a single word to a custom
model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customWords#cuWordAdd)
* [Adding words to a Japanese custom
model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customWords#cuJapaneseAdd)
* [Understanding
customization](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customIntro#customIntro).
# File 'lib/ibm_watson/text_to_speech_v1.rb', line 724
def add_word(customization_id:, word:, translation:, part_of_speech: nil)
  raise ArgumentError.new("customization_id must be provided") if customization_id.nil?
  raise ArgumentError.new("word must be provided") if word.nil?
  raise ArgumentError.new("translation must be provided") if translation.nil?

  headers = {}
  sdk_headers = Common.new.get_sdk_headers("text_to_speech", "V1", "add_word")
  headers.merge!(sdk_headers)

  data = {
    "translation" => translation,
    "part_of_speech" => part_of_speech
  }

  method_url = "/v1/customizations/%s/words/%s" % [ERB::Util.url_encode(customization_id), ERB::Util.url_encode(word)]

  request(
    method: "PUT",
    url: method_url,
    headers: headers,
    json: data,
    accept_json: false
  )
  nil
end
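Because the word becomes a path segment, the method URL-encodes it with `ERB::Util.url_encode`. The sketch below isolates that step so its effect on words with spaces or slashes is visible; `word_path` is a hypothetical helper that copies the URL template from the source above.

```ruby
require "erb"

# Hypothetical helper: build the request path the way add_word does.
def word_path(customization_id, word)
  "/v1/customizations/%s/words/%s" % [ERB::Util.url_encode(customization_id), ERB::Util.url_encode(word)]
end

puts word_path("abc-123", "New York") # => /v1/customizations/abc-123/words/New%20York
puts word_path("abc-123", "AC/DC")    # => /v1/customizations/abc-123/words/AC%2FDC
```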
#add_words(customization_id: , words: ) ⇒ nil
Add custom words. Adds one or more words and their translations to the specified custom model.
Adding a new translation for a word that already exists in a custom model
overwrites the word's existing translation. A custom model can contain no more
than 20,000 entries. You must use credentials for the instance of the service that
owns a model to add words to it.
You can define sounds-like or phonetic translations for words. A sounds-like
translation consists of one or more words that, when combined, sound like the
word. Phonetic translations are based on the SSML phoneme format for representing
a word. You can specify them in standard International Phonetic Alphabet (IPA)
representation
<code><phoneme alphabet="ipa"
ph="təmˈɑto"></phoneme></code>
or in the proprietary IBM Symbolic Phonetic Representation (SPR)
<code><phoneme alphabet="ibm"
ph="1gAstroEntxrYFXs"></phoneme></code>
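The `words:` argument is sent as the `"words"` array of the JSON body. A minimal sketch of preparing that array from word/translation pairs follows; `build_words` is a hypothetical helper, and the `"word"` and `"translation"` keys are assumed from the service's word object.

```ruby
# Hypothetical helper: turn { word => translation } pairs into the array
# that would be passed as `words:`. Key names are assumed, not SDK-defined.
def build_words(pairs)
  pairs.map do |word, translation|
    { "word" => word, "translation" => translation }
  end
end

words = build_words("NCAA" => "N C double A", "iPhone" => "I phone")
puts words.inspect
```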
**See also:**
* [Adding multiple words to a custom
model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customWords#cuWordsAdd)
* [Adding words to a Japanese custom
model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customWords#cuJapaneseAdd)
* [Understanding
customization](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customIntro#customIntro).
# File 'lib/ibm_watson/text_to_speech_v1.rb', line 622
def add_words(customization_id:, words:)
  raise ArgumentError.new("customization_id must be provided") if customization_id.nil?
  raise ArgumentError.new("words must be provided") if words.nil?

  headers = {}
  sdk_headers = Common.new.get_sdk_headers("text_to_speech", "V1", "add_words")
  headers.merge!(sdk_headers)

  data = {
    "words" => words
  }

  method_url = "/v1/customizations/%s/words" % [ERB::Util.url_encode(customization_id)]

  request(
    method: "POST",
    url: method_url,
    headers: headers,
    json: data,
    accept_json: true
  )
  nil
end
#create_custom_model(name: , language: nil, description: nil) ⇒ IBMCloudSdkCore::DetailedResponse
Create a custom model. Creates a new empty custom model. You must specify a name for the new custom
model. You can optionally specify the language and a description for the new
model. The model is owned by the instance of the service whose credentials are
used to create it.
**See also:** [Creating a custom
model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customModels#cuModelsCreate).
**Note:** The Arabic, Chinese, Czech, Dutch (Belgian and Netherlands), Australian
English, Korean, and Swedish languages and voices are supported only for IBM
Cloud; they are deprecated for IBM Cloud Pak for Data. Also, the `ar-AR` language
identifier cannot be used to create a custom model; use the `ar-MS` identifier
instead.
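The identifier substitution from the note can be applied before calling the method. This is a small sketch only; `custom_model_language` is a hypothetical helper and handles just the one case the note mentions.

```ruby
# Hypothetical helper: map the unsupported `ar-AR` identifier to `ar-MS`
# and leave every other value (including nil) unchanged.
def custom_model_language(language)
  language == "ar-AR" ? "ar-MS" : language
end

puts custom_model_language("ar-AR")
puts custom_model_language("en-US")
```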
# File 'lib/ibm_watson/text_to_speech_v1.rb', line 386
def create_custom_model(name:, language: nil, description: nil)
  raise ArgumentError.new("name must be provided") if name.nil?

  headers = {}
  sdk_headers = Common.new.get_sdk_headers("text_to_speech", "V1", "create_custom_model")
  headers.merge!(sdk_headers)

  data = {
    "name" => name,
    "language" => language,
    "description" => description
  }

  method_url = "/v1/customizations"

  response = request(
    method: "POST",
    url: method_url,
    headers: headers,
    json: data,
    accept_json: true
  )
  response
end
#create_speaker_model(speaker_name: , audio: ) ⇒ IBMCloudSdkCore::DetailedResponse
Create a speaker model. Creates a new speaker model, which is an optional enrollment token for users who
are to add prompts to custom models. A speaker model contains information about a
user's voice. The service extracts this information from a WAV audio sample that
you pass as the body of the request. Associating a speaker model with a prompt is
optional, but the information that is extracted from the speaker model helps the
service learn about the speaker's voice.
A speaker model can make an appreciable difference in the quality of prompts,
especially short prompts with relatively little audio, that are associated with
that speaker. A speaker model can help the service produce a prompt with more
confidence; the lack of a speaker model can potentially compromise the quality of
a prompt.
The gender of the speaker who creates a speaker model does not need to match the
gender of a voice that is used with prompts that are associated with that speaker
model. For example, a speaker model that is created by a male speaker can be
associated with prompts that are spoken by female voices.
You create a speaker model for a given instance of the service. The new speaker
model is owned by the service instance whose credentials are used to create it.
That same speaker can then be used to create prompts for all custom models within
that service instance. No language is associated with a speaker model, but each
custom model has a single specified language. You can add prompts only to US
English models.
You specify a name for the speaker when you create it. The name must be unique
among all speaker names for the owning service instance. To re-create a speaker
model for an existing speaker name, you must first delete the existing speaker
model that has that name.
Speaker enrollment is a synchronous operation. Although it accepts more audio data
than a prompt, the process of adding a speaker is very fast. The service simply
extracts information about the speaker's voice from the audio. Unlike prompts,
speaker models neither need nor accept a transcription of the audio. When the call
returns, the audio is fully processed and the speaker enrollment is complete.
The service returns a speaker ID in its response. A speaker ID is a globally unique
identifier (GUID) that you use to identify the speaker in subsequent requests to
the service. Speaker models and the custom prompts with which they are used are
supported only for use with US English custom models and voices.
**See also:**
* [Create a speaker
model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-tbe-create#tbe-create-speaker-model)
* [Rules for creating speaker
models](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-tbe-rules#tbe-rules-speakers).
# File 'lib/ibm_watson/text_to_speech_v1.rb', line 1171
def create_speaker_model(speaker_name:, audio:)
  raise ArgumentError.new("speaker_name must be provided") if speaker_name.nil?
  raise ArgumentError.new("audio must be provided") if audio.nil?

  headers = {}
  sdk_headers = Common.new.get_sdk_headers("text_to_speech", "V1", "create_speaker_model")
  headers.merge!(sdk_headers)

  params = {
    "speaker_name" => speaker_name
  }

  data = audio
  headers["Content-Type"] = "audio/wav"

  method_url = "/v1/speakers"

  response = request(
    method: "POST",
    url: method_url,
    headers: headers,
    params: params,
    data: data,
    accept_json: true
  )
  response
end
#delete_custom_model(customization_id: ) ⇒ nil
Delete a custom model. Deletes the specified custom model. You must use credentials for the instance of
the service that owns a model to delete it.
**See also:** [Deleting a custom
model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customModels#cuModelsDelete).
# File 'lib/ibm_watson/text_to_speech_v1.rb', line 559
def delete_custom_model(customization_id:)
  raise ArgumentError.new("customization_id must be provided") if customization_id.nil?

  headers = {}
  sdk_headers = Common.new.get_sdk_headers("text_to_speech", "V1", "delete_custom_model")
  headers.merge!(sdk_headers)

  method_url = "/v1/customizations/%s" % [ERB::Util.url_encode(customization_id)]

  request(
    method: "DELETE",
    url: method_url,
    headers: headers,
    accept_json: false
  )
  nil
end
#delete_custom_prompt(customization_id: , prompt_id: ) ⇒ nil
Delete a custom prompt. Deletes an existing custom prompt from a custom model. The service deletes the
prompt with the specified ID. You must use credentials for the instance of the
service that owns the custom model from which the prompt is to be deleted.
**Caution:** Deleting a custom prompt elicits a 400 response code from synthesis
requests that attempt to use the prompt. Make sure that you do not attempt to use
a deleted prompt in a production application. Custom prompts are supported only
for use with US English custom models and voices.
**See also:** [Deleting a custom
prompt](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-tbe-custom-prompts#tbe-custom-prompts-delete).
# File 'lib/ibm_watson/text_to_speech_v1.rb', line 1055
def delete_custom_prompt(customization_id:, prompt_id:)
  raise ArgumentError.new("customization_id must be provided") if customization_id.nil?
  raise ArgumentError.new("prompt_id must be provided") if prompt_id.nil?

  headers = {}
  sdk_headers = Common.new.get_sdk_headers("text_to_speech", "V1", "delete_custom_prompt")
  headers.merge!(sdk_headers)

  method_url = "/v1/customizations/%s/prompts/%s" % [ERB::Util.url_encode(customization_id), ERB::Util.url_encode(prompt_id)]

  request(
    method: "DELETE",
    url: method_url,
    headers: headers,
    accept_json: false
  )
  nil
end
#delete_speaker_model(speaker_id: ) ⇒ nil
Delete a speaker model. Deletes an existing speaker model from the service instance. The service deletes
the enrolled speaker with the specified speaker ID. You must use credentials for
the instance of the service that owns a speaker model to delete the speaker.
Any prompts that are associated with the deleted speaker are not affected by the
speaker's deletion. The prosodic data that defines the quality of a prompt is
established when the prompt is created. A prompt is static and remains unaffected
by deletion of its associated speaker. However, the prompt cannot be resubmitted
or updated with its original speaker once that speaker is deleted. Speaker models
and the custom prompts with which they are used are supported only for use with US
English custom models and voices.
**See also:** [Deleting a speaker
model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-tbe-speaker-models#tbe-speaker-models-delete).
# File 'lib/ibm_watson/text_to_speech_v1.rb', line 1257
def delete_speaker_model(speaker_id:)
  raise ArgumentError.new("speaker_id must be provided") if speaker_id.nil?

  headers = {}
  sdk_headers = Common.new.get_sdk_headers("text_to_speech", "V1", "delete_speaker_model")
  headers.merge!(sdk_headers)

  method_url = "/v1/speakers/%s" % [ERB::Util.url_encode(speaker_id)]

  request(
    method: "DELETE",
    url: method_url,
    headers: headers,
    accept_json: false
  )
  nil
end
#delete_user_data(customer_id: ) ⇒ nil
Delete labeled data. Deletes all data that is associated with a specified customer ID. The method
deletes all data for the customer ID, regardless of the method by which the
information was added. The method has no effect if no data is associated with the
customer ID. You must issue the request with credentials for the same instance of
the service that was used to associate the customer ID with the data. You
associate a customer ID with data by passing the `X-Watson-Metadata` header with a
request that passes the data.
**Note:** If you delete an instance of the service from the service console, all
data associated with that service instance is automatically deleted. This includes
all custom models and word/translation pairs, and all data related to speech
synthesis requests.
**See also:** [Information
security](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-information-security#information-security).
# File 'lib/ibm_watson/text_to_speech_v1.rb', line 1299
def delete_user_data(customer_id:)
  raise ArgumentError.new("customer_id must be provided") if customer_id.nil?

  headers = {}
  sdk_headers = Common.new.get_sdk_headers("text_to_speech", "V1", "delete_user_data")
  headers.merge!(sdk_headers)

  params = {
    "customer_id" => customer_id
  }

  method_url = "/v1/user_data"

  request(
    method: "DELETE",
    url: method_url,
    headers: headers,
    params: params,
    accept_json: false
  )
  nil
end
#delete_word(customization_id: , word: ) ⇒ nil
Delete a custom word. Deletes a single word from the specified custom model. You must use credentials
for the instance of the service that owns a model to delete its words.
**See also:** [Deleting a word from a custom
model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customWords#cuWordDelete).
# File 'lib/ibm_watson/text_to_speech_v1.rb', line 799
def delete_word(customization_id:, word:)
  raise ArgumentError.new("customization_id must be provided") if customization_id.nil?
  raise ArgumentError.new("word must be provided") if word.nil?

  headers = {}
  sdk_headers = Common.new.get_sdk_headers("text_to_speech", "V1", "delete_word")
  headers.merge!(sdk_headers)

  method_url = "/v1/customizations/%s/words/%s" % [ERB::Util.url_encode(customization_id), ERB::Util.url_encode(word)]

  request(
    method: "DELETE",
    url: method_url,
    headers: headers,
    accept_json: false
  )
  nil
end
#get_custom_model(customization_id: ) ⇒ IBMCloudSdkCore::DetailedResponse
Get a custom model. Gets all information about a specified custom model. In addition to metadata such
as the name and description of the custom model, the output includes the words and
their translations that are defined for the model, as well as any prompts that are
defined for the model. To see just the metadata for a model, use the [List custom
models](#listcustommodels) method.
**See also:** [Querying a custom
model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customModels#cuModelsQuery).
# File 'lib/ibm_watson/text_to_speech_v1.rb', line 529
def get_custom_model(customization_id:)
  raise ArgumentError.new("customization_id must be provided") if customization_id.nil?

  headers = {}
  sdk_headers = Common.new.get_sdk_headers("text_to_speech", "V1", "get_custom_model")
  headers.merge!(sdk_headers)

  method_url = "/v1/customizations/%s" % [ERB::Util.url_encode(customization_id)]

  response = request(
    method: "GET",
    url: method_url,
    headers: headers,
    accept_json: true
  )
  response
end
#get_custom_prompt(customization_id: , prompt_id: ) ⇒ IBMCloudSdkCore::DetailedResponse
Get a custom prompt. Gets information about a specified custom prompt for a specified custom model. The
information includes the prompt ID, prompt text, status, and optional speaker ID
for each prompt of the custom model. You must use credentials for the instance of
the service that owns the custom model. Custom prompts are supported only for use
with US English custom models and voices.
**See also:** [Listing custom
prompts](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-tbe-custom-prompts#tbe-custom-prompts-list).
# File 'lib/ibm_watson/text_to_speech_v1.rb', line 1016
def get_custom_prompt(customization_id:, prompt_id:)
  raise ArgumentError.new("customization_id must be provided") if customization_id.nil?
  raise ArgumentError.new("prompt_id must be provided") if prompt_id.nil?

  headers = {}
  sdk_headers = Common.new.get_sdk_headers("text_to_speech", "V1", "get_custom_prompt")
  headers.merge!(sdk_headers)

  method_url = "/v1/customizations/%s/prompts/%s" % [ERB::Util.url_encode(customization_id), ERB::Util.url_encode(prompt_id)]

  response = request(
    method: "GET",
    url: method_url,
    headers: headers,
    accept_json: true
  )
  response
end
#get_pronunciation(text: , voice: nil, format: nil, customization_id: nil) ⇒ IBMCloudSdkCore::DetailedResponse
Get pronunciation. Gets the phonetic pronunciation for the specified word. You can request the
pronunciation for a specific format. You can also request the pronunciation for a
specific voice to see the default translation for the language of that voice or
for a specific custom model to see the translation for that model.
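A thin wrapper can make the common case of fetching just the pronunciation string more convenient. The sketch below is hypothetical: `pronunciation_for` is not part of the SDK, `client` is any object that responds to `get_pronunciation`, the response is assumed to expose a `result` hash with a `"pronunciation"` key, and the explicit `"ipa"` default is a choice made for this example.

```ruby
# Hypothetical wrapper: return only the pronunciation string for a word.
# Assumes `result` is a Hash with a "pronunciation" key.
def pronunciation_for(client, text, voice: nil, format: "ipa", customization_id: nil)
  response = client.get_pronunciation(
    text: text,
    voice: voice,
    format: format,
    customization_id: customization_id
  )
  response.result["pronunciation"]
end
```

Passing `customization_id:` would return the translation defined in that custom model rather than the default for the voice's language.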
**See also:** [Querying a word from a
language](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customWords#cuWordsQueryLanguage).
**Note:** The Arabic, Chinese, Czech, Dutch (Belgian and Netherlands), Australian
English, Korean, and Swedish languages and voices are supported only for IBM
Cloud; they are deprecated for IBM Cloud Pak for Data. Also, the `ar-AR_OmarVoice`
voice is deprecated; use the `ar-MS_OmarVoice` voice instead.
# File 'lib/ibm_watson/text_to_speech_v1.rb', line 327
def get_pronunciation(text:, voice: nil, format: nil, customization_id: nil)
  raise ArgumentError.new("text must be provided") if text.nil?

  headers = {}
  sdk_headers = Common.new.get_sdk_headers("text_to_speech", "V1", "get_pronunciation")
  headers.merge!(sdk_headers)

  params = {
    "text" => text,
    "voice" => voice,
    "format" => format,
    "customization_id" => customization_id
  }

  method_url = "/v1/pronunciation"

  response = request(
    method: "GET",
    url: method_url,
    headers: headers,
    params: params,
    accept_json: true
  )
  response
end
#get_speaker_model(speaker_id: ) ⇒ IBMCloudSdkCore::DetailedResponse
Get a speaker model. Gets information about all prompts that are defined by a specified speaker for all
custom models that are owned by a service instance. The information is grouped by
the customization IDs of the custom models. For each custom model, the information
lists information about each prompt that is defined for that custom model by the
speaker. You must use credentials for the instance of the service that owns a
speaker model to list its prompts. Speaker models and the custom prompts with
which they are used are supported only for use with US English custom models and
voices.
**See also:** [Listing the custom prompts for a speaker
model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-tbe-speaker-models#tbe-speaker-models-list-prompts).
# File 'lib/ibm_watson/text_to_speech_v1.rb', line 1218
def get_speaker_model(speaker_id:)
  raise ArgumentError.new("speaker_id must be provided") if speaker_id.nil?

  headers = {}
  sdk_headers = Common.new.get_sdk_headers("text_to_speech", "V1", "get_speaker_model")
  headers.merge!(sdk_headers)

  method_url = "/v1/speakers/%s" % [ERB::Util.url_encode(speaker_id)]

  response = request(
    method: "GET",
    url: method_url,
    headers: headers,
    accept_json: true
  )
  response
end
#get_voice(voice: , customization_id: nil) ⇒ IBMCloudSdkCore::DetailedResponse
Get a voice. Gets information about the specified voice. The information includes the name,
language, gender, and other details about the voice. Specify a customization ID to
obtain information for a custom model that is defined for the language of the
specified voice. To list information about all available voices, use the [List
voices](#listvoices) method.
**See also:** [Listing a specific
voice](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-voices#listVoice).
**Note:** The Arabic, Chinese, Czech, Dutch (Belgian and Netherlands), Australian
English, Korean, and Swedish languages and voices are supported only for IBM
Cloud; they are deprecated for IBM Cloud Pak for Data. Also, the `ar-AR_OmarVoice`
voice is deprecated; use the `ar-MS_OmarVoice` voice instead.
# File 'lib/ibm_watson/text_to_speech_v1.rb', line 135
def get_voice(voice:, customization_id: nil)
  raise ArgumentError.new("voice must be provided") if voice.nil?

  headers = {}
  sdk_headers = Common.new.get_sdk_headers("text_to_speech", "V1", "get_voice")
  headers.merge!(sdk_headers)

  params = {
    "customization_id" => customization_id
  }

  method_url = "/v1/voices/%s" % [ERB::Util.url_encode(voice)]

  response = request(
    method: "GET",
    url: method_url,
    headers: headers,
    params: params,
    accept_json: true
  )
  response
end
#get_word(customization_id: , word: ) ⇒ IBMCloudSdkCore::DetailedResponse
Get a custom word. Gets the translation for a single word from the specified custom model. The output
shows the translation as it is defined in the model. You must use credentials for
the instance of the service that owns a model to list its words.
**See also:** [Querying a single word from a custom
model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customWords#cuWordQueryModel).
    # File 'lib/ibm_watson/text_to_speech_v1.rb', line 766
    def get_word(customization_id:, word:)
      raise ArgumentError.new("customization_id must be provided") if customization_id.nil?

      raise ArgumentError.new("word must be provided") if word.nil?

      headers = {}
      sdk_headers = Common.new.get_sdk_headers("text_to_speech", "V1", "get_word")
      headers.merge!(sdk_headers)

      method_url = "/v1/customizations/%s/words/%s" % [ERB::Util.url_encode(customization_id), ERB::Util.url_encode(word)]

      response = request(
        method: "GET",
        url: method_url,
        headers: headers,
        accept_json: true
      )
      response
    end
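The path construction in `get_word` percent-encodes both the customization ID and the word, which matters for words that contain spaces, slashes, or non-ASCII characters. The following is a minimal standalone sketch of that step only; `word_path` is a hypothetical helper, not part of the SDK, and the IDs and words are made up.

```ruby
require "erb"

# Hypothetical helper that mirrors the format string used by get_word.
# Each path segment is percent-encoded separately so that a "/" inside
# a word cannot be mistaken for a path separator.
def word_path(customization_id, word)
  "/v1/customizations/%s/words/%s" % [
    ERB::Util.url_encode(customization_id),
    ERB::Util.url_encode(word)
  ]
end

puts word_path("abc-123", "déjà vu")
# prints /v1/customizations/abc-123/words/d%C3%A9j%C3%A0%20vu
```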
#list_custom_models(language: nil) ⇒ IBMCloudSdkCore::DetailedResponse
List custom models. Lists metadata such as the name and description for all custom models that are
owned by an instance of the service. Specify a language to list the custom models
for that language only. To see the words and prompts in addition to the metadata
for a specific custom model, use the [Get a custom model](#getcustommodel) method.
You must use credentials for the instance of the service that owns a model to list
information about it.
**See also:** [Querying all custom
models](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customModels#cuModelsQueryAll).
    # File 'lib/ibm_watson/text_to_speech_v1.rb', line 428
    def list_custom_models(language: nil)
      headers = {}
      sdk_headers = Common.new.get_sdk_headers("text_to_speech", "V1", "list_custom_models")
      headers.merge!(sdk_headers)

      params = {
        "language" => language
      }

      method_url = "/v1/customizations"

      response = request(
        method: "GET",
        url: method_url,
        headers: headers,
        params: params,
        accept_json: true
      )
      response
    end
#list_custom_prompts(customization_id: ) ⇒ IBMCloudSdkCore::DetailedResponse
List custom prompts. Lists information about all custom prompts that are defined for a custom model.
The information includes the prompt ID, prompt text, status, and optional speaker
ID for each prompt of the custom model. You must use credentials for the instance
of the service that owns the custom model. The same information about all of the
prompts for a custom model is also provided by the [Get a custom
model](#getcustommodel) method. That method provides complete details about a
specified custom model, including its language, owner, custom words, and more.
Custom prompts are supported only for use with US English custom models and
voices.
**See also:** [Listing custom
prompts](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-tbe-custom-prompts#tbe-custom-prompts-list).
    # File 'lib/ibm_watson/text_to_speech_v1.rb', line 841
    def list_custom_prompts(customization_id:)
      raise ArgumentError.new("customization_id must be provided") if customization_id.nil?

      headers = {}
      sdk_headers = Common.new.get_sdk_headers("text_to_speech", "V1", "list_custom_prompts")
      headers.merge!(sdk_headers)

      method_url = "/v1/customizations/%s/prompts" % [ERB::Util.url_encode(customization_id)]

      response = request(
        method: "GET",
        url: method_url,
        headers: headers,
        accept_json: true
      )
      response
    end
#list_speaker_models ⇒ IBMCloudSdkCore::DetailedResponse
List speaker models. Lists information about all speaker models that are defined for a service
instance. The information includes the speaker ID and speaker name of each defined
speaker. You must use credentials for the instance of a service to list its
speakers. Speaker models and the custom prompts with which they are used are
supported only for use with US English custom models and voices.
**See also:** [Listing speaker
models](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-tbe-speaker-models#tbe-speaker-models-list).
    # File 'lib/ibm_watson/text_to_speech_v1.rb', line 1091
    def list_speaker_models
      headers = {}
      sdk_headers = Common.new.get_sdk_headers("text_to_speech", "V1", "list_speaker_models")
      headers.merge!(sdk_headers)

      method_url = "/v1/speakers"

      response = request(
        method: "GET",
        url: method_url,
        headers: headers,
        accept_json: true
      )
      response
    end
#list_voices ⇒ IBMCloudSdkCore::DetailedResponse
List voices. Lists all voices available for use with the service. The information includes the
name, language, gender, and other details about the voice. The ordering of the
list of voices can change from call to call; do not rely on an alphabetized or
static list of voices. To see information about a specific voice, use the [Get a
voice](#getvoice) method.
**See also:** [Listing all available
voices](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-voices#listVoices).
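Because the order of the returned list is not stable, callers may prefer to look voices up by name rather than by position. The sketch below works on a hand-written response body; the `voices` key and the per-voice fields are assumptions based on the description above, and the sample entries are illustrative only.

```ruby
require "json"

# Hypothetical response body (assumed shape, illustrative entries).
body = JSON.parse(<<~JSON)
  {"voices": [
    {"name": "en-US_MichaelV3Voice", "language": "en-US", "gender": "male"},
    {"name": "de-DE_BirgitV3Voice", "language": "de-DE", "gender": "female"},
    {"name": "en-US_AllisonV3Voice", "language": "en-US", "gender": "female"}
  ]}
JSON

# Index by name instead of relying on the position in the list.
voices_by_name = body["voices"].to_h { |voice| [voice["name"], voice] }

# Sort on the client when a stable, alphabetized order is needed.
sorted_names = voices_by_name.keys.sort

puts voices_by_name["de-DE_BirgitV3Voice"]["language"]
puts sorted_names.first
```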
    # File 'lib/ibm_watson/text_to_speech_v1.rb', line 95
    def list_voices
      headers = {}
      sdk_headers = Common.new.get_sdk_headers("text_to_speech", "V1", "list_voices")
      headers.merge!(sdk_headers)

      method_url = "/v1/voices"

      response = request(
        method: "GET",
        url: method_url,
        headers: headers,
        accept_json: true
      )
      response
    end
#list_words(customization_id: ) ⇒ IBMCloudSdkCore::DetailedResponse
List custom words. Lists all of the words and their translations for the specified custom model. The
output shows the translations as they are defined in the model. You must use
credentials for the instance of the service that owns a model to list its words.
**See also:** [Querying all words from a custom
model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customWords#cuWordsQueryModel).
    # File 'lib/ibm_watson/text_to_speech_v1.rb', line 660
    def list_words(customization_id:)
      raise ArgumentError.new("customization_id must be provided") if customization_id.nil?

      headers = {}
      sdk_headers = Common.new.get_sdk_headers("text_to_speech", "V1", "list_words")
      headers.merge!(sdk_headers)

      method_url = "/v1/customizations/%s/words" % [ERB::Util.url_encode(customization_id)]

      response = request(
        method: "GET",
        url: method_url,
        headers: headers,
        accept_json: true
      )
      response
    end
#synthesize(text: , accept: nil, voice: nil, customization_id: nil) ⇒ IBMCloudSdkCore::DetailedResponse
Synthesize audio. Synthesizes text to audio that is spoken in the specified voice. The service bases
its understanding of the language for the input text on the specified voice. Use a
voice that matches the language of the input text.
The method accepts a maximum of 5 KB of input text in the body of the request, and
8 KB for the URL and headers. The 5 KB limit includes any SSML that you
specify. The service returns the synthesized audio stream as an array of bytes.
**See also:** [The HTTP
interface](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-usingHTTP#usingHTTP).
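A client-side size check before calling `synthesize` can catch oversized input early. The sketch below assumes that 1 KB means 1024 bytes and that the limit applies to the encoded byte length of the text; neither detail is stated above, so treat the constant as an assumption. `within_text_limit?` is a hypothetical helper, not part of the SDK.

```ruby
# Assumption: 1 KB is taken as 1024 bytes.
MAX_TEXT_BYTES = 5 * 1024

# Returns true when the text, including any SSML markup, fits the limit.
# String#bytesize counts encoded bytes, so multibyte characters count
# for more than one.
def within_text_limit?(text)
  text.bytesize <= MAX_TEXT_BYTES
end

puts within_text_limit?("Hello, world")   # prints true
puts within_text_limit?("é" * 2561)       # prints false (5122 bytes in UTF-8)
```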
**Note:** The Arabic, Chinese, Czech, Dutch (Belgian and Netherlands), Australian
English, Korean, and Swedish languages and voices are supported only for IBM
Cloud; they are deprecated for IBM Cloud Pak for Data. Also, the `ar-AR_OmarVoice`
voice is deprecated; use the `ar-MS_OmarVoice` voice instead.
### Audio formats (accept types)
The service can return audio in the following formats (MIME types).
* Where indicated, you can optionally specify the sampling rate (`rate`) of the
audio. You must specify a sampling rate for the `audio/l16` and `audio/mulaw`
formats. A specified sampling rate must lie in the range of 8 kHz to 192 kHz. Some
formats restrict the sampling rate to certain values, as noted.
* For the `audio/l16` format, you can optionally specify the endianness
(`endianness`) of the audio: `endianness=big-endian` or
`endianness=little-endian`.
Use the `Accept` header or the `accept` parameter to specify the requested format
of the response audio. If you omit an audio format altogether, the service returns
the audio in Ogg format with the Opus codec (`audio/ogg;codecs=opus`). The service
always returns single-channel audio.
* `audio/basic` - The service returns audio with a sampling rate of 8000 Hz.
* `audio/flac` - You can optionally specify the `rate` of the audio. The default
sampling rate is 22,050 Hz.
* `audio/l16` - You must specify the `rate` of the audio. You can optionally
specify the `endianness` of the audio. The default endianness is `little-endian`.
* `audio/mp3` - You can optionally specify the `rate` of the audio. The default
sampling rate is 22,050 Hz.
* `audio/mpeg` - You can optionally specify the `rate` of the audio. The default
sampling rate is 22,050 Hz.
* `audio/mulaw` - You must specify the `rate` of the audio.
* `audio/ogg` - The service returns the audio in the `vorbis` codec. You can
optionally specify the `rate` of the audio. The default sampling rate is 22,050
Hz.
* `audio/ogg;codecs=opus` - You can optionally specify the `rate` of the audio.
Only the following values are valid sampling rates: `48000`, `24000`, `16000`,
`12000`, or `8000`. If you specify a value other than one of these, the service
returns an error. The default sampling rate is 48,000 Hz.
* `audio/ogg;codecs=vorbis` - You can optionally specify the `rate` of the audio.
The default sampling rate is 22,050 Hz.
* `audio/wav` - You can optionally specify the `rate` of the audio. The default
sampling rate is 22,050 Hz.
* `audio/webm` - The service returns the audio in the `opus` codec. The service
returns audio with a sampling rate of 48,000 Hz.
* `audio/webm;codecs=opus` - The service returns audio with a sampling rate of
48,000 Hz.
* `audio/webm;codecs=vorbis` - You can optionally specify the `rate` of the audio.
The default sampling rate is 22,050 Hz.
For more information about specifying an audio format, including additional
details about some of the formats, see [Using audio
formats](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-audio-formats).
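The rate and endianness rules above can be checked before a request is sent. The following sketch builds a value for the `accept` parameter and enforces only the constraints spelled out in this section: the required rate for `audio/l16` and `audio/mulaw`, the 8 kHz to 192 kHz range, and the fixed rates for `audio/ogg;codecs=opus`. `accept_value` is a hypothetical helper, not part of the SDK, and it does not cover every per-format restriction.

```ruby
# Formats that need an explicit rate, and the rates that
# audio/ogg;codecs=opus accepts, as listed above.
RATE_REQUIRED = ["audio/l16", "audio/mulaw"].freeze
OGG_OPUS_RATES = [48_000, 24_000, 16_000, 12_000, 8_000].freeze

# Hypothetical helper: joins the format with optional rate and
# endianness parameters, separated by semicolons.
def accept_value(format, rate: nil, endianness: nil)
  if rate.nil? && RATE_REQUIRED.include?(format)
    raise ArgumentError, "#{format} requires a rate"
  end
  if rate && !(8_000..192_000).cover?(rate)
    raise ArgumentError, "rate must be between 8000 and 192000 Hz"
  end
  if rate && format == "audio/ogg;codecs=opus" && !OGG_OPUS_RATES.include?(rate)
    raise ArgumentError, "unsupported rate for #{format}"
  end
  if endianness && format != "audio/l16"
    raise ArgumentError, "endianness applies only to audio/l16"
  end

  parts = [format]
  parts << "rate=#{rate}" if rate
  parts << "endianness=#{endianness}" if endianness
  parts.join(";")
end

puts accept_value("audio/l16", rate: 16_000, endianness: "big-endian")
# prints audio/l16;rate=16000;endianness=big-endian
```

The resulting string could then be passed as the `accept:` argument of `synthesize`.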
### Warning messages
If a request includes invalid query parameters, the service returns a `Warnings`
response header that provides messages about the invalid parameters. The warning
includes a descriptive message and a list of invalid argument strings. For
example, a message such as `"Unknown arguments:"` or `"Unknown url query
arguments:"` followed by a list of the form `"{invalid_arg_1}, {invalid_arg_2}."`
The request succeeds despite the warnings.
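A caller that wants to log or surface these warnings could split the header value into its message and argument list. The sketch below assumes the value is a single string shaped exactly like the example in this section (a message, a colon, then a comma-separated list ending in a period); the real header layout may differ, and `parse_warning` is a hypothetical helper, not part of the SDK.

```ruby
# Splits a warning string into its descriptive message and the list of
# invalid argument names. Assumes the "message: arg_1, arg_2." shape.
def parse_warning(warning)
  message, _separator, args = warning.partition(":")
  names = args.strip.chomp(".").split(",").map(&:strip).reject(&:empty?)
  { message: message.strip, arguments: names }
end

parsed = parse_warning("Unknown arguments: foo, bar.")
puts parsed[:message]            # prints Unknown arguments
puts parsed[:arguments].inspect  # prints ["foo", "bar"]
```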
    # File 'lib/ibm_watson/text_to_speech_v1.rb', line 262
    def synthesize(text:, accept: nil, voice: nil, customization_id: nil)
      raise ArgumentError.new("text must be provided") if text.nil?

      headers = {
        "Accept" => accept
      }
      sdk_headers = Common.new.get_sdk_headers("text_to_speech", "V1", "synthesize")
      headers.merge!(sdk_headers)

      params = {
        "voice" => voice,
        "customization_id" => customization_id
      }

      data = {
        "text" => text
      }

      method_url = "/v1/synthesize"

      response = request(
        method: "POST",
        url: method_url,
        headers: headers,
        params: params,
        json: data,
        accept_json: false
      )
      response
    end
#update_custom_model(customization_id: , name: nil, description: nil, words: nil) ⇒ nil
Update a custom model. Updates information for the specified custom model. You can update metadata such
as the name and description of the model. You can also update the words in the
model and their translations. Adding a new translation for a word that already
exists in a custom model overwrites the word's existing translation. A custom
model can contain no more than 20,000 entries. You must use credentials for the
instance of the service that owns a model to update it.
You can define sounds-like or phonetic translations for words. A sounds-like
translation consists of one or more words that, when combined, sound like the
word. Phonetic translations are based on the SSML phoneme format for representing
a word. You can specify them in standard International Phonetic Alphabet (IPA)
representation
<code><phoneme alphabet="ipa"
ph="təmˈɑto"></phoneme></code>
or in the proprietary IBM Symbolic Phonetic Representation (SPR)
<code><phoneme alphabet="ibm"
ph="1gAstroEntxrYFXs"></phoneme></code>
**See also:**
* [Updating a custom
model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customModels#cuModelsUpdate)
* [Adding words to a Japanese custom
model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customWords#cuJapaneseAdd)
* [Understanding
customization](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customIntro#customIntro).
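As a rough illustration of the two translation styles, the sketch below builds a `words` array that mixes a phonetic (IPA) entry and a sounds-like entry. The `"word"` and `"translation"` hash keys are an assumption modeled on the `add_word` parameters, the sample entries are made up, and `ipa_translation` and `build_words` are hypothetical helpers rather than SDK methods. The size check covers only a single request; it does not know how many entries the model already holds.

```ruby
# Limit stated above for a custom model.
MAX_ENTRIES = 20_000

# Wraps an IPA string in the SSML phoneme element shown above.
def ipa_translation(ph)
  %(<phoneme alphabet="ipa" ph="#{ph}"></phoneme>)
end

# Turns a { word => translation } hash into an array of entry hashes.
# Assumed entry shape: {"word" => ..., "translation" => ...}.
def build_words(entries)
  raise ArgumentError, "too many entries" if entries.size > MAX_ENTRIES

  entries.map do |word, translation|
    { "word" => word, "translation" => translation }
  end
end

words = build_words({
  "tomato" => ipa_translation("təmˈɑto"), # phonetic translation (IPA)
  "NCAA" => "N C double A"                # sounds-like translation
})

puts words.first["translation"]
# prints <phoneme alphabet="ipa" ph="təmˈɑto"></phoneme>
```

The resulting array could then be passed as the `words:` argument of `update_custom_model`.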
    # File 'lib/ibm_watson/text_to_speech_v1.rb', line 489
    def update_custom_model(customization_id:, name: nil, description: nil, words: nil)
      raise ArgumentError.new("customization_id must be provided") if customization_id.nil?

      headers = {}
      sdk_headers = Common.new.get_sdk_headers("text_to_speech", "V1", "update_custom_model")
      headers.merge!(sdk_headers)

      data = {
        "name" => name,
        "description" => description,
        "words" => words
      }

      method_url = "/v1/customizations/%s" % [ERB::Util.url_encode(customization_id)]

      request(
        method: "POST",
        url: method_url,
        headers: headers,
        json: data,
        accept_json: true
      )
      nil
    end