Class: IBMWatson::TextToSpeechV1

Inherits:
IBMCloudSdkCore::BaseService
  • Object
Includes:
Concurrent::Async
Defined in:
lib/ibm_watson/text_to_speech_v1.rb

Overview

The Text to Speech V1 service.

Constant Summary

DEFAULT_SERVICE_NAME =
"text_to_speech"
DEFAULT_SERVICE_URL =
"https://api.us-south.text-to-speech.watson.cloud.ibm.com"

Instance Method Summary

Constructor Details

#initialize(args) ⇒ TextToSpeechV1

Construct a new client for the Text to Speech service.

Parameters:

  • args (Hash)

    The args to initialize with

Options Hash (args):

  • service_url (String)

    The base service URL to use when contacting the service. The base service_url may differ between IBM Cloud regions.

  • authenticator (Object)

    The Authenticator instance to be configured for this service.

  • service_name (String)

    The name of the service to configure. Will be used as the key to load any external configuration, if applicable.



# File 'lib/ibm_watson/text_to_speech_v1.rb', line 66

def initialize(args = {})
  @__async_initialized__ = false
  defaults = {}
  defaults[:service_url] = DEFAULT_SERVICE_URL
  defaults[:service_name] = DEFAULT_SERVICE_NAME
  defaults[:authenticator] = nil
  user_service_url = args[:service_url] unless args[:service_url].nil?
  args = defaults.merge(args)
  args[:authenticator] = IBMCloudSdkCore::ConfigBasedAuthenticatorFactory.new.get_authenticator(service_name: args[:service_name]) if args[:authenticator].nil?
  super
  @service_url = user_service_url unless user_service_url.nil?
end
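
As a rough illustration of the option handling in the constructor above, the following stand-alone sketch mirrors how caller-supplied options are layered over the defaults with `Hash#merge`. The helper `resolve_options` and the URL `https://example.test/tts` are hypothetical names used only for illustration; the real constructor also resolves an authenticator and calls the superclass, which this sketch leaves out.

```ruby
# Hypothetical stand-alone sketch; not part of IBMWatson::TextToSpeechV1.
# Layers caller-supplied options over the documented defaults.
def resolve_options(args = {})
  defaults = {
    service_url: "https://api.us-south.text-to-speech.watson.cloud.ibm.com",
    service_name: "text_to_speech",
    authenticator: nil
  }
  defaults.merge(args)
end

# The example URL below is a placeholder, not a real service endpoint.
options = resolve_options(service_url: "https://example.test/tts")
puts options[:service_url]  # the caller's URL replaces the default
puts options[:service_name] # falls back to the default service name
```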

Instance Method Details

#add_custom_prompt(customization_id: , prompt_id: , metadata: , file: ) ⇒ IBMCloudSdkCore::DetailedResponse

Add a custom prompt.

Adds a custom prompt to a custom model. A prompt is defined by the text that is to
be spoken, the audio for that text, a unique user-specified ID for the prompt, and
an optional speaker ID. The information is used to generate prosodic data that is
not visible to the user. This data is used by the service to produce the
synthesized audio upon request. You must use credentials for the instance of the
service that owns a custom model to add a prompt to it. You can add a maximum of
1000 custom prompts to a single custom model.

Assign meaningful values to prompt IDs. For example, use
`goodbye` to identify a prompt that speaks a farewell message. Prompt IDs must be
unique within a given custom model. You cannot define two prompts with the same
name for the same custom model. If you provide the ID of an existing prompt, the
previously uploaded prompt is replaced by the new information. The existing prompt
is reprocessed by using the new text and audio and, if provided, new speaker
model, and the prosody data associated with the prompt is updated.

The quality of a prompt is undefined if the language of a prompt does not match
the language of its custom model. This is consistent with any text or SSML that is
specified for a speech synthesis request. The service makes a best-effort attempt
to render the specified text for the prompt; it does not validate that the
language of the text matches the language of the model.

Adding a prompt is an asynchronous operation. Although it accepts less audio than
speaker enrollment, the service must align the audio with the provided text. The
time that it takes to process a prompt depends on the prompt itself. The
processing time for a reasonably sized prompt generally matches the length of the
audio (for example, it takes 20 seconds to process a 20-second prompt).

For shorter prompts, you can wait for a reasonable amount of time and then check
the status of the prompt with the [Get a custom prompt](#getcustomprompt) method.
For longer prompts, consider using that method to poll the service every few
seconds to determine when the prompt becomes available. No prompt can be used for
speech synthesis if it is in the `processing` or `failed` state. Only prompts that
are in the `available` state can be used for speech synthesis.
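
The polling approach described above could be sketched roughly as follows. This is a minimal sketch under stated assumptions: `wait_for_prompt` is a hypothetical helper, not an SDK method, and it assumes the response object exposes the parsed body as `result` with a `"status"` key. A stub client replaces the real service so that the example runs offline.

```ruby
# Hypothetical polling helper, not part of the SDK. `client` is any object that
# responds to get_custom_prompt(customization_id:, prompt_id:) and returns an
# object whose `result` is a Hash with a "status" key (assumed response shape).
def wait_for_prompt(client, customization_id:, prompt_id:, interval: 5, max_attempts: 12)
  max_attempts.times do
    response = client.get_custom_prompt(customization_id: customization_id, prompt_id: prompt_id)
    status = response.result["status"]
    # Stop as soon as the prompt leaves the `processing` state.
    return status if %w[available failed].include?(status)

    sleep(interval)
  end
  "processing" # still unfinished after max_attempts polls
end

# Offline stub standing in for the real client so the sketch runs as-is.
statuses = %w[processing processing available]
fake_response = Struct.new(:result)
stub_client = Object.new
stub_client.define_singleton_method(:get_custom_prompt) do |customization_id:, prompt_id:|
  fake_response.new({ "status" => statuses.shift })
end

final_status = wait_for_prompt(stub_client, customization_id: "my-model-id",
                                            prompt_id: "goodbye", interval: 0)
puts final_status
```

With a real client, an `interval` of a few seconds matches the guidance above; the `max_attempts` cap is an illustrative choice that keeps the loop from running forever if a prompt never leaves `processing`.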

When it processes a request, the service attempts to align the text and the audio
that are provided for the prompt. The text that is passed with a prompt must match
the spoken audio as closely as possible. Optimally, the text and audio match
exactly. The service does its best to align the specified text with the audio, and
it can often compensate for mismatches between the two. But if the service cannot
effectively align the text and the audio, possibly because the magnitude of
mismatches between the two is too great, processing of the prompt fails.

### Evaluating a prompt

Always listen to and evaluate a prompt to determine its quality before using it
in production. To evaluate a prompt, include only the single prompt in a speech
synthesis request by using the following SSML extension, in this case for a prompt
whose ID is `goodbye`:

`<ibm:prompt id="goodbye"/>`

In some cases, you might need to rerecord and resubmit a prompt as many as five
times to address the following possible problems:
* The service might fail to detect a mismatch between the prompt's text and audio.
The longer the prompt, the greater the chance for misalignment between its text
and audio. Therefore, multiple shorter prompts are preferable to a single long
prompt.
* The text of a prompt might include a word that the service does not recognize.
In this case, you can create a custom word and pronunciation pair to tell the
service how to pronounce the word. You must then re-create the prompt.
* The quality of the input audio might be insufficient or the service's processing
of the audio might fail to detect the intended prosody. Submitting new audio for
the prompt can correct these issues.

If a prompt that is created without a speaker ID does not adequately reflect the
intended prosody, enrolling the speaker and providing a speaker ID for the prompt
is one recommended means of potentially improving the quality of the prompt. This
is especially important for shorter prompts such as "good-bye" or "thank you,"
where less audio data makes it more difficult to match the prosody of the speaker.
Custom prompts are supported only for use with US English custom models and
voices.

**See also:**
* [Add a custom
prompt](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-tbe-create#tbe-create-add-prompt)
* [Evaluate a custom
prompt](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-tbe-create#tbe-create-evaluate-prompt)
* [Rules for creating custom
prompts](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-tbe-rules#tbe-rules-prompts).

Parameters:

  • customization_id (String)

    The customization ID (GUID) of the custom model. You must make the request with credentials for the instance of the service that owns the custom model.

  • prompt_id (String)

    The identifier of the prompt that is to be added to the custom model:

    • Include a maximum of 49 characters in the ID.

    • Include only alphanumeric characters and `_` (underscores) in the ID.

    • Do not include XML sensitive characters (double quotes, single quotes, ampersands, angle brackets, and slashes) in the ID.

    • To add a new prompt, the ID must be unique for the specified custom model. Otherwise, the new information for the prompt overwrites the existing prompt that has that ID.

  • metadata (PromptMetadata)

    Information about the prompt that is to be added to a custom model. The following example of a `PromptMetadata` object includes both the required prompt text and an optional speaker model ID:

    `{ "prompt_text": "Thank you and good-bye!", "speaker_id": "823068b2-ed4e-11ea-b6e0-7b6456aa95cc" }`.

  • file (File)

    An audio file that speaks the text of the prompt with intonation and prosody that matches how you would like the prompt to be spoken.

    • The prompt audio must be in WAV format and must have a minimum sampling rate of 16 kHz. The service accepts audio with higher sampling rates. The service transcodes all audio to 16 kHz before processing it.

    • The length of the prompt audio is limited to 30 seconds.
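
The parameter rules above lend themselves to a small pre-flight check. The sketch below is illustrative only: `valid_prompt_id?` and `prompt_metadata` are hypothetical helpers, and treating "alphanumeric" as ASCII letters and digits is an assumption. Because the source listing for this method serializes the metadata with `to_s`, passing an already-serialized JSON string is one plausible way to make sure the part that is sent is valid JSON.

```ruby
require "json"

# Hypothetical helper: checks a prompt ID against the documented rules
# (at most 49 characters, only alphanumerics and underscores).
# Assumes "alphanumeric" means ASCII letters and digits.
def valid_prompt_id?(prompt_id)
  prompt_id.match?(/\A[A-Za-z0-9_]{1,49}\z/)
end

# Hypothetical helper: serializes prompt metadata to a JSON string,
# omitting the optional speaker ID when none is given.
def prompt_metadata(prompt_text:, speaker_id: nil)
  metadata = { "prompt_text" => prompt_text }
  metadata["speaker_id"] = speaker_id unless speaker_id.nil?
  JSON.generate(metadata)
end

puts valid_prompt_id?("goodbye")   # acceptable ID
puts valid_prompt_id?("good-bye!") # hyphen and "!" are not allowed
puts prompt_metadata(prompt_text: "Thank you and good-bye!")
```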

Returns:

  • (IBMCloudSdkCore::DetailedResponse)

    An `IBMCloudSdkCore::DetailedResponse` object representing the response.

Raises:

  • (ArgumentError)


# File 'lib/ibm_watson/text_to_speech_v1.rb', line 966

def add_custom_prompt(customization_id:, prompt_id:, metadata:, file:)
  raise ArgumentError.new("customization_id must be provided") if customization_id.nil?

  raise ArgumentError.new("prompt_id must be provided") if prompt_id.nil?

  raise ArgumentError.new("metadata must be provided") if metadata.nil?

  raise ArgumentError.new("file must be provided") if file.nil?

  headers = {
  }
  sdk_headers = Common.new.get_sdk_headers("text_to_speech", "V1", "add_custom_prompt")
  headers.merge!(sdk_headers)

  form_data = {}

  form_data[:metadata] = HTTP::FormData::Part.new(metadata.to_s, content_type: "application/json")

  unless file.instance_of?(StringIO) || file.instance_of?(File)
    file = file.respond_to?(:to_json) ? StringIO.new(file.to_json) : StringIO.new(file)
  end
  form_data[:file] = HTTP::FormData::File.new(file, content_type: "audio/wav", filename: file.respond_to?(:path) ? file.path : nil)

  method_url = "/v1/customizations/%s/prompts/%s" % [ERB::Util.url_encode(customization_id), ERB::Util.url_encode(prompt_id)]

  response = request(
    method: "POST",
    url: method_url,
    headers: headers,
    form: form_data,
    accept_json: true
  )
  response
end

#add_word(customization_id: , word: , translation: , part_of_speech: nil) ⇒ nil

Add a custom word.

Adds a single word and its translation to the specified custom model. Adding a new
translation for a word that already exists in a custom model overwrites the word's
existing translation. A custom model can contain no more than 20,000 entries. You
must use credentials for the instance of the service that owns a model to add a
word to it.

You can define sounds-like or phonetic translations for words. A sounds-like
translation consists of one or more words that, when combined, sound like the
word. Phonetic translations are based on the SSML phoneme format for representing
a word. You can specify them in standard International Phonetic Alphabet (IPA)
representation

  `<phoneme alphabet="ipa" ph="təmˈɑto"></phoneme>`

  or in the proprietary IBM Symbolic Phonetic Representation (SPR)

  `<phoneme alphabet="ibm" ph="1gAstroEntxrYFXs"></phoneme>`

**See also:**
* [Adding a single word to a custom
model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customWords#cuWordAdd)
* [Adding words to a Japanese custom
model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customWords#cuJapaneseAdd)
* [Understanding
customization](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customIntro#customIntro).

Parameters:

  • customization_id (String)

    The customization ID (GUID) of the custom model. You must make the request with credentials for the instance of the service that owns the custom model.

  • word (String)

    The word that is to be added or updated for the custom model.

  • translation (String)

    The phonetic or sounds-like translation for the word. A phonetic translation is based on the SSML format for representing the phonetic string of a word either as an IPA translation or as an IBM SPR translation. The Arabic, Chinese, Dutch, Australian English, and Korean languages support only IPA. A sounds-like is one or more words that, when combined, sound like the word.

  • part_of_speech (String) (defaults to: nil)

    **Japanese only.** The part of speech for the word. The service uses the value to produce the correct intonation for the word. You can create only a single entry, with or without a single part of speech, for any word; you cannot create multiple entries with different parts of speech for the same word. For more information, see [Working with Japanese entries](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-rules#jaNotes).

Returns:

  • (nil)

Raises:

  • (ArgumentError)


# File 'lib/ibm_watson/text_to_speech_v1.rb', line 724

def add_word(customization_id:, word:, translation:, part_of_speech: nil)
  raise ArgumentError.new("customization_id must be provided") if customization_id.nil?

  raise ArgumentError.new("word must be provided") if word.nil?

  raise ArgumentError.new("translation must be provided") if translation.nil?

  headers = {
  }
  sdk_headers = Common.new.get_sdk_headers("text_to_speech", "V1", "add_word")
  headers.merge!(sdk_headers)

  data = {
    "translation" => translation,
    "part_of_speech" => part_of_speech
  }

  method_url = "/v1/customizations/%s/words/%s" % [ERB::Util.url_encode(customization_id), ERB::Util.url_encode(word)]

  request(
    method: "PUT",
    url: method_url,
    headers: headers,
    json: data,
    accept_json: false
  )
  nil
end
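
Because the word becomes part of the request path, characters such as spaces are percent-encoded before the request is sent. The following sketch reproduces only the path construction from the listing above; `word_path` is a hypothetical helper and the IDs are placeholder values.

```ruby
require "erb"

# Hypothetical helper mirroring the path construction in the listing above.
def word_path(customization_id, word)
  format("/v1/customizations/%s/words/%s",
         ERB::Util.url_encode(customization_id),
         ERB::Util.url_encode(word))
end

# "my-model-id" is a placeholder customization ID.
path = word_path("my-model-id", "New York")
puts path # the space in the word is encoded as %20
```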

#add_words(customization_id: , words: ) ⇒ nil

Add custom words.

Adds one or more words and their translations to the specified custom model.
Adding a new translation for a word that already exists in a custom model
overwrites the word's existing translation. A custom model can contain no more
than 20,000 entries. You must use credentials for the instance of the service that
owns a model to add words to it.

You can define sounds-like or phonetic translations for words. A sounds-like
translation consists of one or more words that, when combined, sound like the
word. Phonetic translations are based on the SSML phoneme format for representing
a word. You can specify them in standard International Phonetic Alphabet (IPA)
representation

  `<phoneme alphabet="ipa" ph="təmˈɑto"></phoneme>`

  or in the proprietary IBM Symbolic Phonetic Representation (SPR)

  `<phoneme alphabet="ibm" ph="1gAstroEntxrYFXs"></phoneme>`

**See also:**
* [Adding multiple words to a custom
model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customWords#cuWordsAdd)
* [Adding words to a Japanese custom
model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customWords#cuJapaneseAdd)
* [Understanding
customization](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customIntro#customIntro).

Parameters:

  • customization_id (String)

    The customization ID (GUID) of the custom model. You must make the request with credentials for the instance of the service that owns the custom model.

  • words (Array[Word])

    The [Add custom words](#addwords) method accepts an array of `Word` objects. Each object provides a word that is to be added or updated for the custom model and the word's translation.

    The [List custom words](#listwords) method returns an array of `Word` objects. Each object shows a word and its translation from the custom model. The words are listed in alphabetical order, with uppercase letters listed before lowercase letters. The array is empty if the custom model contains no words.
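
To make the shape of the `words` argument concrete, here is a minimal sketch of an array that mixes a sounds-like translation with an IPA phonetic translation. The key names `"word"` and `"translation"` are assumed from the description of the `Word` object above, and the entries are illustrative. The listing for this method wraps the array in a `"words"` key before sending it as JSON, which the last lines imitate.

```ruby
require "json"

# Illustrative entries; the key names are assumptions based on the Word object
# described above.
words = [
  # Sounds-like translation: ordinary words that sound like the target word.
  { "word" => "NCAA", "translation" => "N C double A" },
  # Phonetic translation in IPA, written with the SSML phoneme element.
  { "word" => "tomato",
    "translation" => '<phoneme alphabet="ipa" ph="təmˈɑto"></phoneme>' }
]

# The request body wraps the array in a "words" key.
body = JSON.generate("words" => words)
puts body
```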

Returns:

  • (nil)

Raises:

  • (ArgumentError)


# File 'lib/ibm_watson/text_to_speech_v1.rb', line 622

def add_words(customization_id:, words:)
  raise ArgumentError.new("customization_id must be provided") if customization_id.nil?

  raise ArgumentError.new("words must be provided") if words.nil?

  headers = {
  }
  sdk_headers = Common.new.get_sdk_headers("text_to_speech", "V1", "add_words")
  headers.merge!(sdk_headers)

  data = {
    "words" => words
  }

  method_url = "/v1/customizations/%s/words" % [ERB::Util.url_encode(customization_id)]

  request(
    method: "POST",
    url: method_url,
    headers: headers,
    json: data,
    accept_json: true
  )
  nil
end

#create_custom_model(name: , language: nil, description: nil) ⇒ IBMCloudSdkCore::DetailedResponse

Create a custom model.

Creates a new empty custom model. You must specify a name for the new custom
model. You can optionally specify the language and a description for the new
model. The model is owned by the instance of the service whose credentials are
used to create it.

**See also:** [Creating a custom
model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customModels#cuModelsCreate).

**Note:** The Arabic, Chinese, Czech, Dutch (Belgian and Netherlands), Australian
English, Korean, and Swedish languages and voices are supported only for IBM
Cloud; they are deprecated for IBM Cloud Pak for Data. Also, the `ar-AR` language
identifier cannot be used to create a custom model; use the `ar-MS` identifier
instead.

Parameters:

  • name (String)

    The name of the new custom model.

  • language (String) (defaults to: nil)

    The language of the new custom model. You create a custom model for a specific language, not for a specific voice. A custom model can be used with any voice for its specified language. Omit the parameter to use the default language, `en-US`.

    Important: If you are using the service on IBM Cloud Pak for Data and you install the neural voices, the `language` parameter is required. You must specify the language for the custom model in the indicated format (for example, `en-AU` for Australian English). The request fails if you do not specify a language.

  • description (String) (defaults to: nil)

    A description of the new custom model. Specifying a description is recommended.

Returns:

  • (IBMCloudSdkCore::DetailedResponse)

    An `IBMCloudSdkCore::DetailedResponse` object representing the response.

Raises:

  • (ArgumentError)


# File 'lib/ibm_watson/text_to_speech_v1.rb', line 386

def create_custom_model(name:, language: nil, description: nil)
  raise ArgumentError.new("name must be provided") if name.nil?

  headers = {
  }
  sdk_headers = Common.new.get_sdk_headers("text_to_speech", "V1", "create_custom_model")
  headers.merge!(sdk_headers)

  data = {
    "name" => name,
    "language" => language,
    "description" => description
  }

  method_url = "/v1/customizations"

  response = request(
    method: "POST",
    url: method_url,
    headers: headers,
    json: data,
    accept_json: true
  )
  response
end

#create_speaker_model(speaker_name: , audio: ) ⇒ IBMCloudSdkCore::DetailedResponse

Create a speaker model.

Creates a new speaker model, which is an optional enrollment token for users who
are to add prompts to custom models. A speaker model contains information about a
user's voice. The service extracts this information from a WAV audio sample that
you pass as the body of the request. Associating a speaker model with a prompt is
optional, but the information that is extracted from the speaker model helps the
service learn about the speaker's voice.

A speaker model can make an appreciable difference in the quality of prompts,
especially short prompts with relatively little audio, that are associated with
that speaker. A speaker model can help the service produce a prompt with more
confidence; the lack of a speaker model can potentially compromise the quality of
a prompt.

The gender of the speaker who creates a speaker model does not need to match the
gender of a voice that is used with prompts that are associated with that speaker
model. For example, a speaker model that is created by a male speaker can be
associated with prompts that are spoken by female voices.

You create a speaker model for a given instance of the service. The new speaker
model is owned by the service instance whose credentials are used to create it.
That same speaker can then be used to create prompts for all custom models within
that service instance. No language is associated with a speaker model, but each
custom model has a single specified language. You can add prompts only to US
English models.

You specify a name for the speaker when you create it. The name must be unique
among all speaker names for the owning service instance. To re-create a speaker
model for an existing speaker name, you must first delete the existing speaker
model that has that name.

Speaker enrollment is a synchronous operation. Although it accepts more audio data
than a prompt, the process of adding a speaker is very fast. The service simply
extracts information about the speaker's voice from the audio. Unlike prompts,
speaker models neither need nor accept a transcription of the audio. When the call
returns, the audio is fully processed and the speaker enrollment is complete.

The service returns a speaker ID with the request. A speaker ID is a globally unique
identifier (GUID) that you use to identify the speaker in subsequent requests to
the service. Speaker models and the custom prompts with which they are used are
supported only for use with US English custom models and voices.

**See also:**
* [Create a speaker
model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-tbe-create#tbe-create-speaker-model)
* [Rules for creating speaker
models](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-tbe-rules#tbe-rules-speakers).

Parameters:

  • speaker_name (String)

    The name of the speaker that is to be added to the service instance.

    • Include a maximum of 49 characters in the name.

    • Include only alphanumeric characters and `_` (underscores) in the name.

    • Do not include XML sensitive characters (double quotes, single quotes, ampersands, angle brackets, and slashes) in the name.

    • Do not use the name of an existing speaker that is already defined for the service instance.

  • audio (File)

    An enrollment audio file that contains a sample of the speaker's voice.

    • The enrollment audio must be in WAV format and must have a minimum sampling rate of 16 kHz. The service accepts audio with higher sampling rates. It transcodes all audio to 16 kHz before processing it.

    • The length of the enrollment audio is limited to 1 minute. Speaking one or two paragraphs of text that include five to ten sentences is recommended.
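
The audio constraints above (WAV format, at least 16 kHz, at most 1 minute) can be checked locally before uploading. The sketch below is a simplified illustration, not SDK functionality: `wav_info` and `build_wav_header` are hypothetical helpers, and the parser assumes a canonical 44-byte PCM header in which the `fmt ` chunk directly follows `WAVE` and the `data` chunk directly follows `fmt `. Real-world files can carry extra chunks that this sketch would reject.

```ruby
require "stringio"

# Hypothetical helper: reads sample rate and duration from a canonical 44-byte
# PCM WAV header. Raises if the header does not have that simple layout.
def wav_info(io)
  header = io.read(44)
  canonical = header && header.bytesize == 44 &&
              header[0, 4] == "RIFF" && header[8, 4] == "WAVE" &&
              header[12, 4] == "fmt " && header[36, 4] == "data"
  raise ArgumentError, "not a canonical WAV header" unless canonical

  sample_rate = header[24, 4].unpack1("V") # little-endian uint32
  byte_rate = header[28, 4].unpack1("V")
  data_size = header[40, 4].unpack1("V")
  { sample_rate: sample_rate, seconds: data_size.to_f / byte_rate }
end

# Hypothetical helper: builds a header-only WAV stand-in so the sketch runs
# without an audio file on disk.
def build_wav_header(sample_rate:, seconds:, channels: 1, bits: 16)
  block_align = channels * bits / 8
  byte_rate = sample_rate * block_align
  data_size = byte_rate * seconds
  ["RIFF", 36 + data_size, "WAVE", "fmt ", 16, 1, channels, sample_rate,
   byte_rate, block_align, bits, "data", data_size].pack("a4Va4a4VvvVVvva4V")
end

info = wav_info(StringIO.new(build_wav_header(sample_rate: 16_000, seconds: 10)))
# Limits taken from the parameter description above: 16 kHz minimum, 60 s maximum.
acceptable = info[:sample_rate] >= 16_000 && info[:seconds] <= 60
puts info
puts acceptable
```

For a real file, `File.open(path, "rb") { |f| wav_info(f) }` would be the equivalent call.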

Returns:

  • (IBMCloudSdkCore::DetailedResponse)

    An `IBMCloudSdkCore::DetailedResponse` object representing the response.

Raises:

  • (ArgumentError)


# File 'lib/ibm_watson/text_to_speech_v1.rb', line 1171

def create_speaker_model(speaker_name:, audio:)
  raise ArgumentError.new("speaker_name must be provided") if speaker_name.nil?

  raise ArgumentError.new("audio must be provided") if audio.nil?

  headers = {
  }
  sdk_headers = Common.new.get_sdk_headers("text_to_speech", "V1", "create_speaker_model")
  headers.merge!(sdk_headers)

  params = {
    "speaker_name" => speaker_name
  }

  data = audio
  headers["Content-Type"] = "audio/wav"

  method_url = "/v1/speakers"

  response = request(
    method: "POST",
    url: method_url,
    headers: headers,
    params: params,
    data: data,
    accept_json: true
  )
  response
end

#delete_custom_model(customization_id: ) ⇒ nil

Delete a custom model.

Deletes the specified custom model. You must use credentials for the instance of
the service that owns a model to delete it.

**See also:** [Deleting a custom
model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customModels#cuModelsDelete).

Parameters:

  • customization_id (String)

    The customization ID (GUID) of the custom model. You must make the request with credentials for the instance of the service that owns the custom model.

Returns:

  • (nil)

Raises:

  • (ArgumentError)


# File 'lib/ibm_watson/text_to_speech_v1.rb', line 559

def delete_custom_model(customization_id:)
  raise ArgumentError.new("customization_id must be provided") if customization_id.nil?

  headers = {
  }
  sdk_headers = Common.new.get_sdk_headers("text_to_speech", "V1", "delete_custom_model")
  headers.merge!(sdk_headers)

  method_url = "/v1/customizations/%s" % [ERB::Util.url_encode(customization_id)]

  request(
    method: "DELETE",
    url: method_url,
    headers: headers,
    accept_json: false
  )
  nil
end

#delete_custom_prompt(customization_id: , prompt_id: ) ⇒ nil

Delete a custom prompt.

Deletes an existing custom prompt from a custom model. The service deletes the
prompt with the specified ID. You must use credentials for the instance of the
service that owns the custom model from which the prompt is to be deleted.

**Caution:** Deleting a custom prompt elicits a 400 response code from synthesis
requests that attempt to use the prompt. Make sure that you do not attempt to use
a deleted prompt in a production application. Custom prompts are supported only
for use with US English custom models and voices.

**See also:** [Deleting a custom
prompt](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-tbe-custom-prompts#tbe-custom-prompts-delete).

Parameters:

  • customization_id (String)

    The customization ID (GUID) of the custom model. You must make the request with credentials for the instance of the service that owns the custom model.

  • prompt_id (String)

    The identifier (name) of the prompt that is to be deleted.

Returns:

  • (nil)

Raises:

  • (ArgumentError)


# File 'lib/ibm_watson/text_to_speech_v1.rb', line 1055

def delete_custom_prompt(customization_id:, prompt_id:)
  raise ArgumentError.new("customization_id must be provided") if customization_id.nil?

  raise ArgumentError.new("prompt_id must be provided") if prompt_id.nil?

  headers = {
  }
  sdk_headers = Common.new.get_sdk_headers("text_to_speech", "V1", "delete_custom_prompt")
  headers.merge!(sdk_headers)

  method_url = "/v1/customizations/%s/prompts/%s" % [ERB::Util.url_encode(customization_id), ERB::Util.url_encode(prompt_id)]

  request(
    method: "DELETE",
    url: method_url,
    headers: headers,
    accept_json: false
  )
  nil
end

#delete_speaker_model(speaker_id: ) ⇒ nil

Delete a speaker model.

Deletes an existing speaker model from the service instance. The service deletes
the enrolled speaker with the specified speaker ID. You must use credentials for
the instance of the service that owns a speaker model to delete the speaker.

Any prompts that are associated with the deleted speaker are not affected by the
speaker's deletion. The prosodic data that defines the quality of a prompt is
established when the prompt is created. A prompt is static and remains unaffected
by deletion of its associated speaker. However, the prompt cannot be resubmitted
or updated with its original speaker once that speaker is deleted. Speaker models
and the custom prompts with which they are used are supported only for use with US
English custom models and voices.

**See also:** [Deleting a speaker
model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-tbe-speaker-models#tbe-speaker-models-delete).

Parameters:

  • speaker_id (String)

    The speaker ID (GUID) of the speaker model. You must make the request with service credentials for the instance of the service that owns the speaker model.

Returns:

  • (nil)

Raises:

  • (ArgumentError)


# File 'lib/ibm_watson/text_to_speech_v1.rb', line 1257

def delete_speaker_model(speaker_id:)
  raise ArgumentError.new("speaker_id must be provided") if speaker_id.nil?

  headers = {
  }
  sdk_headers = Common.new.get_sdk_headers("text_to_speech", "V1", "delete_speaker_model")
  headers.merge!(sdk_headers)

  method_url = "/v1/speakers/%s" % [ERB::Util.url_encode(speaker_id)]

  request(
    method: "DELETE",
    url: method_url,
    headers: headers,
    accept_json: false
  )
  nil
end

#delete_user_data(customer_id: ) ⇒ nil

Delete labeled data.

Deletes all data that is associated with a specified customer ID. The method
deletes all data for the customer ID, regardless of the method by which the
information was added. The method has no effect if no data is associated with the
customer ID. You must issue the request with credentials for the same instance of
the service that was used to associate the customer ID with the data. You
associate a customer ID with data by passing the `X-Watson-Metadata` header with a
request that passes the data.

**Note:** If you delete an instance of the service from the service console, all
data associated with that service instance is automatically deleted. This includes
all custom models and word/translation pairs, and all data related to speech
synthesis requests.

**See also:** [Information
security](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-information-security#information-security).

Parameters:

  • customer_id (String)

    The customer ID for which all data is to be deleted.

Returns:

  • (nil)

Raises:

  • (ArgumentError)


# File 'lib/ibm_watson/text_to_speech_v1.rb', line 1299

def delete_user_data(customer_id:)
  raise ArgumentError.new("customer_id must be provided") if customer_id.nil?

  headers = {
  }
  sdk_headers = Common.new.get_sdk_headers("text_to_speech", "V1", "delete_user_data")
  headers.merge!(sdk_headers)

  params = {
    "customer_id" => customer_id
  }

  method_url = "/v1/user_data"

  request(
    method: "DELETE",
    url: method_url,
    headers: headers,
    params: params,
    accept_json: false
  )
  nil
end

#delete_word(customization_id: , word: ) ⇒ nil

Delete a custom word.

Deletes a single word from the specified custom model. You must use credentials
for the instance of the service that owns a model to delete its words.

**See also:** [Deleting a word from a custom
model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customWords#cuWordDelete).

Parameters:

  • customization_id (String)

    The customization ID (GUID) of the custom model. You must make the request with credentials for the instance of the service that owns the custom model.

  • word (String)

    The word that is to be deleted from the custom model.

Returns:

  • (nil)

Raises:

  • (ArgumentError)


# File 'lib/ibm_watson/text_to_speech_v1.rb', line 799

def delete_word(customization_id:, word:)
  raise ArgumentError.new("customization_id must be provided") if customization_id.nil?

  raise ArgumentError.new("word must be provided") if word.nil?

  headers = {
  }
  sdk_headers = Common.new.get_sdk_headers("text_to_speech", "V1", "delete_word")
  headers.merge!(sdk_headers)

  method_url = "/v1/customizations/%s/words/%s" % [ERB::Util.url_encode(customization_id), ERB::Util.url_encode(word)]

  request(
    method: "DELETE",
    url: method_url,
    headers: headers,
    accept_json: false
  )
  nil
end

#get_custom_model(customization_id: ) ⇒ IBMCloudSdkCore::DetailedResponse

Get a custom model.

Gets all information about a specified custom model. In addition to metadata such
as the name and description of the custom model, the output includes the words and
their translations that are defined for the model, as well as any prompts that are
defined for the model. To see just the metadata for a model, use the [List custom
models](#listcustommodels) method.

**See also:** [Querying a custom
model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customModels#cuModelsQuery).

Parameters:

  • customization_id (String) (defaults to: )

    The customization ID (GUID) of the custom model. You must make the request with credentials for the instance of the service that owns the custom model.

Returns:

  • (IBMCloudSdkCore::DetailedResponse)

    An `IBMCloudSdkCore::DetailedResponse` object representing the response.

Raises:

  • (ArgumentError)


# File 'lib/ibm_watson/text_to_speech_v1.rb', line 529

def get_custom_model(customization_id:)
  raise ArgumentError.new("customization_id must be provided") if customization_id.nil?

  headers = {
  }
  sdk_headers = Common.new.get_sdk_headers("text_to_speech", "V1", "get_custom_model")
  headers.merge!(sdk_headers)

  method_url = "/v1/customizations/%s" % [ERB::Util.url_encode(customization_id)]

  response = request(
    method: "GET",
    url: method_url,
    headers: headers,
    accept_json: true
  )
  response
end
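
The `ArgumentError` listed under "Raises" can come from two places: the explicit `nil` guard in the method body, or Ruby itself when a required keyword is omitted. The sketch below copies only the guard clause into a hypothetical stand-in method (`require_customization_id`, not part of the SDK) to show both paths without contacting the service.

```ruby
# Hypothetical stand-in (not part of the SDK) that copies only the guard clause.
def require_customization_id(customization_id:)
  raise ArgumentError.new("customization_id must be provided") if customization_id.nil?

  customization_id
end

# Case 1: the keyword is passed explicitly as nil, so the guard clause fires.
begin
  require_customization_id(customization_id: nil)
rescue ArgumentError => e
  puts e.message
end

# Case 2: the keyword is omitted, so Ruby raises before the method body runs.
begin
  require_customization_id
rescue ArgumentError => e
  puts e.class
end
```

In both cases the error is raised locally, before any HTTP request would be made.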

#get_custom_prompt(customization_id: , prompt_id: ) ⇒ IBMCloudSdkCore::DetailedResponse

Get a custom prompt. Gets information about a specified custom prompt for a
specified custom model. The information includes the prompt ID, prompt text,
status, and optional speaker ID for each prompt of the custom model. You must use
credentials for the instance of the service that owns the custom model. Custom
prompts are supported only for use with US English custom models and voices.

**See also:** [Listing custom
prompts](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-tbe-custom-prompts#tbe-custom-prompts-list).

Parameters:

  • customization_id (String) (defaults to: )

    The customization ID (GUID) of the custom model. You must make the request with credentials for the instance of the service that owns the custom model.

  • prompt_id (String) (defaults to: )

    The identifier (name) of the prompt.

Returns:

  • (IBMCloudSdkCore::DetailedResponse)

    An `IBMCloudSdkCore::DetailedResponse` object representing the response.

Raises:

  • (ArgumentError)


# File 'lib/ibm_watson/text_to_speech_v1.rb', line 1016

def get_custom_prompt(customization_id:, prompt_id:)
  raise ArgumentError.new("customization_id must be provided") if customization_id.nil?

  raise ArgumentError.new("prompt_id must be provided") if prompt_id.nil?

  headers = {
  }
  sdk_headers = Common.new.get_sdk_headers("text_to_speech", "V1", "get_custom_prompt")
  headers.merge!(sdk_headers)

  method_url = "/v1/customizations/%s/prompts/%s" % [ERB::Util.url_encode(customization_id), ERB::Util.url_encode(prompt_id)]

  response = request(
    method: "GET",
    url: method_url,
    headers: headers,
    accept_json: true
  )
  response
end

#get_pronunciation(text: , voice: nil, format: nil, customization_id: nil) ⇒ IBMCloudSdkCore::DetailedResponse

Get pronunciation. Gets the phonetic pronunciation for the specified word. You
can request the pronunciation for a specific format. You can also request the
pronunciation for a specific voice to see the default translation for the language
of that voice or for a specific custom model to see the translation for that model.

**See also:** [Querying a word from a
language](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customWords#cuWordsQueryLanguage).

**Note:** The Arabic, Chinese, Czech, Dutch (Belgian and Netherlands), Australian
English, Korean, and Swedish languages and voices are supported only for IBM
Cloud; they are deprecated for IBM Cloud Pak for Data. Also, the `ar-AR_OmarVoice`
voice is deprecated; use the `ar-MS_OmarVoice` voice instead.

Parameters:

  • text (String) (defaults to: )

    The word for which the pronunciation is requested.

  • voice (String) (defaults to: nil)

    A voice that specifies the language in which the pronunciation is to be returned. All voices for the same language (for example, `en-US`) return the same translation.

  • format (String) (defaults to: nil)

    The phoneme format in which to return the pronunciation. The Arabic, Chinese, Dutch, Australian English, and Korean languages support only IPA. Omit the parameter to obtain the pronunciation in the default format.

  • customization_id (String) (defaults to: nil)

    The customization ID (GUID) of a custom model for which the pronunciation is to be returned. The language of a specified custom model must match the language of the specified voice. If the word is not defined in the specified custom model, the service returns the default translation for the custom model’s language. You must make the request with credentials for the instance of the service that owns the custom model. Omit the parameter to see the translation for the specified voice with no customization.

Returns:

  • (IBMCloudSdkCore::DetailedResponse)

    An `IBMCloudSdkCore::DetailedResponse` object representing the response.

Raises:

  • (ArgumentError)


# File 'lib/ibm_watson/text_to_speech_v1.rb', line 327

def get_pronunciation(text:, voice: nil, format: nil, customization_id: nil)
  raise ArgumentError.new("text must be provided") if text.nil?

  headers = {
  }
  sdk_headers = Common.new.get_sdk_headers("text_to_speech", "V1", "get_pronunciation")
  headers.merge!(sdk_headers)

  params = {
    "text" => text,
    "voice" => voice,
    "format" => format,
    "customization_id" => customization_id
  }

  method_url = "/v1/pronunciation"

  response = request(
    method: "GET",
    url: method_url,
    headers: headers,
    params: params,
    accept_json: true
  )
  response
end
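
The method always builds a four-entry `params` hash, with `nil` for every optional argument the caller left out. As a rough illustration of what the resulting query string might look like, the sketch below assumes that `nil`-valued entries are dropped before the request is sent; this page does not show that step, so treat it as an assumption. The `pronunciation_query` helper is hypothetical, the `"ipa"` format value is assumed, and the actual HTTP client may encode values differently from `URI.encode_www_form`.

```ruby
require "uri"

# Hypothetical helper (not part of the SDK): builds the same params hash as
# get_pronunciation, then drops nil entries (an assumption about request).
def pronunciation_query(text:, voice: nil, format: nil, customization_id: nil)
  params = {
    "text" => text,
    "voice" => voice,
    "format" => format,
    "customization_id" => customization_id
  }
  URI.encode_www_form(params.compact)
end

# "ipa" is an assumed value for the format parameter.
puts pronunciation_query(text: "IEEE", voice: "en-US_MichaelV3Voice", format: "ipa")
puts pronunciation_query(text: "hello")
```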

#get_speaker_model(speaker_id: ) ⇒ IBMCloudSdkCore::DetailedResponse

Get a speaker model. Gets information about all prompts that are defined by a
specified speaker for all custom models that are owned by a service instance. The
information is grouped by the customization IDs of the custom models. For each
custom model, the output lists each prompt that is defined for that custom model
by the speaker. You must use credentials for the instance of the service that owns
a speaker model to list its prompts. Speaker models and the custom prompts with
which they are used are supported only for use with US English custom models and
voices.

**See also:** [Listing the custom prompts for a speaker
model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-tbe-speaker-models#tbe-speaker-models-list-prompts).

Parameters:

  • speaker_id (String) (defaults to: )

    The speaker ID (GUID) of the speaker model. You must make the request with service credentials for the instance of the service that owns the speaker model.

Returns:

  • (IBMCloudSdkCore::DetailedResponse)

    An `IBMCloudSdkCore::DetailedResponse` object representing the response.

Raises:

  • (ArgumentError)


# File 'lib/ibm_watson/text_to_speech_v1.rb', line 1218

def get_speaker_model(speaker_id:)
  raise ArgumentError.new("speaker_id must be provided") if speaker_id.nil?

  headers = {
  }
  sdk_headers = Common.new.get_sdk_headers("text_to_speech", "V1", "get_speaker_model")
  headers.merge!(sdk_headers)

  method_url = "/v1/speakers/%s" % [ERB::Util.url_encode(speaker_id)]

  response = request(
    method: "GET",
    url: method_url,
    headers: headers,
    accept_json: true
  )
  response
end

#get_voice(voice: , customization_id: nil) ⇒ IBMCloudSdkCore::DetailedResponse

Get a voice. Gets information about the specified voice. The information includes
the name, language, gender, and other details about the voice. Specify a
customization ID to obtain information for a custom model that is defined for the
language of the specified voice. To list information about all available voices,
use the [List voices](#listvoices) method.

**See also:** [Listing a specific
voice](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-voices#listVoice).

**Note:** The Arabic, Chinese, Czech, Dutch (Belgian and Netherlands), Australian
English, Korean, and Swedish languages and voices are supported only for IBM
Cloud; they are deprecated for IBM Cloud Pak for Data. Also, the `ar-AR_OmarVoice`
voice is deprecated; use the `ar-MS_OmarVoice` voice instead.

Parameters:

  • voice (String) (defaults to: )

    The voice for which information is to be returned.

  • customization_id (String) (defaults to: nil)

    The customization ID (GUID) of a custom model for which information is to be returned. You must make the request with credentials for the instance of the service that owns the custom model. Omit the parameter to see information about the specified voice with no customization.

Returns:

  • (IBMCloudSdkCore::DetailedResponse)

    An `IBMCloudSdkCore::DetailedResponse` object representing the response.

Raises:

  • (ArgumentError)


# File 'lib/ibm_watson/text_to_speech_v1.rb', line 135

def get_voice(voice:, customization_id: nil)
  raise ArgumentError.new("voice must be provided") if voice.nil?

  headers = {
  }
  sdk_headers = Common.new.get_sdk_headers("text_to_speech", "V1", "get_voice")
  headers.merge!(sdk_headers)

  params = {
    "customization_id" => customization_id
  }

  method_url = "/v1/voices/%s" % [ERB::Util.url_encode(voice)]

  response = request(
    method: "GET",
    url: method_url,
    headers: headers,
    params: params,
    accept_json: true
  )
  response
end

#get_word(customization_id: , word: ) ⇒ IBMCloudSdkCore::DetailedResponse

Get a custom word. Gets the translation for a single word from the specified
custom model. The output shows the translation as it is defined in the model. You
must use credentials for the instance of the service that owns a model to list its
words.

**See also:** [Querying a single word from a custom
model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customWords#cuWordQueryModel).

Parameters:

  • customization_id (String) (defaults to: )

    The customization ID (GUID) of the custom model. You must make the request with credentials for the instance of the service that owns the custom model.

  • word (String) (defaults to: )

    The word that is to be queried from the custom model.

Returns:

  • (IBMCloudSdkCore::DetailedResponse)

    An `IBMCloudSdkCore::DetailedResponse` object representing the response.

Raises:

  • (ArgumentError)


# File 'lib/ibm_watson/text_to_speech_v1.rb', line 766

def get_word(customization_id:, word:)
  raise ArgumentError.new("customization_id must be provided") if customization_id.nil?

  raise ArgumentError.new("word must be provided") if word.nil?

  headers = {
  }
  sdk_headers = Common.new.get_sdk_headers("text_to_speech", "V1", "get_word")
  headers.merge!(sdk_headers)

  method_url = "/v1/customizations/%s/words/%s" % [ERB::Util.url_encode(customization_id), ERB::Util.url_encode(word)]

  response = request(
    method: "GET",
    url: method_url,
    headers: headers,
    accept_json: true
  )
  response
end

#list_custom_models(language: nil) ⇒ IBMCloudSdkCore::DetailedResponse

List custom models. Lists metadata such as the name and description for all custom
models that are owned by an instance of the service. Specify a language to list
the custom models for that language only. To see the words and prompts in addition
to the metadata for a specific custom model, use the [Get a custom
model](#getcustommodel) method. You must use credentials for the instance of the
service that owns a model to list information about it.

**See also:** [Querying all custom
models](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customModels#cuModelsQueryAll).

Parameters:

  • language (String) (defaults to: nil)

    The language for which custom models that are owned by the requesting credentials are to be returned. Omit the parameter to see all custom models that are owned by the requester.

Returns:

  • (IBMCloudSdkCore::DetailedResponse)

    An `IBMCloudSdkCore::DetailedResponse` object representing the response.



# File 'lib/ibm_watson/text_to_speech_v1.rb', line 428

def list_custom_models(language: nil)
  headers = {
  }
  sdk_headers = Common.new.get_sdk_headers("text_to_speech", "V1", "list_custom_models")
  headers.merge!(sdk_headers)

  params = {
    "language" => language
  }

  method_url = "/v1/customizations"

  response = request(
    method: "GET",
    url: method_url,
    headers: headers,
    params: params,
    accept_json: true
  )
  response
end

#list_custom_prompts(customization_id: ) ⇒ IBMCloudSdkCore::DetailedResponse

List custom prompts. Lists information about all custom prompts that are defined for a custom model.

The information includes the prompt ID, prompt text, status, and optional speaker
ID for each prompt of the custom model. You must use credentials for the instance
of the service that owns the custom model. The same information about all of the
prompts for a custom model is also provided by the [Get a custom
model](#getcustommodel) method. That method provides complete details about a
specified custom model, including its language, owner, custom words, and more.
Custom prompts are supported only for use with US English custom models and
voices.

**See also:** [Listing custom
prompts](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-tbe-custom-prompts#tbe-custom-prompts-list).

Parameters:

  • customization_id (String) (defaults to: )

    The customization ID (GUID) of the custom model. You must make the request with credentials for the instance of the service that owns the custom model.

Returns:

  • (IBMCloudSdkCore::DetailedResponse)

    An `IBMCloudSdkCore::DetailedResponse` object representing the response.

Raises:

  • (ArgumentError)


# File 'lib/ibm_watson/text_to_speech_v1.rb', line 841

def list_custom_prompts(customization_id:)
  raise ArgumentError.new("customization_id must be provided") if customization_id.nil?

  headers = {
  }
  sdk_headers = Common.new.get_sdk_headers("text_to_speech", "V1", "list_custom_prompts")
  headers.merge!(sdk_headers)

  method_url = "/v1/customizations/%s/prompts" % [ERB::Util.url_encode(customization_id)]

  response = request(
    method: "GET",
    url: method_url,
    headers: headers,
    accept_json: true
  )
  response
end

#list_speaker_models ⇒ IBMCloudSdkCore::DetailedResponse

List speaker models. Lists information about all speaker models that are defined
for a service instance. The information includes the speaker ID and speaker name
of each defined speaker. You must use credentials for the instance of a service to
list its speakers. Speaker models and the custom prompts with which they are used
are supported only for use with US English custom models and voices.

**See also:** [Listing speaker
models](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-tbe-speaker-models#tbe-speaker-models-list).

Returns:

  • (IBMCloudSdkCore::DetailedResponse)

    An `IBMCloudSdkCore::DetailedResponse` object representing the response.



# File 'lib/ibm_watson/text_to_speech_v1.rb', line 1091

def list_speaker_models
  headers = {
  }
  sdk_headers = Common.new.get_sdk_headers("text_to_speech", "V1", "list_speaker_models")
  headers.merge!(sdk_headers)

  method_url = "/v1/speakers"

  response = request(
    method: "GET",
    url: method_url,
    headers: headers,
    accept_json: true
  )
  response
end

#list_voices ⇒ IBMCloudSdkCore::DetailedResponse

List voices. Lists all voices available for use with the service. The information
includes the name, language, gender, and other details about each voice. The
ordering of the list of voices can change from call to call; do not rely on an
alphabetized or static list of voices. To see information about a specific voice,
use the [Get a voice](#getvoice) method.

**See also:** [Listing all available
voices](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-voices#listVoices).

Returns:

  • (IBMCloudSdkCore::DetailedResponse)

    An `IBMCloudSdkCore::DetailedResponse` object representing the response.



# File 'lib/ibm_watson/text_to_speech_v1.rb', line 95

def list_voices
  headers = {
  }
  sdk_headers = Common.new.get_sdk_headers("text_to_speech", "V1", "list_voices")
  headers.merge!(sdk_headers)

  method_url = "/v1/voices"

  response = request(
    method: "GET",
    url: method_url,
    headers: headers,
    accept_json: true
  )
  response
end
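
Because the order of the returned voices can change between calls, client code is safer looking voices up by name or sorting them itself than relying on position. The sketch below works on an illustrative hash rather than a live response. The `"voices"` key and the `"name"` and `"language"` field names are assumptions about the response body, and the sample entries are placeholders built from voice names mentioned on this page.

```ruby
# Illustrative payload only; the "voices" key and the field names are assumed,
# and a real response would come from the result of list_voices.
result = {
  "voices" => [
    { "name" => "en-US_MichaelV3Voice", "language" => "en-US" },
    { "name" => "ar-MS_OmarVoice", "language" => "ar-MS" },
    { "name" => "en-AU_MadisonVoice", "language" => "en-AU" }
  ]
}

# Index by name so lookups do not depend on the order the service returned.
voices_by_name = result["voices"].to_h { |voice| [voice["name"], voice] }

# Sort explicitly when a stable display order is needed.
sorted_names = voices_by_name.keys.sort

puts sorted_names
puts voices_by_name.fetch("en-AU_MadisonVoice")["language"]
```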

#list_words(customization_id: ) ⇒ IBMCloudSdkCore::DetailedResponse

List custom words. Lists all of the words and their translations for the specified
custom model. The output shows the translations as they are defined in the model.
You must use credentials for the instance of the service that owns a model to list
its words.

**See also:** [Querying all words from a custom
model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customWords#cuWordsQueryModel).

Parameters:

  • customization_id (String) (defaults to: )

    The customization ID (GUID) of the custom model. You must make the request with credentials for the instance of the service that owns the custom model.

Returns:

  • (IBMCloudSdkCore::DetailedResponse)

    An `IBMCloudSdkCore::DetailedResponse` object representing the response.

Raises:

  • (ArgumentError)


# File 'lib/ibm_watson/text_to_speech_v1.rb', line 660

def list_words(customization_id:)
  raise ArgumentError.new("customization_id must be provided") if customization_id.nil?

  headers = {
  }
  sdk_headers = Common.new.get_sdk_headers("text_to_speech", "V1", "list_words")
  headers.merge!(sdk_headers)

  method_url = "/v1/customizations/%s/words" % [ERB::Util.url_encode(customization_id)]

  response = request(
    method: "GET",
    url: method_url,
    headers: headers,
    accept_json: true
  )
  response
end

#synthesize(text: , accept: nil, voice: nil, customization_id: nil) ⇒ IBMCloudSdkCore::DetailedResponse

Synthesize audio. Synthesizes text to audio that is spoken in the specified voice.
The service bases its understanding of the language for the input text on the
specified voice. Use a voice that matches the language of the input text.

The method accepts a maximum of 5 KB of input text in the body of the request, and
8 KB for the URL and headers. The 5 KB limit includes any SSML tags that you
specify. The service returns the synthesized audio stream as an array of bytes.
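
The 5 KB limit applies to bytes, not characters, so text with multibyte characters or SSML markup reaches it sooner than its character count suggests. A minimal pre-flight check might look like the sketch below. The `within_synthesize_limit?` helper is hypothetical, and reading "5 KB" as 5 × 1024 bytes is an assumption; the page does not state the exact byte count.

```ruby
# Hypothetical helper (not part of the SDK). The default limit assumes that
# "5 KB" means 5 * 1024 bytes, which this page does not confirm.
def within_synthesize_limit?(text, limit: 5 * 1024)
  # bytesize counts encoded bytes, so multibyte characters and SSML tags
  # are all included in the total.
  text.bytesize <= limit
end

puts within_synthesize_limit?("<speak>Hello</speak>")
puts within_synthesize_limit?("é" * 3000) # 3000 characters, 6000 bytes in UTF-8
```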

**See also:** [The HTTP
interface](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-usingHTTP#usingHTTP).

**Note:** The Arabic, Chinese, Czech, Dutch (Belgian and Netherlands), Australian
English, Korean, and Swedish languages and voices are supported only for IBM
Cloud; they are deprecated for IBM Cloud Pak for Data. Also, the `ar-AR_OmarVoice`
voice is deprecated; use the `ar-MS_OmarVoice` voice instead.

### Audio formats (accept types)

 The service can return audio in the following formats (MIME types).
* Where indicated, you can optionally specify the sampling rate (`rate`) of the
audio. You must specify a sampling rate for the `audio/l16` and `audio/mulaw`
formats. A specified sampling rate must lie in the range of 8 kHz to 192 kHz. Some
formats restrict the sampling rate to certain values, as noted.
* For the `audio/l16` format, you can optionally specify the endianness
(`endianness`) of the audio: `endianness=big-endian` or
`endianness=little-endian`.

Use the `Accept` header or the `accept` parameter to specify the requested format
of the response audio. If you omit an audio format altogether, the service returns
the audio in Ogg format with the Opus codec (`audio/ogg;codecs=opus`). The service
always returns single-channel audio.
* `audio/basic` - The service returns audio with a sampling rate of 8000 Hz.
* `audio/flac` - You can optionally specify the `rate` of the audio. The default
sampling rate is 22,050 Hz.
* `audio/l16` - You must specify the `rate` of the audio. You can optionally
specify the `endianness` of the audio. The default endianness is `little-endian`.
* `audio/mp3` - You can optionally specify the `rate` of the audio. The default
sampling rate is 22,050 Hz.
* `audio/mpeg` - You can optionally specify the `rate` of the audio. The default
sampling rate is 22,050 Hz.
* `audio/mulaw` - You must specify the `rate` of the audio.
* `audio/ogg` - The service returns the audio in the `vorbis` codec. You can
optionally specify the `rate` of the audio. The default sampling rate is 22,050
Hz.
* `audio/ogg;codecs=opus` - You can optionally specify the `rate` of the audio.
Only the following values are valid sampling rates: `48000`, `24000`, `16000`,
`12000`, or `8000`. If you specify a value other than one of these, the service
returns an error. The default sampling rate is 48,000 Hz.
* `audio/ogg;codecs=vorbis` - You can optionally specify the `rate` of the audio.
The default sampling rate is 22,050 Hz.
* `audio/wav` - You can optionally specify the `rate` of the audio. The default
sampling rate is 22,050 Hz.
* `audio/webm` - The service returns the audio in the `opus` codec. The service
returns audio with a sampling rate of 48,000 Hz.
* `audio/webm;codecs=opus` - The service returns audio with a sampling rate of
48,000 Hz.
* `audio/webm;codecs=vorbis` - You can optionally specify the `rate` of the audio.
The default sampling rate is 22,050 Hz.

For more information about specifying an audio format, including additional
details about some of the formats, see [Using audio
formats](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-audio-formats).
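
The format rules above can be checked on the client before a request is sent. The sketch below is one possible way to assemble a value for the `accept` parameter. The `build_accept` helper is hypothetical and is not part of the SDK, and the semicolon-separated `rate=` and `endianness=` syntax is an assumption based on the parameter names quoted in the list.

```ruby
# Hypothetical helper (not part of the SDK): builds an accept value and
# enforces the sampling-rate rules listed in the method description.
def build_accept(mime, rate: nil, endianness: nil)
  rate_required = ["audio/l16", "audio/mulaw"]
  fixed_rate = ["audio/basic", "audio/webm", "audio/webm;codecs=opus"]
  opus_rates = [48_000, 24_000, 16_000, 12_000, 8_000]

  if rate.nil?
    raise ArgumentError, "#{mime} requires a rate" if rate_required.include?(mime)
  else
    raise ArgumentError, "#{mime} has a fixed sampling rate" if fixed_rate.include?(mime)
    raise ArgumentError, "rate must be within 8000..192000 Hz" unless (8_000..192_000).cover?(rate)
    if mime == "audio/ogg;codecs=opus" && !opus_rates.include?(rate)
      raise ArgumentError, "unsupported Opus rate: #{rate}"
    end
  end
  raise ArgumentError, "endianness applies only to audio/l16" if endianness && mime != "audio/l16"

  # Assumed syntax: parameters are appended as ;name=value pairs.
  [mime, rate && "rate=#{rate}", endianness && "endianness=#{endianness}"].compact.join(";")
end

puts build_accept("audio/l16", rate: 16_000, endianness: "big-endian")
puts build_accept("audio/wav")
```

The resulting string would be passed as the `accept:` argument of `synthesize`. Failing early with an `ArgumentError` avoids a round trip for combinations the service is documented to reject.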

### Warning messages

 If a request includes invalid query parameters, the service returns a `Warnings`
response header that provides messages about the invalid parameters. The warning
includes a descriptive message and a list of invalid argument strings. For
example, a message such as `"Unknown arguments:"` or `"Unknown url query
arguments:"` followed by a list of the form `"{invalid_arg_1}, {invalid_arg_2}."`
The request succeeds despite the warnings.

Parameters:

  • text (String) (defaults to: )

    The text to synthesize.

  • accept (String) (defaults to: nil)

    The requested format (MIME type) of the audio. You can use the `Accept` header or the `accept` parameter to specify the audio format. For more information about specifying an audio format, see **Audio formats (accept types)** in the method description.

  • voice (String) (defaults to: nil)

    The voice to use for synthesis. If you omit the `voice` parameter, the service uses a default voice, which depends on the version of the service that you are using:

    • _For IBM Cloud,_ the service always uses the US English `en-US_MichaelV3Voice` by default.

    • _For IBM Cloud Pak for Data,_ the default voice depends on the voices that you installed. If you installed the _enhanced neural voices_, the service uses the US English `en-US_MichaelV3Voice` by default; if that voice is not installed, you must specify a voice. If you installed the _neural voices_, the service always uses the Australian English `en-AU_MadisonVoice` by default.

    **See also:** [Using languages and voices](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-voices).

  • customization_id (String) (defaults to: nil)

    The customization ID (GUID) of a custom model to use for the synthesis. If a custom model is specified, it works only if it matches the language of the indicated voice. You must make the request with credentials for the instance of the service that owns the custom model. Omit the parameter to use the specified voice with no customization.

Returns:

  • (IBMCloudSdkCore::DetailedResponse)

    An `IBMCloudSdkCore::DetailedResponse` object representing the response.

Raises:

  • (ArgumentError)


# File 'lib/ibm_watson/text_to_speech_v1.rb', line 262

def synthesize(text:, accept: nil, voice: nil, customization_id: nil)
  raise ArgumentError.new("text must be provided") if text.nil?

  headers = {
    "Accept" => accept
  }
  sdk_headers = Common.new.get_sdk_headers("text_to_speech", "V1", "synthesize")
  headers.merge!(sdk_headers)

  params = {
    "voice" => voice,
    "customization_id" => customization_id
  }

  data = {
    "text" => text
  }

  method_url = "/v1/synthesize"

  response = request(
    method: "POST",
    url: method_url,
    headers: headers,
    params: params,
    json: data,
    accept_json: false
  )
  response
end

#update_custom_model(customization_id: , name: nil, description: nil, words: nil) ⇒ nil

Update a custom model. Updates information for the specified custom model. You can
update metadata such as the name and description of the model. You can also update
the words in the model and their translations. Adding a new translation for a word
that already exists in a custom model overwrites the word's existing translation.
A custom model can contain no more than 20,000 entries. You must use credentials
for the instance of the service that owns a model to update it.

You can define sounds-like or phonetic translations for words. A sounds-like
translation consists of one or more words that, when combined, sound like the
word. Phonetic translations are based on the SSML phoneme format for representing
a word. You can specify them in standard International Phonetic Alphabet (IPA)
representation

  `<phoneme alphabet="ipa" ph="təmˈɑto"></phoneme>`

or in the proprietary IBM Symbolic Phonetic Representation (SPR)

  `<phoneme alphabet="ibm" ph="1gAstroEntxrYFXs"></phoneme>`

**See also:**
* [Updating a custom
model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customModels#cuModelsUpdate)
* [Adding words to a Japanese custom
model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customWords#cuJapaneseAdd)
* [Understanding
customization](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customIntro#customIntro).
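
The two kinds of translation described above can be sketched as plain hashes for the `words` argument. This is an illustration under assumptions: the `"word"` and `"translation"` keys are assumed field names for a `Word` object, which this page does not spell out, and the `phoneme` helper and the sample sounds-like entry are hypothetical.

```ruby
require "json"

# Hypothetical helper (not part of the SDK): wraps a phonetic string in the
# SSML phoneme element shown in the method description.
def phoneme(alphabet, ph)
  %(<phoneme alphabet="#{alphabet}" ph="#{ph}"></phoneme>)
end

# Assumed Word shape: a hash with "word" and "translation" keys.
words = [
  # Phonetic translation in IPA, reusing the example string from the description.
  { "word" => "tomato", "translation" => phoneme("ipa", "təmˈɑto") },
  # Sounds-like translation: ordinary words that together sound like the entry.
  { "word" => "IEEE", "translation" => "I triple E" }
]

puts JSON.generate("words" => words)
```

An array like this would be passed as the `words:` argument of `update_custom_model`, together with the `customization_id` of the model to update.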

Parameters:

  • customization_id (String) (defaults to: )

    The customization ID (GUID) of the custom model. You must make the request with credentials for the instance of the service that owns the custom model.

  • name (String) (defaults to: nil)

    A new name for the custom model.

  • description (String) (defaults to: nil)

    A new description for the custom model.

  • words (Array[Word]) (defaults to: nil)

    An array of `Word` objects that provides the words and their translations that are to be added or updated for the custom model. Pass an empty array to make no additions or updates.

Returns:

  • (nil)

Raises:

  • (ArgumentError)


# File 'lib/ibm_watson/text_to_speech_v1.rb', line 489

def update_custom_model(customization_id:, name: nil, description: nil, words: nil)
  raise ArgumentError.new("customization_id must be provided") if customization_id.nil?

  headers = {
  }
  sdk_headers = Common.new.get_sdk_headers("text_to_speech", "V1", "update_custom_model")
  headers.merge!(sdk_headers)

  data = {
    "name" => name,
    "description" => description,
    "words" => words
  }

  method_url = "/v1/customizations/%s" % [ERB::Util.url_encode(customization_id)]

  request(
    method: "POST",
    url: method_url,
    headers: headers,
    json: data,
    accept_json: true
  )
  nil
end