TTS: Synthesizing Voice from Text and Playing Synthesized Sound Data
This tutorial demonstrates how to synthesize text into sound data as an utterance, and how to play, pause, and stop that utterance.
Warm-up
Become familiar with the TTS API basics by learning about:
- Set-up
-
Creating and Destroying TTS Handles
Create and destroy the TTS handle.
-
Setting and Unsetting Callbacks
Set and unset callbacks for obtaining notifications, such as when an utterance starts or completes playing.
-
Getting Information
Get information on the supported voices, the current state, and the default voice.
-
Getting and Setting the Mode
Get and set the TTS mode.
-
Connecting and Disconnecting TTS
Connect and disconnect the TTS.
-
Adding Text
Request to add text for TTS play.
-
Starting, Stopping, and Pausing Playback
Start, stop, and pause TTS playback.
Creating and Destroying TTS Handles
To create and destroy TTS handles:
- To use the features of the TTS (text-to-speech) API (in mobile and wearable applications), include the <tts.h> header file in your application:
#include <tts.h>
-
To use the TTS library, create a TTS handle. The handle is passed as a parameter to the other TTS functions. After creation, the TTS state is TTS_STATE_CREATED.
Note TTS is not thread-safe and depends on the ecore main loop. Therefore, your application must run the ecore main loop. Do not use TTS in a thread.
void create_tts_handle() {
    tts_h tts;
    int ret;
    ret = tts_create(&tts);
    if (TTS_ERROR_NONE != ret) {
        // Error handling
    }
}
-
When you do not need to use the TTS library anymore, destroy the TTS handle using the tts_destroy() function:
Note Do not call the tts_destroy() function within a callback function. If you do, the tts_destroy() function fails and returns TTS_ERROR_OPERATION_FAILED.
void destroy_tts_handle(tts_h tts) {
    int ret;
    ret = tts_destroy(tts); // tts is the TTS handle
    if (TTS_ERROR_NONE != ret) {
        // Error handling
    }
}
Setting and Unsetting Callbacks
To set and unset callbacks:
- To use the features of the TTS (text-to-speech) API (in mobile and wearable applications), include the <tts.h> header file in your application. The enum values for the parameters of the callback functions are defined there. You can also find the parameter details in the header file.
#include <tts.h>
-
The TTS API provides various callback functions for getting information, such as state changes and the start or completion of an utterance. Set and unset the callbacks in the TTS_STATE_CREATED state.
You can use the following callbacks:
- State changed
If you set the state change callback for the TTS, it is invoked when the TTS state changes.
void state_changed_cb(tts_h tts, tts_state_e previous, tts_state_e current, void* user_data) {
    // Your code
}
void set_state_changed_cb(tts_h tts) {
    int ret;
    ret = tts_set_state_changed_cb(tts, state_changed_cb, NULL);
    if (TTS_ERROR_NONE != ret) {
        // Error handling
    }
}
void unset_state_changed_cb(tts_h tts) {
    int ret;
    ret = tts_unset_state_changed_cb(tts);
    if (TTS_ERROR_NONE != ret) {
        // Error handling
    }
}
- Default voice changed
In the TTS library, the voice includes the language used and the voice type, such as male or female. The default voice of the TTS is changed either when the system language is changed, or from the TTS settings. You can get a notification of this change:
void default_voice_changed_cb(tts_h tts, const char* previous_language, int previous_voice_type, const char* current_language, int current_voice_type, void* user_data) {
    // Your code
}
void set_default_voice_changed_cb(tts_h tts) {
    int ret;
    ret = tts_set_default_voice_changed_cb(tts, default_voice_changed_cb, NULL);
    if (TTS_ERROR_NONE != ret) {
        // Error handling
    }
}
void unset_default_voice_changed_cb(tts_h tts) {
    int ret;
    ret = tts_unset_default_voice_changed_cb(tts);
    if (TTS_ERROR_NONE != ret) {
        // Error handling
    }
}
- Utterance started or completed
If you add text to the TTS, that text is handled as an utterance and receives its own ID. After you request playback, the text is synthesized by an engine and played out. You can get a notification when an utterance starts or completes:
void utterance_started_cb(tts_h tts, int utt_id, void* user_data) {
    // Your code
}
void utterance_completed_cb(tts_h tts, int utt_id, void* user_data) {
    // Your code
}
void set_utterance_cb(tts_h tts) {
    int ret;
    ret = tts_set_utterance_started_cb(tts, utterance_started_cb, NULL);
    if (TTS_ERROR_NONE != ret) {
        // Error handling
    }
    ret = tts_set_utterance_completed_cb(tts, utterance_completed_cb, NULL);
    if (TTS_ERROR_NONE != ret) {
        // Error handling
    }
}
void unset_utterance_cb(tts_h tts) {
    int ret;
    ret = tts_unset_utterance_started_cb(tts);
    if (TTS_ERROR_NONE != ret) {
        // Error handling
    }
    ret = tts_unset_utterance_completed_cb(tts);
    if (TTS_ERROR_NONE != ret) {
        // Error handling
    }
}
- Error
When an error occurs, the TTS library sends a message using a callback function:
void error_cb(tts_h tts, int utt_id, tts_error_e reason, void* user_data) {
    // Your code
}
void set_error_cb(tts_h tts) {
    int ret;
    ret = tts_set_error_cb(tts, error_cb, NULL);
    if (TTS_ERROR_NONE != ret) {
        // Error handling
    }
}
void unset_error_cb(tts_h tts) {
    int ret;
    ret = tts_unset_error_cb(tts);
    if (TTS_ERROR_NONE != ret) {
        // Error handling
    }
}
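The utterance callbacks described above report one utterance ID at a time, so an application that wants to know when everything it queued has been played needs some bookkeeping of its own. The following is a minimal sketch of such bookkeeping. The utterance_tracker type and the tracker_* functions are hypothetical names introduced here for illustration and are not part of the TTS API. The sketch assumes that each added utterance is reported as completed exactly once.

```c
#include <stdbool.h>

/* Hypothetical bookkeeping type; not part of the TTS API. */
typedef struct {
    int pending;        /* Utterances added but not yet completed */
    int last_completed; /* ID of the latest completed utterance, -1 if none */
} utterance_tracker;

/* Reset the tracker to its empty state. */
void tracker_init(utterance_tracker *t)
{
    t->pending = 0;
    t->last_completed = -1;
}

/* Call after a text has been added successfully. */
void tracker_on_added(utterance_tracker *t)
{
    t->pending++;
}

/* Call from the utterance completed callback.
   Returns true when no added utterance is left pending. */
bool tracker_on_completed(utterance_tracker *t, int utt_id)
{
    if (t->pending > 0)
        t->pending--;
    t->last_completed = utt_id;
    return t->pending == 0;
}
```

In an application, a pointer to the tracker could be passed as the user_data argument when setting the utterance completed callback. Because stopping playback removes all queued texts, the tracker would also need to be reset with tracker_init() at that point.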
Getting Information
To get information about the current TTS state and the available voices:
- To use the features of the TTS (text-to-speech) API (in mobile and wearable applications), include the <tts.h> header file in your application:
#include <tts.h>
-
You can obtain the current state, the supported voice list, and the default voice:
- Get the current state.
The TTS state is changed by other functions, and each API function requires a particular state as a precondition. You can get the current state using the tts_get_state() function.
void get_state(tts_h tts) {
    tts_state_e current_state;
    int ret;
    ret = tts_get_state(tts, &current_state);
    if (TTS_ERROR_NONE != ret) {
        // Error handling
    }
}
- Obtain a list of voices supported by the TTS using the tts_foreach_supported_voices() function. The tts_supported_voice_cb callback is invoked once for each supported voice. The return value of the callback controls the iteration: return true to continue with the next voice, or false to stop.
bool supported_voice_cb(tts_h tts, const char* language, int voice_type, void* user_data) {
    // Return true to get the next supported voice, or false to stop
    return true;
}
void get_supported_voice(tts_h tts) {
    int ret;
    ret = tts_foreach_supported_voices(tts, supported_voice_cb, NULL);
    if (TTS_ERROR_NONE != ret) {
        // Error handling
    }
}
- Get the default voice using the tts_get_default_voice() function. If you do not set the language and the voice type as parameters of the tts_add_text() function, the TTS synthesizes the text using the default voice. To be notified when the default voice changes, set the default voice changed callback.
void get_default_voice(tts_h tts) {
    int ret;
    char* default_lang = NULL;
    int default_voice_type;
    ret = tts_get_default_voice(tts, &default_lang, &default_voice_type);
    if (TTS_ERROR_NONE != ret) {
        // Error handling
    }
    free(default_lang); // Release the language string when done (needs <stdlib.h>)
}
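The continue-or-stop convention of the supported voice callback is easier to see in a version that runs without the TTS library. The following sketch models it under stated assumptions: voice_cb, voice_entry, foreach_voice(), voice_search, and find_language_cb() are stand-ins defined here for illustration, and the voice list in the usage note is sample data, not what a real engine reports. The callback stops the iteration as soon as it sees the language it is looking for.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Stand-in callback type for illustration only; not the real TTS typedef. */
typedef bool (*voice_cb)(const char *language, int voice_type, void *user_data);

/* Stand-in for one supported voice. */
typedef struct {
    const char *language;
    int voice_type;
} voice_entry;

/* Calls cb for each entry, stopping as soon as cb returns false. */
void foreach_voice(const voice_entry *voices, size_t count, voice_cb cb, void *user_data)
{
    for (size_t i = 0; i < count; i++) {
        if (!cb(voices[i].language, voices[i].voice_type, user_data))
            break;
    }
}

/* Search context handed to the callback through user_data. */
typedef struct {
    const char *wanted; /* Language to look for */
    bool found;         /* Set to true when wanted is seen */
    int visited;        /* Number of callback invocations */
} voice_search;

/* Returns false (stop) once the wanted language is seen, true (continue) otherwise. */
bool find_language_cb(const char *language, int voice_type, void *user_data)
{
    voice_search *search = user_data;
    (void)voice_type; /* Unused in this sketch */
    search->visited++;
    if (strcmp(language, search->wanted) == 0) {
        search->found = true;
        return false;
    }
    return true;
}
```

For example, with a sample list of en_US, ko_KR, and fr_FR and a search for ko_KR, the callback is invoked twice and the third entry is never visited. The same user_data pattern applies to the real tts_foreach_supported_voices() function, which takes a user data pointer as its last argument.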
Getting and Setting the Mode
To get and set the mode:
- To use the features of the TTS (text-to-speech) API (in mobile and wearable applications), include the <tts.h> header file in your application:
#include <tts.h>
-
There are 3 different TTS modes available. The main difference between them is how the audio is mixed with other sources. The default mode is TTS_MODE_DEFAULT, used for normal applications such as eBooks. If you set this mode and play your text, playback can be interrupted when other sounds, such as a ringtone or other TTS sounds, are played. Get and set the mode in the TTS_STATE_CREATED state.
Note The TTS_MODE_NOTIFICATION and TTS_MODE_SCREEN_READER modes are mixed with other sound sources, but they are used only for platform-specific features. Do not use them for normal applications.
void set_mode(tts_h tts) {
    int ret;
    tts_mode_e mode = TTS_MODE_DEFAULT;
    ret = tts_set_mode(tts, mode);
    if (TTS_ERROR_NONE != ret) {
        // Error handling
    }
}
void get_mode(tts_h tts) {
    int ret;
    tts_mode_e mode;
    ret = tts_get_mode(tts, &mode);
    if (TTS_ERROR_NONE != ret) {
        // Error handling
    }
}
Connecting and Disconnecting TTS
To connect and disconnect the TTS:
- To use the features of the TTS (text-to-speech) API (in mobile and wearable applications), include the <tts.h> header file in your application:
#include <tts.h>
-
After you create the TTS handle, connect to the background TTS daemon. The daemon synthesizes the text with the engine and plays the resulting sound data:
-
The tts_prepare() function connects to the daemon asynchronously. When the connection is complete, the TTS state changes to TTS_STATE_READY.
void prepare_for_tts(tts_h tts) {
    int ret;
    ret = tts_prepare(tts);
    if (TTS_ERROR_NONE != ret) {
        // Error handling
    }
}
Note If the error callback is invoked after you call the tts_prepare() function, the TTS is not available.
-
The tts_unprepare() function is used for disconnection, and the state is changed back to TTS_STATE_CREATED.
void unprepare_for_tts(tts_h tts) {
    int ret;
    ret = tts_unprepare(tts);
    if (TTS_ERROR_NONE != ret) {
        // Error handling
    }
}
Adding Text
To add text:
- To use the features of the TTS (text-to-speech) API (in mobile and wearable applications), include the <tts.h> header file in your application. The header defines the TTS_VOICE_TYPE_AUTO and TTS_SPEED_AUTO values, which follow the default TTS settings. The minimum and maximum limits for the speed are also defined in the header file.
#include <tts.h>
- To manage text:
-
You can request the TTS library to read your own text using the tts_add_text() function. The TTS library manages added text in a queue, so you can add several texts one after another before they are played. Each added text receives an utterance ID, which is used for synthesizing and playing the sound data.
Note If the added text is too long, some engines need a long time for synthesis. Add text in clips of a reasonable length. If you pass NULL as the language, the default language is used for synthesizing the text.
You can add text at any point after the tts_prepare() function changes the state to TTS_STATE_READY.
void add_text(tts_h tts) {
    const char* text = "tutorial"; // Text to read
    const char* language = "en_US"; // Language
    int voice_type = TTS_VOICE_TYPE_FEMALE; // Voice type
    int speed = TTS_SPEED_AUTO; // Reading speed
    int utt_id; // Utterance ID for the requested text
    int ret;
    ret = tts_add_text(tts, text, language, voice_type, speed, &utt_id);
    if (TTS_ERROR_NONE != ret) {
        // Error handling
    }
}
-
There is a length limit for the added text in the engine. You can retrieve the maximum value using the tts_get_max_text_size() function in the TTS_STATE_READY state.
void get_maximum_text_size(tts_h tts) {
    int ret;
    unsigned int size; // The function takes a pointer to unsigned int
    ret = tts_get_max_text_size(tts, &size);
    if (TTS_ERROR_NONE != ret) {
        // Error handling
    }
}
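Because the engine limits the text length and long texts can be slow to synthesize, an application may want to split a long text into smaller clips before adding them. The following is a minimal sketch of one way to pick the split points. The next_chunk_len() function is a hypothetical helper introduced here, not part of the TTS API. It counts bytes and prefers to break at a space, so it is an assumption that this matches how the engine measures the limit. When it has to break inside a word, it can also split a multi-byte UTF-8 character, which a real application would need to avoid.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper; not part of the TTS API.
   Returns the length in bytes of the next chunk of text that is at most
   max_len bytes long, preferring to break after the last space within
   the limit. Returns 0 if text is empty or max_len is 0. */
size_t next_chunk_len(const char *text, size_t max_len)
{
    size_t len = strlen(text);
    if (len <= max_len)
        return len; /* The remaining text fits in one chunk */

    /* Search backwards from the limit for a space to break after. */
    for (size_t i = max_len; i > 0; i--) {
        if (text[i - 1] == ' ')
            return i; /* The chunk includes the trailing space */
    }
    return max_len; /* No space found: break at the limit */
}
```

A caller would copy each chunk into its own null-terminated buffer, pass that buffer to the tts_add_text() function, and advance through the text by the returned length, stopping when the function returns 0.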
Starting, Stopping, and Pausing Playback
To start, pause, and stop playback:
- To use the features of the TTS (text-to-speech) API (in mobile and wearable applications), include the <tts.h> header file in your application:
#include <tts.h>
-
To start synthesizing the text added in the queue and play the resulting sound data in sequence, use the tts_play() function. The state is changed to TTS_STATE_PLAYING. The playback continues until you call the tts_stop() or the tts_pause() function.
If there is no text in the queue, the TTS waits for text to be added in the TTS_STATE_PLAYING state. In that case, when you add text, the TTS starts synthesizing and playing it immediately. The state does not return to TTS_STATE_READY until you call the tts_stop() function.
Note If the state changed callback is invoked in the TTS_STATE_PLAYING state without your application having called a TTS function, be prepared to handle the new state. The TTS state can change if other applications request TTS play, the audio session requests TTS pause, or the TTS engine changes.
void start(tts_h tts) {
    int ret;
    ret = tts_play(tts);
    if (TTS_ERROR_NONE != ret) {
        // Error handling
    }
}
If you want to pause the process, use the tts_pause() function to change the state to TTS_STATE_PAUSED. You can resume playback using the tts_play() function.
// Named pause_playback() to avoid clashing with the POSIX pause() function
void pause_playback(tts_h tts) {
    int ret;
    ret = tts_pause(tts);
    if (TTS_ERROR_NONE != ret) {
        // Error handling
    }
}
To stop the playback, use the tts_stop() function. All the texts in the queue are removed. The state is changed to TTS_STATE_READY.
void stop(tts_h tts) {
    int ret;
    ret = tts_stop(tts);
    if (TTS_ERROR_NONE != ret) {
        // Error handling
    }
}
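The state changes described throughout this tutorial can be summarized as a small state machine. The following sketch models them in plain C so that it runs without the TTS library. The model_state and model_op enumerations and the model_apply() function are stand-ins defined here and are not the real tts_state_e values or TTS functions. The sketch assumes that stopping is accepted from both the playing and the paused state and that disconnecting is done from the ready state. Check the API reference for the exact preconditions of each function.

```c
#include <stdbool.h>

/* Stand-in names that mirror the states and calls described in the text.
   They are NOT the real tts_state_e values or TTS functions. */
typedef enum { ST_CREATED, ST_READY, ST_PLAYING, ST_PAUSED } model_state;
typedef enum { OP_PREPARE, OP_UNPREPARE, OP_PLAY, OP_PAUSE, OP_STOP } model_op;

/* Applies op to *state. Returns true and updates *state when the model
   allows the transition. Otherwise returns false and leaves *state unchanged. */
bool model_apply(model_state *state, model_op op)
{
    switch (op) {
    case OP_PREPARE:   /* Created -> ready */
        if (*state != ST_CREATED) return false;
        *state = ST_READY;
        return true;
    case OP_UNPREPARE: /* Ready -> created */
        if (*state != ST_READY) return false;
        *state = ST_CREATED;
        return true;
    case OP_PLAY:      /* Ready or paused -> playing */
        if (*state != ST_READY && *state != ST_PAUSED) return false;
        *state = ST_PLAYING;
        return true;
    case OP_PAUSE:     /* Playing -> paused */
        if (*state != ST_PLAYING) return false;
        *state = ST_PAUSED;
        return true;
    case OP_STOP:      /* Playing or paused -> ready (assumed) */
        if (*state != ST_PLAYING && *state != ST_PAUSED) return false;
        *state = ST_READY;
        return true;
    }
    return false;
}
```

A table like this can be useful in a state changed callback for checking whether a reported transition was caused by the application's own call or by an external event, such as another application requesting TTS play.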