text2speech

class baf.nlp.text2speech.text2speech.Text2Speech(agent, language=None)

Bases: ABC

The Text2Speech abstract class.

The Text2Speech component, also known as TTS or speech synthesis, converts written text into audio speech signals.

We can use it in an agent to let users send text messages and have them synthesized into audio speech signals that resemble regular spoken language.

Parameters:
  • agent (Agent) – The Agent the Text2Speech system belongs to

  • language (str) – The user language for the Text2Speech system

_nlp_engine

The NLPEngine that handles the NLP processes of the agent

abstract text2speech(text)

Synthesize a text into its corresponding audio speech signal.

Parameters:

text (str) – the text to synthesize

Returns:

the speech synthesis as a dictionary containing 2 keys:

  • audio (np.ndarray) – the generated audio waveform as a NumPy array with dimensions (nb_channels, audio_length), where nb_channels is the number of audio channels (usually 1 for mono) and audio_length is the number of samples in the audio

  • sampling_rate (int) – the sampling rate, i.e. how many samples correspond to one second of audio

Return type:

dict
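
The return contract above can be illustrated with a minimal sketch. It does not import baf: `Text2SpeechSketch` is a hypothetical stand-in that only mirrors the documented signature, and `ToneText2Speech`, `SAMPLING_RATE`, `SAMPLES_PER_CHAR` and `duration_seconds` are names invented for this example. A real subclass would derive from `Text2Speech` and call an actual synthesis model instead of generating a tone.

```python
from abc import ABC, abstractmethod

import numpy as np


class Text2SpeechSketch(ABC):
    """Hypothetical stand-in mirroring the documented Text2Speech interface."""

    def __init__(self, agent, language=None):
        self.agent = agent
        self.language = language

    @abstractmethod
    def text2speech(self, text: str) -> dict:
        """Return a dict with the keys 'audio' and 'sampling_rate'."""


class ToneText2Speech(Text2SpeechSketch):
    """Toy implementation: emits a 440 Hz tone, 0.1 s per input character."""

    SAMPLING_RATE = 16000    # samples per second (assumed value)
    SAMPLES_PER_CHAR = 1600  # 0.1 s of audio per character at 16 kHz

    def text2speech(self, text: str) -> dict:
        n_samples = len(text) * self.SAMPLES_PER_CHAR
        # Time axis in seconds, one entry per sample.
        t = np.arange(n_samples) / self.SAMPLING_RATE
        mono = 0.5 * np.sin(2 * np.pi * 440.0 * t)
        # Add the channel axis so the shape is (nb_channels, audio_length).
        audio = mono[np.newaxis, :].astype(np.float32)
        return {"audio": audio, "sampling_rate": self.SAMPLING_RATE}


def duration_seconds(speech: dict) -> float:
    """Audio length in seconds: number of samples divided by sampling rate."""
    return speech["audio"].shape[1] / speech["sampling_rate"]


# agent=None is acceptable only because the stand-in never uses the agent.
tts = ToneText2Speech(agent=None, language="en")
speech = tts.text2speech("hello")
```

Here `speech["audio"]` has one channel and `len("hello") * 1600` samples, so `duration_seconds(speech)` divides that sample count by the sampling rate to recover the length in seconds.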