text2speech#
- class baf.nlp.text2speech.text2speech.Text2Speech(agent, language=None)[source]#
Bases: ABC
The Text2Speech abstract class.
The Text2Speech component, also known as TTS or speech synthesis, is in charge of converting written text into audio speech signals. This task is called synthesizing or speech synthesis.
We can use it in an agent to let users send text messages and have them synthesized into audio speech signals that resemble regular spoken language.
- Parameters:
- _nlp_engine#
The NLPEngine that handles the NLP processes of the agent
- _abc_impl = <_abc._abc_data object>#
- abstract text2speech(text)[source]#
Synthesize a text into its corresponding audio speech signal.
- Parameters:
text (str) – the text to be synthesized
- Returns:
- the speech synthesis as a dictionary containing 2 keys:
- audio (np.ndarray): the generated audio waveform as a numpy array with dimensions (nb_channels, audio_length), where nb_channels is the number of audio channels (usually 1 for mono) and audio_length is the number of samples in the audio
- sampling_rate (int): the sampling rate, i.e. how many samples correspond to one second of audio
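The return contract described above can be illustrated with a minimal sketch. It does not import the framework: `Text2SpeechSketch` is a hypothetical stand-in that only mirrors the documented `text2speech(text)` signature, and `ToneText2Speech` and `duration_seconds` are invented names for illustration. The toy implementation emits a sine tone instead of real speech; it is meant only to show the expected dictionary keys and the `(nb_channels, audio_length)` array shape.

```python
from abc import ABC, abstractmethod

import numpy as np


class Text2SpeechSketch(ABC):
    """Hypothetical stand-in mirroring the documented text2speech(text) signature."""

    @abstractmethod
    def text2speech(self, text: str) -> dict:
        """Return a dict with the keys 'audio' and 'sampling_rate'."""


class ToneText2Speech(Text2SpeechSketch):
    """Toy implementation (assumption): a sine tone whose length grows with the text."""

    def __init__(self, sampling_rate: int = 16000):
        self.sampling_rate = sampling_rate

    def text2speech(self, text: str) -> dict:
        # 1/20 s of audio per character, computed with integer arithmetic
        n_samples = len(text) * self.sampling_rate // 20
        t = np.arange(n_samples) / self.sampling_rate
        waveform = 0.5 * np.sin(2 * np.pi * 440.0 * t)
        # Add the channel axis so the shape is (nb_channels, audio_length) = (1, n_samples)
        audio = waveform[np.newaxis, :].astype(np.float32)
        return {"audio": audio, "sampling_rate": self.sampling_rate}


def duration_seconds(speech: dict) -> float:
    """Audio length in seconds: number of samples divided by samples per second."""
    _nb_channels, audio_length = speech["audio"].shape
    return audio_length / speech["sampling_rate"]


speech = ToneText2Speech().text2speech("Hello world")
print(speech["audio"].shape, speech["sampling_rate"], duration_seconds(speech))
```

Because sampling_rate is the number of samples per second, dividing audio_length by it gives the duration of the clip, which is a quick sanity check for any concrete implementation.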
- Return type: