hf_speech2text

class baf.nlp.speech2text.hf_speech2text.HFSpeech2Text(agent, model_name, load_from_pytorch=False, language=None)

A Hugging Face Speech2Text component.

It loads a Speech2Text Hugging Face model to perform the Speech2Text task.

Parameters:

agent (Agent) – the agent instance using this Speech2Text component
model_name (str) – the Hugging Face model name to load
load_from_pytorch (bool, optional, defaults to False) – whether to load the model weights from a PyTorch checkpoint file
language (str, optional) – the language to use for transcription

_from_pt

whether to load the model weights from a PyTorch checkpoint file

_model_name

the Hugging Face model name

_sampling_rate

the sampling rate of the audio data; it must match the sampling rate used to train the model

_forced_decoder_ids

the decoder token ids forced during generation
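
Because `_sampling_rate` must match the rate the model was trained with, it can help to check incoming audio before transcribing it. The following is a minimal sketch, not part of this class: it assumes the audio arrives as WAV-encoded bytes, and `EXPECTED_SAMPLING_RATE`, `make_silent_wav`, `wav_sampling_rate` and `matches_expected_rate` are hypothetical names introduced only for illustration. The value 16000 is an assumption, not a value taken from this component.

```python
import io
import wave

# Assumed value for illustration only; use the rate your model was trained with.
EXPECTED_SAMPLING_RATE = 16000


def make_silent_wav(sampling_rate: int, seconds: float = 0.1) -> bytes:
    """Build a short mono 16-bit WAV clip of silence, entirely in memory."""
    n_frames = int(sampling_rate * seconds)
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 2 bytes per sample (16-bit)
        wav_file.setframerate(sampling_rate)
        wav_file.writeframes(b"\x00\x00" * n_frames)
    return buffer.getvalue()


def wav_sampling_rate(speech: bytes) -> int:
    """Read the sampling rate from the header of WAV-encoded bytes."""
    with wave.open(io.BytesIO(speech), "rb") as wav_file:
        return wav_file.getframerate()


def matches_expected_rate(speech: bytes, expected: int = EXPECTED_SAMPLING_RATE) -> bool:
    """Return True when the clip's sampling rate equals the expected one."""
    return wav_sampling_rate(speech) == expected
```

A clip that fails this check would need to be resampled before being passed on; resampling itself is outside the scope of this sketch.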

speech2text(speech)

Transcribe a voice recording into its corresponding text representation.

Parameters: speech (bytes) – the recorded voice to transcribe
Returns: the speech transcription
Return type: str
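
The calling contract of `speech2text` (bytes in, str out) can be sketched as follows. This example does not load a real model: `Speech2TextLike`, `FakeSpeech2Text` and `transcribe` are hypothetical names used only to illustrate the signature documented above, and the stand-in returns a description of its input rather than an actual transcription.

```python
from typing import Protocol


class Speech2TextLike(Protocol):
    """Anything exposing the documented speech2text(speech) -> str method."""

    def speech2text(self, speech: bytes) -> str: ...


class FakeSpeech2Text:
    """Hypothetical stand-in that describes its input instead of transcribing it."""

    def speech2text(self, speech: bytes) -> str:
        return f"<{len(speech)} bytes of audio>"


def transcribe(component: Speech2TextLike, speech: bytes) -> str:
    """Validate the input type, then delegate to the component."""
    if not isinstance(speech, (bytes, bytearray)):
        raise TypeError("speech must be raw audio bytes")
    return component.speech2text(bytes(speech))
```

With a real component in place of `FakeSpeech2Text`, the same `transcribe` helper would return the model's transcription.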