Speech-to-Text#
BAF allows you to use your voice to interact with the agents, transforming them into voicebots! To do this, it implements a Speech-to-Text component (also known as automatic speech recognition, STT or S2T). It solves the NLP task of transcribing an audio file. Then, the transcription is treated as a typical user text message.
Currently, BAF has 2 different implementations for speech-to-text:
With HuggingFace models (only tested with openai/whisper models). You need to set the
NLP_STT_HF_MODEL
agent property. Example model:openai/whisper-tiny
(very lightweight model)With the SpeechRecognition Python library. You need to set the
NLP_STT_SR_ENGINE
agent property.