text_preprocessing#

besser.agent.nlp.preprocessing.text_preprocessing.lemmatize_lux_text(text)[source]#
besser.agent.nlp.preprocessing.text_preprocessing.process_text(text, nlp_engine)[source]#
besser.agent.nlp.preprocessing.text_preprocessing.stem_text(text, language)[source]#
besser.agent.nlp.preprocessing.text_preprocessing.tokenize(text, language='en')[source]#

Tokenize a text (i.e., split it into tokens).

Parameters:
  • text (str) – the text to tokenize

  • language (str) – the text language (defaults to English, 'en')

Returns:

list of tokens

Return type:

list(str)
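This page does not include a usage example. As a rough illustration of the tokenize contract (text in, list of string tokens out), the following is a minimal self-contained sketch; it uses a simple regex split and is not the library's actual implementation, which may delegate to a language-aware tokenizer.

```python
import re

def tokenize_sketch(text: str, language: str = "en") -> list:
    # Hypothetical stand-in for tokenize(text, language='en'):
    # word characters are grouped into tokens, and each punctuation
    # character becomes its own token; whitespace is discarded.
    # The `language` parameter is accepted but unused in this sketch.
    return re.findall(r"\w+|[^\w\s]", text)

print(tokenize_sketch("Hello, world!"))  # ['Hello', ',', 'world', '!']
```

A real tokenizer would additionally handle language-specific cases (contractions, abbreviations, etc.), which is presumably why tokenize takes a language argument.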