text_preprocessing#

besser.agent.nlp.preprocessing.text_preprocessing.lemmatize_lux_text(text)[source]#
besser.agent.nlp.preprocessing.text_preprocessing.process_text(text, nlp_engine)[source]#
besser.agent.nlp.preprocessing.text_preprocessing.stem_text(text, language)[source]#
besser.agent.nlp.preprocessing.text_preprocessing.tokenize(text, language='en')[source]#

Tokenize a text (i.e., split it into tokens).

Parameters:
  • text (str) – the text to tokenize

  • language (str) – the text language (defaults to English, 'en')

Returns:

list of tokens

Return type:

list(str)
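This page does not include a usage example. As a rough illustration of the tokenize contract (text in, list of string tokens out), the following is a minimal self-contained sketch; it uses a simple regex split and is not the library's actual implementation, which may delegate to a language-aware tokenizer.

```python
import re

def tokenize_sketch(text: str, language: str = "en") -> list:
    # Hypothetical stand-in for tokenize(text, language='en'):
    # word characters are grouped into tokens, and each punctuation
    # character becomes its own token; whitespace is discarded.
    # The `language` parameter is accepted but unused in this sketch.
    return re.findall(r"\w+|[^\w\s]", text)

print(tokenize_sketch("Hello, world!"))  # ['Hello', ',', 'world', '!']
```

A real tokenizer would additionally handle language-specific cases (contractions, abbreviations, etc.), which is presumably why tokenize takes a language argument.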