rag#
- class besser.agent.nlp.rag.rag.RAG(agent, vector_store, splitter, llm_name, llm_prompt=None, k=4, num_previous_messages=0)[source]#
Bases:
object
A Retrieval Augmented Generation (RAG) implementation.
A vector store contains vectorized representations (i.e., embeddings) of chunks of data (text). For a given input query, a retriever gets the k stored embeddings most similar to the input embedding.
This is typically used to retrieve, given a query or question, the chunks that could help answer it. Then, an LLM generates that answer, given the original query and the retrieved data as context.
- Parameters:
agent (Agent) – the agent the RAG engine belongs to
vector_store (langchain_core.vectorstores.base.VectorStore) – the vector store of the RAG engine
splitter (langchain_text_splitters.base.TextSplitter) – the text splitter of the RAG engine
llm_name (str) – the name of the LLM of the RAG engine. It must have been previously created and assigned to the agent
llm_prompt (str) – the prompt containing the detailed instructions for the answer generation by the LLM. If none is provided, the default prompt will be used
k (int) – number of chunks to retrieve from the vector store
num_previous_messages (int) – number of previous messages of the conversation to add to the LLM prompt context. Requires a connection to MonitoringDB.
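For illustration, a minimal construction sketch follows. It assumes a Chroma vector store with OpenAI embeddings (requiring an OpenAI API key), LangChain's recursive character splitter, and BESSER's LLMOpenAI wrapper; any LangChain-compatible vector store and splitter, and any LLM previously assigned to the agent, could be substituted. All names are illustrative.

```python
from besser.agent.core.agent import Agent
from besser.agent.nlp.llm.llm_openai_api import LLMOpenAI
from besser.agent.nlp.rag.rag import RAG
from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

agent = Agent('example_agent')

# The LLM must be created and assigned to the agent before the RAG engine uses it
llm = LLMOpenAI(agent=agent, name='gpt-4o-mini', parameters={})

# Vector store holding the embedded chunks (Chroma is one possible backend)
vector_store = Chroma(
    embedding_function=OpenAIEmbeddings(),
    persist_directory='./rag_db',
)

# Splitter used to chunk documents before they are embedded and stored
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)

rag = RAG(
    agent=agent,
    vector_store=vector_store,
    splitter=splitter,
    llm_name='gpt-4o-mini',    # name of the LLM created above
    k=4,                       # retrieve the 4 most similar chunks per query
    num_previous_messages=0,   # no chat history added to the prompt
)
```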
- _nlp_engine#
the NLPEngine that handles the NLP processes of the agent the RAG engine belongs to
- Type:
NLPEngine
- vector_store#
the vector store of the RAG engine
- splitter#
the text splitter of the RAG engine
- llm_name#
the name of the LLM of the RAG engine. It must have been previously created and assigned to the agent
- Type:
str
- llm_prompt#
the prompt containing the detailed instructions for the answer generation by the LLM. If none is provided, the default prompt will be used
- Type:
str
- num_previous_messages#
number of previous messages of the conversation to add to the LLM prompt context. Requires a connection to MonitoringDB.
- Type:
int
- DEFAULT_LLM_PROMPT = "You are an assistant for question-answering tasks. Based on the previous messages in the conversation (if provided), and additional context retrieved from a database (if provided), answer the user question. If you don't know the answer, just say that you don't know. Note that if the question refers to a previous message, you may have to ignore the context since it is retrieved from the database based only on the question (the retrieval does not take into account the previous messages). Use three sentences maximum and keep the answer concise"#
- create_prompt(history, docs, question, llm_prompt=None)[source]#
Creates the prompt for the LLM answer generation.
- Parameters:
history (list) – the chat history (previous messages of the conversation) to include in the prompt as context
docs (list[langchain_core.documents.base.Document]) – the retrieved documents to use as context in the prompt
question (str) – the user question
llm_prompt (str) – the prompt containing the detailed instructions for the answer generation by the LLM. If none is provided, the RAG's default value will be used
- Returns:
the LLM prompt
- Return type:
str
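A sketch of assembling a prompt directly, reusing the rag engine from the construction sketch above. The document content is illustrative, and passing an empty list as history when no chat history is needed is an assumption:

```python
from langchain_core.documents import Document

# Retrieved chunks would normally come from the vector store retriever;
# this hand-built document is purely illustrative
docs = [Document(page_content='BESSER is a low-code development platform.')]

prompt = rag.create_prompt(
    history=[],   # assumption: an empty list when no chat history is used
    docs=docs,
    question='What is BESSER?',
)
```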
- run(message, session=None, llm_prompt=None, llm_name=None, k=None, num_previous_messages=None)[source]#
Run the RAG engine.
- Parameters:
message (str) – the message to be used as the RAG query
session (Session) – the session of the user that started this request. Must be provided if the chat history is to be added as context to the LLM prompt
llm_prompt (str) – the prompt containing the detailed instructions for the answer generation by the LLM. If none is provided, the RAG's default value will be used
llm_name (str) – the name of the LLM to use. If none is provided, the RAG's default value will be used
k (int) – the number of (top) documents to retrieve. If none is provided, the RAG's default value will be used
num_previous_messages (int) – number of previous messages of the conversation to add to the LLM prompt context. If none is provided, the RAG's default value will be used. Requires a connection to MonitoringDB.
- Returns:
the resulting RAG message
- Return type:
RAGMessage
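A usage sketch, reusing the rag engine built above. The message text and per-call overrides are illustrative, and the session-based variant assumes a MonitoringDB connection:

```python
# Answer a one-off question, overriding the RAG defaults for this call only
rag_message = rag.run(
    message='What is BESSER?',
    k=2,   # retrieve only the top-2 chunks for this query
)

# To add chat history to the prompt, pass the user's session
# (requires a MonitoringDB connection):
# rag_message = rag.run(message='And who maintains it?', session=session,
#                       num_previous_messages=3)
```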