LivingDataLab - Chat with Your Data using Memory and Langchain

1 Introduction

In this article we are going to give a chatbot memory, using LangChain, so that it can better answer follow-up questions about our data.

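Recall the overall workflow for retrieval augmented generation (RAG): documents are loaded, split into chunks, embedded and stored in a vector store; at query time the most relevant chunks are retrieved and passed to the LLM together with the question.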

2 Load Libs and Setup

import os
import openai
import sys
sys.path.append('../..')

import panel as pn  # GUI
pn.extension()

from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file

openai.api_key  = os.environ['OPENAI_API_KEY']

The code below pins the OpenAI model version until the older version is deprecated, which at the time of writing is scheduled for September 2023. LLM responses often vary between runs, and they may differ significantly between model versions.

import datetime
current_date = datetime.datetime.now().date()
if current_date < datetime.date(2023, 9, 2):
    llm_name = "gpt-3.5-turbo-0301"
else:
    llm_name = "gpt-3.5-turbo"
print(llm_name)

gpt-3.5-turbo-0301

If you wish to experiment on the LangChain Plus platform:

Go to the LangChain Plus platform and sign up
Create an API key from your account’s settings
Use this API key in the code below

#import os
#os.environ["LANGCHAIN_TRACING_V2"] = "true"
#os.environ["LANGCHAIN_ENDPOINT"] = "https://api.langchain.plus"
#os.environ["LANGCHAIN_API_KEY"] = "..."

First, we load our vector store, which holds the embeddings for all of the course materials. With the vector store we can run a simple similarity search. We then initialise the language model that will serve as the foundation for our chatbot. Finally, we initialise a prompt template, set up a retrieval QA (question answering) chain, and submit a query to receive an answer.

from langchain.vectorstores import Chroma
from langchain.embeddings.openai import OpenAIEmbeddings
persist_directory = 'docs/chroma/'
embedding = OpenAIEmbeddings()
vectordb = Chroma(persist_directory=persist_directory, embedding_function=embedding)

question = "What are major topics for this class?"
docs = vectordb.similarity_search(question,k=3)
len(docs)
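
To make the similarity search step more concrete, the following is a minimal, library-free sketch of what a vector store does conceptually: embed each text, then rank the stored texts by cosine similarity to the query. The bag-of-words `embed` function, the `similarity_search` helper and the sample chunks are illustrative assumptions; a real setup such as Chroma with OpenAIEmbeddings uses learned dense embeddings rather than word counts.

```python
import math
from collections import Counter

def embed(text):
    """Toy embedding: a bag-of-words count vector (an assumption for illustration)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def similarity_search(query, texts, k=3):
    """Return the k texts most similar to the query, best match first."""
    q = embed(query)
    ranked = sorted(texts, key=lambda t: cosine(q, embed(t)), reverse=True)
    return ranked[:k]

# Hypothetical stand-ins for chunks of course material.
chunks = [
    "this class covers supervised learning and regression",
    "probability and statistics are prerequisites for this class",
    "the weather today is sunny",
]
top = similarity_search("what topics does this class cover", chunks, k=2)
print(top)
```

The parameter `k` plays the same role here as in `vectordb.similarity_search(question, k=3)`: it caps how many chunks are returned.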

from langchain.chat_models import ChatOpenAI
llm = ChatOpenAI(model_name=llm_name, temperature=0)
llm.predict("Hello world!")

'Hello there! How can I assist you today?'

# Build prompt
from langchain.prompts import PromptTemplate
template = """Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer. Use three sentences maximum. Keep the answer as concise as possible. Always say "thanks for asking!" at the end of the answer. 
{context}
Question: {question}
Helpful Answer:"""
QA_CHAIN_PROMPT = PromptTemplate(input_variables=["context", "question"],template=template,)

# Run chain
from langchain.chains import RetrievalQA
question = "Is probability a class topic?"
qa_chain = RetrievalQA.from_chain_type(llm,
                                       retriever=vectordb.as_retriever(),
                                       return_source_documents=True,
                                       chain_type_kwargs={"prompt": QA_CHAIN_PROMPT})


result = qa_chain({"query": question})
result["result"]

'Yes, probability is assumed to be a prerequisite for the class. Thanks for asking!'
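
It can help to see the kind of text the chain ultimately sends to the model. The sketch below fills a shortened version of the template with plain `str.format`; `build_prompt` and the two sample chunks are hypothetical. It mirrors the idea behind the "stuff" approach, where the retrieved chunks are concatenated into `{context}`, and is not LangChain's actual implementation.

```python
TEMPLATE = """Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know.
{context}
Question: {question}
Helpful Answer:"""

def build_prompt(chunks, question):
    """Concatenate retrieved chunks into the context slot and fill the template."""
    context = "\n\n".join(chunks)
    return TEMPLATE.format(context=context, question=question)

# Hypothetical retrieved chunks, standing in for the retriever's output.
retrieved = [
    "Familiarity with basic probability is assumed.",
    "We will review linear algebra in the discussion sections.",
]
prompt = build_prompt(retrieved, "Is probability a class topic?")
print(prompt)
```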

3 Memory

Let’s add memory. We will use a conversation buffer memory, which keeps a list of the previous chat messages in a buffer and passes them to the chatbot along with each new question.

In particular, we set the memory key to chat_history, which aligns it with an input variable of the same name on the prompt. We also set return_messages to True, so that the chat history is returned as a list of messages rather than as a single string. This is the most basic memory type.

from langchain.memory import ConversationBufferMemory
memory = ConversationBufferMemory(
    memory_key="chat_history",
    return_messages=True
)
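
As a rough mental model of what a conversation buffer does, here is a plain-Python sketch. `ToyBufferMemory` and its methods are hypothetical names, not LangChain's API; the sketch only illustrates the difference between returning the history as a list of messages and returning it as a single string.

```python
class ToyBufferMemory:
    """Hypothetical stand-in for a conversation buffer: stores every turn verbatim."""

    def __init__(self, memory_key="chat_history", return_messages=True):
        self.memory_key = memory_key
        self.return_messages = return_messages
        self.messages = []  # list of (role, text) tuples

    def save_turn(self, question, answer):
        """Append one human/AI exchange to the buffer."""
        self.messages.append(("human", question))
        self.messages.append(("ai", answer))

    def load(self):
        """Return the history under memory_key, as a list or as one string."""
        if self.return_messages:
            history = list(self.messages)
        else:
            history = "\n".join(f"{role}: {text}" for role, text in self.messages)
        return {self.memory_key: history}

memory_sketch = ToyBufferMemory()
memory_sketch.save_turn("Is probability a class topic?", "Yes, it is assumed as a prerequisite.")
print(memory_sketch.load())
```

Because nothing is ever dropped or summarised, a buffer like this grows with every turn, which is why it is considered the simplest memory type.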

4 Conversational Retrieval Chain

Now let’s construct a new type of chain, the conversational retrieval chain. We pass in the language model, the retriever and the memory. Compared with the retrieval QA chain, the conversational retrieval chain adds not only memory but also a new step: it condenses the chat history and the new question into a single standalone question, which is sent to the vector store to look up relevant documents. Let’s give it a try. First we ask a question without any prior context and see what response we get; after that, we can ask a follow-up question about the response.

from langchain.chains import ConversationalRetrievalChain
retriever=vectordb.as_retriever()
qa = ConversationalRetrievalChain.from_llm(
    llm,
    retriever=retriever,
    memory=memory
)

question = "Is probability a class topic?"
result = qa({"question": question})

result['answer']

'Yes, probability is a topic in this class and the instructor assumes familiarity with basic probability and statistics.'

So we ask: is probability a class topic? The answer is that the instructor assumes students have a basic understanding of probability and statistics. Then we ask a follow-up: why are those prerequisites needed? Let’s look at the result. The answer no longer confuses the prerequisites with computer science, as it did previously; it correctly treats probability and statistics as the prerequisites and builds on them. Let’s then examine what happens under the hood in the tracing UI, where it is clear that this chain is a little more complex.

question = "why are those prerequesites needed?"
result = qa({"question": question})

result['answer']

'The reason for requiring familiarity with basic probability and statistics in this class is because the class assumes that students already know what random variables are, what expectation is, what a variance or a random variable is. The class will also use probability and statistics concepts throughout the course.'

In the trace we can see that the input to the chain now contains not only the question but also the chat history, which comes from memory: the history is loaded from memory and added to the inputs before the chain is invoked and logged in the tracing system. The trace shows two separate steps: first an LLM is called, and then the stuff documents chain is called. Let’s look at the first call. Here we see a prompt with some instructions: given the following conversation, rephrase the follow-up question to be a standalone question. Below that is the earlier history.

The history contains the initial question, “Is probability a class topic?”, followed by the assistant’s answer. Then comes the standalone question: why are basic probability and statistics prerequisites for the class? This standalone question is passed to the retriever, which returns four documents, three documents, or however many we specify. Those documents are then passed to the stuff documents chain, which tries to answer the question. If we look into that call, we can see a system message instructing the model to use the following pieces of context to answer the user’s question.

We’ve got a bunch of context, then the standalone question below it, and finally the answer, which is relevant to the question at hand: probability and statistics as prerequisites.
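
The two-step flow seen in the trace can be sketched without any LLM calls. In the sketch below, `fake_llm` and `fake_retriever` are hypothetical stubs that return canned values, and the prompt wording is approximated from the description above; only the order of operations is the point.

```python
CONDENSE_TEMPLATE = (
    "Given the following conversation and a follow up question, rephrase the "
    "follow up question to be a standalone question.\n\n"
    "Chat History:\n{chat_history}\n"
    "Follow Up Input: {question}\n"
    "Standalone question:"
)

def format_history(history):
    """Render (question, answer) pairs as alternating Human/Assistant lines."""
    return "\n".join(f"Human: {q}\nAssistant: {a}" for q, a in history)

def conversational_retrieval(question, history, llm, retriever):
    """Step 1: condense history + question. Step 2: retrieve and answer."""
    if history:
        standalone = llm(CONDENSE_TEMPLATE.format(
            chat_history=format_history(history), question=question))
    else:
        standalone = question  # nothing to condense on the first turn
    docs = retriever(standalone)
    context = "\n\n".join(docs)
    answer = llm(f"{context}\n\nQuestion: {standalone}\nHelpful Answer:")
    return {"generated_question": standalone, "answer": answer, "source_documents": docs}

# Stubs (assumptions): a real chain would call a chat model and a vector store here.
def fake_llm(prompt):
    if prompt.endswith("Standalone question:"):
        return "Why are probability and statistics prerequisites for the class?"
    return "Because the course uses these concepts throughout."

def fake_retriever(query):
    return ["The class assumes familiarity with basic probability and statistics."]

history = [("Is probability a class topic?", "Yes, it is assumed as a prerequisite.")]
out = conversational_retrieval("why are those prerequisites needed?", history, fake_llm, fake_retriever)
print(out["generated_question"])
print(out["answer"])
```

Note that the retriever is queried with the rewritten standalone question, not the raw follow-up, which is what lets a vague phrase like "those prerequisites" still find relevant documents.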

5 Create a chatbot that works on your documents

Now we load a database and build a retrieval chain. We pass in a file and load it with the PDF loader. The loaded documents are then split into chunks, and we create embeddings for the chunks and store them in a vector store.

from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.text_splitter import CharacterTextSplitter, RecursiveCharacterTextSplitter
from langchain.vectorstores import DocArrayInMemorySearch
from langchain.document_loaders import TextLoader, PyPDFLoader
from langchain.chains import RetrievalQA, ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory
from langchain.chat_models import ChatOpenAI

The vector store is then turned into a retriever. We use similarity search here, with search_kwargs={"k": k}, where k is a parameter we can pass in. Following that, we build the conversational retrieval chain.

def load_db(file, chain_type, k):
    # load documents
    loader = PyPDFLoader(file)
    documents = loader.load()
    # split documents
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=150)
    docs = text_splitter.split_documents(documents)
    # define embedding
    embeddings = OpenAIEmbeddings()
    # create vector database from data
    db = DocArrayInMemorySearch.from_documents(docs, embeddings)
    # define retriever
    retriever = db.as_retriever(search_type="similarity", search_kwargs={"k": k})
    # create a chatbot chain. Memory is managed externally.
    qa = ConversationalRetrievalChain.from_llm(
        llm=ChatOpenAI(model_name=llm_name, temperature=0), 
        chain_type=chain_type, 
        retriever=retriever, 
        return_source_documents=True,
        return_generated_question=True,
    )
    return qa
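
To illustrate what `chunk_size` and `chunk_overlap` control, here is a deliberately simplified sliding-window splitter. `split_with_overlap` is a hypothetical helper; the real `RecursiveCharacterTextSplitter` additionally tries to break on separators such as paragraph and line boundaries, so its chunks will differ.

```python
def split_with_overlap(text, chunk_size=1000, chunk_overlap=150):
    """Cut text into fixed-size character windows that overlap by chunk_overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks

sample = "abcdefghijklmnopqrstuvwxyz"
pieces = split_with_overlap(sample, chunk_size=10, chunk_overlap=3)
print(pieces)
# ['abcdefghij', 'hijklmnopq', 'opqrstuvwx', 'vwxyz']
```

The overlap means that a sentence straddling a chunk boundary still appears intact in at least one chunk, at the cost of embedding some text twice.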

import panel as pn
import param

class cbfs(param.Parameterized):
    chat_history = param.List([])
    answer = param.String("")
    db_query  = param.String("")
    db_response = param.List([])
    
    def __init__(self,  **params):
        super(cbfs, self).__init__( **params)
        self.panels = []
        self.loaded_file = "docs/MachineLearning-Lecture01.pdf"
        self.qa = load_db(self.loaded_file,"stuff", 4)
    
    def call_load_db(self, count):
        if count == 0 or file_input.value is None:  # init or no file specified :
            return pn.pane.Markdown(f"Loaded File: {self.loaded_file}")
        else:
            file_input.save("temp.pdf")  # local copy
            self.loaded_file = file_input.filename
            button_load.button_style="outline"
            self.qa = load_db("temp.pdf", "stuff", 4)
            button_load.button_style="solid"
        self.clr_history()
        return pn.pane.Markdown(f"Loaded File: {self.loaded_file}")

    def convchain(self, query):
        if not query:
            return pn.WidgetBox(pn.Row('User:', pn.pane.Markdown("", width=600)), scroll=True)
        result = self.qa({"question": query, "chat_history": self.chat_history})
        self.chat_history.extend([(query, result["answer"])])
        self.db_query = result["generated_question"]
        self.db_response = result["source_documents"]
        self.answer = result['answer'] 
        self.panels.extend([
            pn.Row('User:', pn.pane.Markdown(query, width=600)),
            pn.Row('ChatBot:', pn.pane.Markdown(self.answer, width=600, style={'background-color': '#F6F6F6'}))
        ])
        inp.value = ''  #clears loading indicator when cleared
        return pn.WidgetBox(*self.panels,scroll=True)

    @param.depends('db_query')
    def get_lquest(self):
        if not self.db_query:
            return pn.Column(
                pn.Row(pn.pane.Markdown("Last question to DB:", styles={'background-color': '#F6F6F6'})),
                pn.Row(pn.pane.Str("no DB accesses so far"))
            )
        return pn.Column(
            pn.Row(pn.pane.Markdown("DB query:", styles={'background-color': '#F6F6F6'})),
            pn.pane.Str(self.db_query)
        )

    @param.depends('db_response', )
    def get_sources(self):
        if not self.db_response:
            return 
        rlist=[pn.Row(pn.pane.Markdown(f"Result of DB lookup:", styles={'background-color': '#F6F6F6'}))]
        for doc in self.db_response:
            rlist.append(pn.Row(pn.pane.Str(doc)))
        return pn.WidgetBox(*rlist, width=600, scroll=True)

    @param.depends('convchain', 'clr_history') 
    def get_chats(self):
        if not self.chat_history:
            return pn.WidgetBox(pn.Row(pn.pane.Str("No History Yet")), width=600, scroll=True)
        rlist=[pn.Row(pn.pane.Markdown(f"Current Chat History variable", styles={'background-color': '#F6F6F6'}))]
        for exchange in self.chat_history:
            rlist.append(pn.Row(pn.pane.Str(exchange)))
        return pn.WidgetBox(*rlist, width=600, scroll=True)

    def clr_history(self,count=0):
        self.chat_history = []
        return

6 Create a chatbot

One key difference in the load_db function above is that we do not pass in memory. To keep the GUI below simple, we manage memory externally, which means the chat history has to be maintained outside the chain. There is a lot more code in the class above. We won’t dwell on it, but it is worth noting that convchain passes the chat history into the chain on every call, again because the chain has no memory attached, and then extends the chat history with the new question and answer. Putting everything together and running it gives us a nice UI for interacting with our chatbot.
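
The pattern of managing the history outside the chain can be sketched as follows. `stub_chain` is a hypothetical stand-in for the real chain and simply reports how many earlier turns it was given; the caller owns the list of `(question, answer)` tuples and extends it after each call.

```python
def stub_chain(inputs):
    """Hypothetical chain: echoes the question and how much history it received."""
    n = len(inputs["chat_history"])
    return {"answer": f"(answer to '{inputs['question']}' given {n} earlier turns)"}

chat_history = []  # owned by the caller, not by the chain

def ask(question):
    """Send the question plus the current history, then record the new exchange."""
    result = stub_chain({"question": question, "chat_history": chat_history})
    chat_history.append((question, result["answer"]))
    return result["answer"]

first = ask("Is probability a class topic?")
second = ask("Why are those prerequisites needed?")
print(second)

chat_history.clear()  # the equivalent of a "Clear History" button
```

Keeping the list in the caller makes it easy for a GUI to display or reset the history without reaching into the chain's internals.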

cb = cbfs()

file_input = pn.widgets.FileInput(accept='.pdf')
button_load = pn.widgets.Button(name="Load DB", button_type='primary')
button_clearhistory = pn.widgets.Button(name="Clear History", button_type='warning')
button_clearhistory.on_click(cb.clr_history)
inp = pn.widgets.TextInput( placeholder='Enter text here…')

bound_button_load = pn.bind(cb.call_load_db, button_load.param.clicks)
conversation = pn.bind(cb.convchain, inp) 

#jpg_pane = pn.pane.Image( './img/convchain.jpg')

tab1 = pn.Column(
    pn.Row(inp),
    pn.layout.Divider(),
    pn.panel(conversation,  loading_indicator=True, height=300),
    pn.layout.Divider(),
)
tab2= pn.Column(
    pn.panel(cb.get_lquest),
    pn.layout.Divider(),
    pn.panel(cb.get_sources ),
)
tab3= pn.Column(
    pn.panel(cb.get_chats),
    pn.layout.Divider(),
)
tab4=pn.Column(
    pn.Row( file_input, button_load, bound_button_load),
    pn.Row( button_clearhistory, pn.pane.Markdown("Clears chat history. Can use to start a new topic" )),
    pn.layout.Divider(),
    #pn.Row(jpg_pane.clone(width=400))
)
dashboard = pn.Column(
    pn.Row(pn.pane.Markdown('# ChatWithYourData_Bot')),
    pn.Tabs(('Conversation', tab1), ('Database', tab2), ('Chat History', tab3),('Configure', tab4))
)
dashboard

WARNING:param.Markdown00193: Setting non-parameter attribute styles={'background-color': '#F6F6F6'} using a mechanism intended only for parameters

You can try alternative memory and retriever types by changing the configuration in the load_db function and the convchain method. Panel and Param have many useful features and widgets you can use to extend the GUI.

7 Acknowledgements

I’d like to express my thanks to the wonderful LangChain: Chat with Your Data course by DeepLearning.AI and LangChain, which I completed, and to acknowledge the use of some images and other materials from the course in this article.
