This repository implements a question-answering (QA) system built on LangChain that lets you chat with multiple documents (PDF, TXT, etc.) as sources. It also includes a simple Streamlit app that provides a user-friendly interface.
Note: The conversation shown in the screenshot was based on a PDF from the UCSD International Student Office. The information displayed in the chat should not be taken as factual; it is for demonstration purposes only.
This repository includes the following unique features:
- **Persistent Database:** The database can optionally persist between sessions. By specifying a `persist_directory` when creating the database, you avoid recreating the index each time the code runs. To create a persistent database, use the following code:

  ```python
  vectordb = Chroma.from_documents(documents, embedding=embedding, persist_directory='db')
  ```
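
  On subsequent runs you can reload the persisted index instead of rebuilding it. A minimal sketch, assuming the index was previously written to `db` with the same embedding model:

  ```python
  from langchain.embeddings import OpenAIEmbeddings
  from langchain.vectorstores import Chroma

  # Reload the existing index from disk rather than re-embedding the documents
  # (assumes it was created earlier with persist_directory='db').
  embedding = OpenAIEmbeddings()
  vectordb = Chroma(persist_directory='db', embedding_function=embedding)
  ```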
- **Customizable Prompts:** The code shows exactly which prompts are sent to the language model (LLM) under the hood, so you can understand and modify them to tailor the responses to your use case. You can explore the prompts in the following code section:

  ```python
  # Print the chat prompts
  print(qa_chain.combine_documents_chain.llm_chain.prompt.messages[0].prompt.template)
  print(qa_chain.combine_documents_chain.llm_chain.prompt.messages[1].prompt.template)
  ```
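
  To change the prompt rather than just inspect it, one option is to pass your own template through `chain_type_kwargs`. This is only a sketch; the template wording below is an illustrative assumption, not the repository's prompt:

  ```python
  from langchain.chains import RetrievalQA
  from langchain.prompts import PromptTemplate

  # Hypothetical replacement prompt; a "stuff" chain expects {context} and {question}.
  template = (
      "Use the following context to answer the question. "
      "If you don't know the answer, say you don't know.\n\n"
      "{context}\n\nQuestion: {question}\nAnswer:"
  )
  custom_prompt = PromptTemplate(template=template, input_variables=["context", "question"])

  # turbo_llm and retriever are created as shown in the usage steps below.
  qa_chain = RetrievalQA.from_chain_type(
      llm=turbo_llm,
      chain_type="stuff",
      retriever=retriever,
      return_source_documents=True,
      chain_type_kwargs={"prompt": custom_prompt},
  )
  ```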
To use the code, follow these steps:

1. Load and process the documents:
   - If you have text files, use the `TextLoader` class. Update the file path in the `DirectoryLoader` constructor to the directory containing your text files:

     ```python
     loader = DirectoryLoader('/path/to/text/files/', glob="./*.txt", loader_cls=TextLoader)
     ```
   - If you have PDF files, use the `PyPDFLoader` class. Update the file path in the `DirectoryLoader` constructor to the directory containing your PDF files:

     ```python
     loader = DirectoryLoader('/path/to/pdf/files/', glob="./*.pdf", loader_cls=PyPDFLoader)
     ```
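
   After loading, the documents are typically split into smaller chunks before indexing. A minimal sketch, assuming the common `RecursiveCharacterTextSplitter` approach (the chunk sizes are illustrative assumptions):

   ```python
   from langchain.document_loaders import DirectoryLoader, TextLoader
   from langchain.text_splitter import RecursiveCharacterTextSplitter

   loader = DirectoryLoader('/path/to/text/files/', glob="./*.txt", loader_cls=TextLoader)
   raw_documents = loader.load()

   # Split into overlapping chunks so each piece fits in the model's context window.
   text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
   documents = text_splitter.split_documents(raw_documents)
   ```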
2. Create the document database:
   - To create a new database each time:

     ```python
     embedding = OpenAIEmbeddings()
     vectordb = Chroma.from_documents(documents, embedding=embedding, persist_directory=None)
     ```
   - To create a database that persists between sessions:

     ```python
     embedding = OpenAIEmbeddings()
     vectordb = Chroma.from_documents(documents, embedding=embedding, persist_directory='db')
     ```
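
   To sanity-check the index before wiring up the chain, you can run a raw similarity search. A small sketch (the query string is just an example):

   ```python
   # Retrieve the three chunks most similar to the query.
   docs = vectordb.similarity_search("What documents do I need?", k=3)
   for doc in docs:
       print(doc.metadata.get("source"), doc.page_content[:80])
   ```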
3. Create the question-answering chain:

   ```python
   # Use GPT-3.5 Turbo as the LLM and retrieve the 3 most relevant chunks per query
   turbo_llm = ChatOpenAI(temperature=0, model_name='gpt-3.5-turbo')
   retriever = vectordb.as_retriever(search_kwargs={"k": 3})
   qa_chain = RetrievalQA.from_chain_type(llm=turbo_llm,
                                          chain_type="stuff",
                                          retriever=retriever,
                                          return_source_documents=True)
   ```
4. Use the chat prompts to interact with the QA system:

   ```python
   # Print the chat prompts
   print(qa_chain.combine_documents_chain.llm_chain.prompt.messages[0].prompt.template)
   print(qa_chain.combine_documents_chain.llm_chain.prompt.messages[1].prompt.template)

   # Main loop for user input
   while True:
       query = input("Enter your query (or 'q' to quit): ")
       if query == 'q':
           break
       llm_response = qa_chain(query)
       process_llm_response(llm_response)
   ```
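
   The loop calls a `process_llm_response` helper defined in the script. If you are adapting the code, a minimal sketch of such a helper (the repository's exact formatting may differ):

   ```python
   def process_llm_response(llm_response):
       # Print the answer, then list the source documents it was drawn from.
       print(llm_response['result'])
       print('\nSources:')
       for source in llm_response['source_documents']:
           print(source.metadata.get('source'))
   ```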
5. Run the code:

   ```bash
   python your_script.py
   ```
The code also includes a simple Streamlit app for a more interactive experience (a minimal sketch of such an app follows the steps below). To run the Streamlit app, follow these steps:
1. Uncomment the necessary code lines in the provided script.
2. Run the Streamlit app:

   ```bash
   streamlit run app.py
   ```
3. Access the app in your browser by clicking the external URL printed in the terminal.
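
The repository's own app may differ, but as a rough sketch of how a Streamlit front end can wrap the same chain (the file name `app.py` matches the command above; the layout and widget choices are illustrative assumptions):

```python
import streamlit as st
from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma

st.title("Chat with your documents")

# Reload the persisted index built in the steps above (assumes persist_directory='db').
embedding = OpenAIEmbeddings()
vectordb = Chroma(persist_directory='db', embedding_function=embedding)
retriever = vectordb.as_retriever(search_kwargs={"k": 3})
qa_chain = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(temperature=0, model_name='gpt-3.5-turbo'),
    chain_type="stuff",
    retriever=retriever,
    return_source_documents=True,
)

query = st.text_input("Ask a question about your documents:")
if query:
    llm_response = qa_chain(query)
    st.write(llm_response["result"])
    with st.expander("Sources"):
        for doc in llm_response["source_documents"]:
            st.write(doc.metadata.get("source"))
```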