Commit

🩹 broken link fix (lancedb#186)
PrashantDixit0 authored May 13, 2024
1 parent 7bdef20 commit bbc96f0
Showing 6 changed files with 50 additions and 36 deletions.
8 changes: 4 additions & 4 deletions README.md
@@ -61,7 +61,7 @@ If you're looking for in-depth tutorial-like examples, checkout the [tutorials](
| [Evaluating Prompts with Prompttools](/examples/prompttools-eval-prompts/) | <a href="https://colab.research.google.com/github/lancedb/vectordb-recipes/blob/main/examples/prompttools-eval-prompts/main.ipynb"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"></a> [![LLM](https://img.shields.io/badge/openai-api-white)](#) [![local LLM](https://img.shields.io/badge/local-llm-green)](#) [![advanced](https://img.shields.io/badge/advanced-FF3333)](#)| |
| [AI Agents: Reducing Hallucination](/examples/reducing_hallucinations_ai_agents/) | <a href="https://colab.research.google.com/github/lancedb/vectordb-recipes/blob/main/examples/reducing_hallucinations_ai_agents/main.ipynb"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"></a> [![Python](https://img.shields.io/badge/python-3670A0?style=for-the-badge&logo=python&logoColor=ffdd54)](./examples/reducing_hallucinations_ai_agents/main.py) [![JS](https://img.shields.io/badge/javascript-%23323330.svg?style=for-the-badge&logo=javascript&logoColor=%23F7DF1E)](./examples/reducing_hallucinations_ai_agents/index.js) [![LLM](https://img.shields.io/badge/openai-api-white)](#) [![advanced](https://img.shields.io/badge/advanced-FF3333)](#) |[![Ghost](https://img.shields.io/badge/ghost-000?style=for-the-badge&logo=ghost&logoColor=%23F7DF1E)](https://blog.lancedb.com/how-to-reduce-hallucinations-from-llm-powered-agents-using-long-term-memory-72f262c3cc1f/)|
| [AI Trends Searcher with CrewAI](./examples/AI-Trends-with-CrewAI/) |<a href="https://colab.research.google.com/github/lancedb/vectordb-recipes/blob/main/examples/AI-Trends-with-CrewAI/CrewAI_AI_Trends.ipynb"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"></a> [![LLM](https://img.shields.io/badge/openai-api-white)](#) [![beginner](https://img.shields.io/badge/beginner-B5FF33)](#)|[![Ghost](https://img.shields.io/badge/ghost-000?style=for-the-badge&logo=ghost&logoColor=%23F7DF1E)](https://blog.lancedb.com/track-ai-trends-crewai-agents-rag/)|
| [SuperAgent Autogen](/examples/SuperAgent_Autogen) |<a href="https://colab.research.google.com/github/lancedb/vectordb-recipes/blob/main/examples/SuperAgent_Autogen/main.ipynb"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"></a> [![LLM](https://img.shields.io/badge/openai-api-white)](#) [![intermediate](https://img.shields.io/badge/intermediate-FFDA33)](#)|[![Ghost](https://img.shields.io/badge/ghost-000?style=for-the-badge&logo=ghost&logoColor=%23F7DF1E)](https://blog.lancedb.com/optimizing-ai-agents-harnessing-openai-compatible-technologies-and-vector-databases)|
| [SuperAgent Autogen](/examples/SuperAgent_Autogen) |<a href="https://colab.research.google.com/github/lancedb/vectordb-recipes/blob/main/examples/SuperAgent_Autogen/main.ipynb"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"></a> [![LLM](https://img.shields.io/badge/openai-api-white)](#) [![intermediate](https://img.shields.io/badge/intermediate-FFDA33)](#)||
[Sentiment Analysis : Analysing Hotel Reviews](/examples/Sentiment-Analysis-Analyse-Hotel-Reviews/) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/lancedb/vectordb-recipes/blob/main/examples/Sentiment-Analysis-Analyse-Hotel-Reviews/Sentiment_Analysis_using_LanceDB.ipynb) [![local LLM](https://img.shields.io/badge/local-llm-green)](#) [![beginner](https://img.shields.io/badge/beginner-B5FF33)](#)| [![Ghost](https://img.shields.io/badge/ghost-000?style=for-the-badge&logo=ghost&logoColor=%23F7DF1E)](https://blog.lancedb.com/sentiment-analysis-using-lancedb-2da3cb1e3fa6)|
| [Facial Recognition](./examples/facial_recognition) | <a href="https://colab.research.google.com/github/lancedb/vectordb-recipes/blob/main/examples/facial_recognition/main.ipynb"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"></a> [![beginner](https://img.shields.io/badge/beginner-B5FF33)](#)|
| [Imagebind demo app](/examples/imagebind_demo/) | <a href="https://huggingface.co/spaces/raghavd99/imagebind2"><img src="https://huggingface.co/datasets/huggingface/brand-assets/resolve/main/hf-logo-with-title.svg" alt="hf spaces" style="width: 80px; vertical-align: middle; background-color: white;"></a> [![intermediate](https://img.shields.io/badge/intermediate-FFDA33)](#)|
@@ -101,14 +101,14 @@ Looking to get started with LLMs, vectorDBs, and the world of Generative AI? The
| [Local RAG from Scratch with Llama3](./tutorials/Local-RAG-from-Scratch) | [![Python](https://img.shields.io/badge/python-3670A0?style=for-the-badge&logo=python&logoColor=ffdd54)](./tutorials/Local-RAG-from-Scratch/rag.py) [![local LLM](https://img.shields.io/badge/local-llm-green)](#) [![beginner](https://img.shields.io/badge/beginner-B5FF33)](#)| |
| [A Primer on Text Chunking and its Types](./tutorials/different-types-text-chunking-in-RAG) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/lancedb/vectordb-recipes/blob/main/tutorials/different-types-text-chunking-in-RAG/Text_Chunking_on_RAG_application_with_LanceDB.ipynb) [![beginner](https://img.shields.io/badge/beginner-B5FF33)](#)| [![Ghost](https://img.shields.io/badge/ghost-000?style=for-the-badge&logo=ghost&logoColor=%23F7DF1E)](https://blog.lancedb.com/a-primer-on-text-chunking-and-its-types-a420efc96a13) |
| [Langchain LlamaIndex Chunking](./tutorials/Langchain-LlamaIndex-Chunking) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/lancedb/vectordb-recipes/blob/main/tutorials/Langchain-LlamaIndex-Chunking/Langchain_Llamaindex_chunking.ipynb) [![beginner](https://img.shields.io/badge/beginner-B5FF33)](#)| [![Ghost](https://img.shields.io/badge/ghost-000?style=for-the-badge&logo=ghost&logoColor=%23F7DF1E)](https://blog.lancedb.com/chunking-techniques-with-langchain-and-llamaindex/) |
| [Comparing Cohere Rerankers with LanceDB](./tutorials/cohere-reranker) | [![beginner](https://img.shields.io/badge/beginner-B5FF33)](#)| [![Ghost](https://img.shields.io/badge/ghost-000?style=for-the-badge&logo=ghost&logoColor=%23F7DF1E)]() |
| [Comparing Cohere Rerankers with LanceDB](./tutorials/cohere-reranker) | [![beginner](https://img.shields.io/badge/beginner-B5FF33)](#)| [![Ghost](https://img.shields.io/badge/ghost-000?style=for-the-badge&logo=ghost&logoColor=%23F7DF1E)](https://blog.lancedb.com/benchmarking-cohere-reranker-with-lancedb/) |
| [NER powered Semantic Search](./tutorials/NER-powered-Semantic-Search) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/lancedb/vectordb-recipes/blob/main/tutorials/NER-powered-Semantic-Search/NER_powered_Semantic_Search_with_LanceDB.ipynb) [![local LLM](https://img.shields.io/badge/local-llm-green)](#) [![beginner](https://img.shields.io/badge/beginner-B5FF33)](#)| [![Ghost](https://img.shields.io/badge/ghost-000?style=for-the-badge&logo=ghost&logoColor=%23F7DF1E)](https://blog.lancedb.com/ner-powered-semantic-search-using-lancedb-51051dc3e493) |
| [Product Quantization: Compress High Dimensional Vectors](https://blog.lancedb.com/product-quantization-compress-high-dimensional-vectors-dfcba98fab47) |[![intermediate](https://img.shields.io/badge/intermediate-FFDA33)](#) | [![Ghost](https://img.shields.io/badge/ghost-000?style=for-the-badge&logo=ghost&logoColor=%23F7DF1E)](https://blog.lancedb.com/product-quantization-compress-high-dimensional-vectors-dfcba98fab47) |
| [Product Quantization: Compress High Dimensional Vectors](https://blog.lancedb.com/benchmarking-lancedb-92b01032874a-2/) |[![intermediate](https://img.shields.io/badge/intermediate-FFDA33)](#) | [![Ghost](https://img.shields.io/badge/ghost-000?style=for-the-badge&logo=ghost&logoColor=%23F7DF1E)](https://blog.lancedb.com/benchmarking-lancedb-92b01032874a-2/) |
| [Corrective RAG with Langgraph](./tutorials/Corrective-RAG-with_Langgraph/) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/lancedb/vectordb-recipes/blob/main/tutorials/Corrective-RAG-with_Langgraph/CRAG_with_Langgraph.ipynb) [![LLM](https://img.shields.io/badge/openai-api-white)](#) [![intermediate](https://img.shields.io/badge/intermediate-FFDA33)](#)| [![Ghost](https://img.shields.io/badge/ghost-000?style=for-the-badge&logo=ghost&logoColor=%23F7DF1E)](https://blog.lancedb.com/implementing-corrective-rag-in-the-easiest-way-2/)|
| [LLMs, RAG, & the missing storage layer for AI](https://blog.lancedb.com/llms-rag-the-missing-storage-layer-for-ai-28ded35fa984) | [![intermediate](https://img.shields.io/badge/intermediate-FFDA33)](#)| [![Ghost](https://img.shields.io/badge/ghost-000?style=for-the-badge&logo=ghost&logoColor=%23F7DF1E)](https://blog.lancedb.com/llms-rag-the-missing-storage-layer-for-ai-28ded35fa984/) |
| [Fine-Tuning LLM using PEFT & QLoRA](./tutorials/fine-tuning_LLM_with_PEFT_QLoRA) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/lancedb/vectordb-recipes/blob/main/tutorials/fine-tuning_LLM_with_PEFT_QLoRA/main.ipynb) [![local LLM](https://img.shields.io/badge/local-llm-green)](#) [![advanced](https://img.shields.io/badge/advanced-FF3333)](#)| [![Ghost](https://img.shields.io/badge/ghost-000?style=for-the-badge&logo=ghost&logoColor=%23F7DF1E)](https://blog.lancedb.com/optimizing-llms-a-step-by-step-guide-to-fine-tuning-with-peft-and-qlora-22eddd13d25b) |
| [Context-Aware Chatbot using Llama 2 & LanceDB](./tutorials/chatbot_using_Llama2_&_lanceDB) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/lancedb/vectordb-recipes/blob/main/tutorials/chatbot_using_Llama2_&_lanceDB/main.ipynb) [![local LLM](https://img.shields.io/badge/local-llm-green)](#) [![advanced](https://img.shields.io/badge/advanced-FF3333)](#)| [![Ghost](https://img.shields.io/badge/ghost-000?style=for-the-badge&logo=ghost&logoColor=%23F7DF1E)](https://blog.lancedb.com/context-aware-chatbot-using-llama-2-lancedb-as-vector-database-4d771d95c755) |
| [Better RAG with FLARE](./tutorials/better-rag-FLAIR) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/lancedb/vectordb-recipes/blob/main/tutorials/better-rag-FLAIR/main.ipynb) [![local LLM](https://img.shields.io/badge/local-llm-green)](#) [![LLM](https://img.shields.io/badge/openai-api-white)](#) [![advanced](https://img.shields.io/badge/advanced-FF3333)](#)|[![Ghost](https://img.shields.io/badge/ghost-000?style=for-the-badge&logo=ghost&logoColor=%23F7DF1E)](https://medium.com/@aksdesai1998/better-rag-enhancing-ai-with-active-retrieval-augmented-generation-flare-3b66646e2a9f) |
| [Better RAG with FLARE](./tutorials/better-rag-FLAIR) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/lancedb/vectordb-recipes/blob/main/tutorials/better-rag-FLAIR/main.ipynb) [![local LLM](https://img.shields.io/badge/local-llm-green)](#) [![LLM](https://img.shields.io/badge/openai-api-white)](#) [![advanced](https://img.shields.io/badge/advanced-FF3333)](#)|[![Ghost](https://img.shields.io/badge/ghost-000?style=for-the-badge&logo=ghost&logoColor=%23F7DF1E)](https://blog.lancedb.com/better-rag-with-active-retrieval-augmented-generation-flare-3b66646e2a9f/) |



27 changes: 15 additions & 12 deletions applications/Healthcare_chatbot/main.py
@@ -28,26 +28,24 @@

app.add_middleware(
CORSMiddleware,
allow_origins=["*"],
allow_origins=["*"],
allow_credentials=True,
allow_methods=["*"],
allow_headers=["*"],
allow_methods=["*"],
allow_headers=["*"],
)


# Load the document
DATA_PATH = "data/"


loader = DirectoryLoader(DATA_PATH,
glob='*.pdf',
loader_cls=PyPDFLoader)
loader = DirectoryLoader(DATA_PATH, glob="*.pdf", loader_cls=PyPDFLoader)

docs = loader.load()
logging.info("Document loader done.")

# Set up the text processing and model chain
#llm = ChatOpenAI(model="gpt-4", temperature=0, openai_api_key=OPENAI_API_KEY)
# llm = ChatOpenAI(model="gpt-4", temperature=0, openai_api_key=OPENAI_API_KEY)

# download weights from https://huggingface.co/PrunaAI/OpenBioLLM-Llama3-8B-GGUF-smashed/tree/main
llm = LlamaCpp(
@@ -57,7 +55,9 @@
verbose=False, # Verbose is required to pass to the callback manager
)

embeddings_med = SentenceTransformerEmbeddings(model_name="NeuML/pubmedbert-base-embeddings")
embeddings_med = SentenceTransformerEmbeddings(
model_name="NeuML/pubmedbert-base-embeddings"
)
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)

logging.info("Embedding and LLM setup done.")
@@ -68,7 +68,9 @@
logging.info("Retriever setup done.")

compressor = CohereRerank(cohere_api_key=COHERE_API_KEY)
compression_retriever = ContextualCompressionRetriever(base_compressor=compressor, base_retriever=retriever)
compression_retriever = ContextualCompressionRetriever(
base_compressor=compressor, base_retriever=retriever
)
logging.info("Cohere compression retriever setup done.")

chain = RetrievalQA.from_chain_type(llm=llm, retriever=compression_retriever)
@@ -80,17 +82,18 @@
class QueryRequest(BaseModel):
query: str


@app.post("/query/", response_model=dict)
async def handle_query(request: QueryRequest):
try:
compressed_docs = compression_retriever.invoke(request.query)
# Assuming pretty_print_docs function returns a string
response = chain({"query": request.query})
print("response",response['result'])
return {"answer": response['result']}
print("response", response["result"])
return {"answer": response["result"]}
except Exception as e:
raise HTTPException(status_code=500, detail=str(e))


if __name__ == "__main__":
uvicorn.run(app, host="0.0.0.0", port=8000)
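
For reference, once this service is running (the script launches uvicorn on 0.0.0.0:8000), the `/query/` endpoint can be exercised with a minimal client along the lines of the sketch below. The host, port, and sample question are assumptions; the request field and the `answer` response key come from the handler above.

```python
# Minimal client sketch for the /query/ endpoint defined in main.py.
# Assumes the API is reachable at localhost:8000; the question is illustrative.
import requests

resp = requests.post(
    "http://localhost:8000/query/",
    json={"query": "What are common side effects of ibuprofen?"},
    timeout=300,  # local LlamaCpp generation can be slow
)
resp.raise_for_status()
print(resp.json()["answer"])
```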

Binary file added assets/critique-based-contexting.png
2 changes: 1 addition & 1 deletion examples/databricks_DBRX_website_bot/main.py
@@ -15,7 +15,7 @@ def get_doc_from_url(url):
def build_RAG(
url="https://harrypotter.fandom.com/wiki/Hogwarts_School_of_Witchcraft_and_Wizardry",
embed_model="mixedbread-ai/mxbai-embed-large-v1",
uri="~/tmp/lancedb_hogwarts_12",
uri="~/tmp/lancedb_hogwart",
force_create_embeddings=False,
):
Settings.embed_model = HuggingFaceEmbedding(model_name=embed_model)
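The only change here is the default LanceDB URI in `build_RAG`. For orientation, a hypothetical invocation might look like the sketch below; only the keyword defaults visible in this hunk are known, so the chosen argument values are assumptions.

```python
# Hypothetical call; rebuilds the index at the new default location from this commit.
build_RAG(
    url="https://harrypotter.fandom.com/wiki/Hogwarts_School_of_Witchcraft_and_Wizardry",
    embed_model="mixedbread-ai/mxbai-embed-large-v1",
    uri="~/tmp/lancedb_hogwart",       # new default path introduced by this change
    force_create_embeddings=True,      # force a rebuild at the new path (assumption)
)
```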
2 changes: 1 addition & 1 deletion examples/reducing_hallucinations_ai_agents/README.md
@@ -4,7 +4,7 @@ AI agents can help simplify and automate tedious workflows. By going through thi

Colab walkthrough - <a href="https://colab.research.google.com/github/lancedb/vectordb-recipes/blob/main/examples/reducing_hallucinations_ai_agents/main.ipynb"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"></a>

![Untitled (34)](https://github.com/lancedb/vectordb-recipes/assets/15766192/e87d5fcc-6f04-4592-b9ec-0156ee2c98df)
![alt text](../../assets/critique-based-contexting.png)


### Setup
47 changes: 29 additions & 18 deletions tutorials/cohere-reranker/main.py
@@ -23,16 +23,16 @@ def evaluate(
query_type="auto",
verbose=False,
):
#corpus = dataset['corpus']
#queries = dataset['queries']
#relevant_docs = dataset['relevant_docs']
# corpus = dataset['corpus']
# queries = dataset['queries']
# relevant_docs = dataset['relevant_docs']

vector_store = LanceDBVectorStore(uri=f"/tmp/lancedb_cohere-bench-{time.time()}")
storage_context = StorageContext.from_defaults(vector_store=vector_store)
service_context = ServiceContext.from_defaults(embed_model=embed_model)
index = VectorStoreIndex.from_documents(
docs,
service_context=service_context,
service_context=service_context,
show_progress=True,
storage_context=storage_context,
)
@@ -42,37 +42,48 @@ def evaluate(
eval_results = []
ds = dataset.to_pandas()
for idx in tqdm(range(len(ds))):
query = ds['query'][idx]
reference_context = ds['reference_contexts'][idx]
query = ds["query"][idx]
reference_context = ds["reference_contexts"][idx]
query_vector = embed_model.get_query_embedding(query)
try:
if reranker is None:
rs = tbl.search(query_vector).limit(top_k).to_pandas()
elif query_type == "auto":
rs = tbl.search((query_vector, query)).rerank(reranker=reranker).limit(top_k).to_pandas()
rs = (
tbl.search((query_vector, query))
.rerank(reranker=reranker)
.limit(top_k)
.to_pandas()
)
elif query_type == "vector":
rs = tbl.search(query_vector).rerank(reranker=reranker, query_string=query).limit(top_k*2).to_pandas() # Overfetch for vector only reranking
rs = (
tbl.search(query_vector)
.rerank(reranker=reranker, query_string=query)
.limit(top_k * 2)
.to_pandas()
) # Overfetch for vector only reranking
except Exception as e:
print(f'Error with query: {idx} {e}')
print(f"Error with query: {idx} {e}")
continue
retrieved_texts = rs['text'].tolist()[:top_k]
retrieved_texts = rs["text"].tolist()[:top_k]
expected_text = reference_context[0]
is_hit = expected_text in retrieved_texts # assume 1 relevant doc
eval_result = {
'is_hit': is_hit,
'retrieved': retrieved_texts,
'expected': expected_text,
'query': query,
"is_hit": is_hit,
"retrieved": retrieved_texts,
"expected": expected_text,
"query": query,
}
eval_results.append(eval_result)
return eval_results


rag_dataset = LabelledRagDataset.from_json("./data/rag_dataset.json")
documents = SimpleDirectoryReader(input_dir="./data/source_files").load_data()

embed_models = {
"bge": HuggingFaceEmbedding(model_name="BAAI/bge-large-en-v1.5"),
"colbert": HuggingFaceEmbedding(model_name="colbert-ir/colbertv2.0")
"bge": HuggingFaceEmbedding(model_name="BAAI/bge-large-en-v1.5"),
"colbert": HuggingFaceEmbedding(model_name="colbert-ir/colbertv2.0"),
}
rerankers = {
"None": None,
@@ -93,7 +104,7 @@
verbose=True,
)
print(f" Embedder {embed_name} Reranker: {reranker_name}")
score = pd.DataFrame(eval_results)['is_hit'].mean()
score = pd.DataFrame(eval_results)["is_hit"].mean()
print(score)
scores[reranker_name] = score

@@ -108,6 +119,6 @@
verbose=True,
)
print(f"Embedder {embed_name} Reranker: {reranker_name} (vector)")
score = pd.DataFrame(eval_results)['is_hit'].mean()
score = pd.DataFrame(eval_results)["is_hit"].mean()
print(score)
scores[f"{reranker_name}_vector"] = score
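
Outside the benchmark loop, the reranked-search pattern that this file reformats can be sketched on its own. The snippet below mirrors the "auto" (hybrid) branch of `evaluate()`; the database path, table name, and query text are illustrative assumptions, and the exact import path for `HuggingFaceEmbedding` depends on the installed llama-index version.

```python
# Standalone sketch of the hybrid-search-plus-Cohere-rerank pattern used in evaluate().
import lancedb
from lancedb.rerankers import CohereReranker
from llama_index.embeddings.huggingface import HuggingFaceEmbedding  # import path varies by version

db = lancedb.connect("/tmp/lancedb_cohere-bench-demo")   # illustrative path
tbl = db.open_table("vectors")                           # assumes the table written by LanceDBVectorStore
embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-large-en-v1.5")
reranker = CohereReranker()                              # expects COHERE_API_KEY in the environment

query = "What does this benchmark measure?"              # illustrative query
query_vector = embed_model.get_query_embedding(query)

# "auto" branch: search with both the vector and the raw query string, then rerank.
hits = (
    tbl.search((query_vector, query))
    .rerank(reranker=reranker)
    .limit(5)
    .to_pandas()
)
print(hits["text"].tolist())
```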
