diff --git a/docs/source/use_cases/rag_documents.rst b/docs/source/use_cases/rag_documents.rst
deleted file mode 100644
index 26c8a430..00000000
--- a/docs/source/use_cases/rag_documents.rst
+++ /dev/null
@@ -1,322 +0,0 @@
-.. raw:: html
-
-
-
-RAG for documents
-=============================
-
-Overview
---------
-
-This implementation showcases an end-to-end RAG system capable of handling large-scale text files and
-generating context-aware responses. It is both modular and extensible, making it adaptable to various
-use cases and LLM APIs.
-
-**Imports**
-
-- **SentenceTransformer**: Used for creating dense vector embeddings for textual data.
-- **FAISS**: Provides efficient similarity search using vector indexing.
-- **tiktoken**: Counts tokens so that the retrieved context can be kept within a fixed token budget before it is sent to the language model.
-- **GroqAPIClient and OpenAIClient**: Custom classes for interacting with different LLM providers.
-- **ModelType**: Enum for specifying the model type.
-
-.. code-block:: python
-
- import os
- import tiktoken
- from typing import List, Dict, Tuple
- import numpy as np
- from sentence_transformers import SentenceTransformer
- from faiss import IndexFlatL2
-
- from adalflow.components.model_client import GroqAPIClient, OpenAIClient
- from adalflow.core.types import ModelType
- from adalflow.utils import setup_env
-
-The ``AdalflowRAGPipeline`` class sets up the Retrieval-Augmented Generation (RAG) pipeline. Its ``__init__`` method initializes key components:
-
-- An embedding model (``all-MiniLM-L6-v2``) is loaded using ``SentenceTransformer`` to convert text into dense vector embeddings with a dimensionality of 384.
-- A FAISS index (``IndexFlatL2``) is created for similarity-based document retrieval.
-- Parameters such as ``top_k_retrieval`` (number of documents to retrieve) and ``max_context_tokens`` (limit on token count in the context) are configured.
-- A tokenizer (``tiktoken``) ensures precise token counting, crucial for handling large language models (LLMs).
-
-The method also initializes storage for documents, their embeddings, and associated metadata for efficient management and retrieval.
-
-Beyond initialization, the class exposes one method per pipeline stage. The ``load_text_file`` method reads
-a text file and splits it into chunks of 10 lines each (the hard-coded ``chunk_size`` can be adjusted to the
-file structure), so that each chunk can be embedded and stored separately. To handle multiple files,
-``add_documents_from_directory`` iterates over the ``.txt`` files in a directory, embeds every chunk, and
-adds it to the FAISS index; the chunk text, its embedding, and a metadata entry (file name and chunk index)
-are appended to parallel lists. The ``count_tokens`` method returns the number of tokens in a text under
-the ``cl100k_base`` encoding. The ``retrieve_and_truncate_context`` method embeds the query, fetches the
-``top_k_retrieval`` nearest chunks from the FAISS index, and appends them in ranked order until the next
-chunk would exceed ``max_context_tokens``; chunks are kept or dropped whole, never cut partway. Finally,
-the ``generate_response`` method builds a prompt from the retrieved context and the query, converts it to
-provider-specific arguments with ``convert_inputs_to_api_kwargs``, calls the model client, and parses the
-completion with ``parse_chat_completion``. Because every provider goes through the same code path,
-switching between LLM APIs only requires passing a different client and a matching ``model_kwargs``
-dictionary.
-
-
-.. code-block:: python
-
- class AdalflowRAGPipeline:
-     def __init__(self,
-                  model_client=None,
-                  model_kwargs=None,
-                  embedding_model='all-MiniLM-L6-v2',
-                  vector_dim=384,
-                  top_k_retrieval=3,
-                  max_context_tokens=800):
-         """
-         Initialize RAG Pipeline for handling large text files
-
-         Args:
-             embedding_model (str): Sentence transformer model for embeddings
-             vector_dim (int): Dimension of embedding vectors
-             top_k_retrieval (int): Number of documents to retrieve
-             max_context_tokens (int): Maximum tokens to send to LLM
-         """
-         # Initialize model client for generation
-         self.model_client = model_client
-
-         # Initialize tokenizer for precise token counting
-         self.tokenizer = tiktoken.get_encoding("cl100k_base")
-
-         # Initialize embedding model
-         self.embedding_model = SentenceTransformer(embedding_model)
-
-         # Initialize FAISS index for vector similarity search
-         self.index = IndexFlatL2(vector_dim)
-
-         # Store document texts, embeddings, and metadata
-         self.documents = []
-         self.document_embeddings = []
-         self.document_metadata = []
-
-         # Retrieval and context management parameters
-         self.top_k_retrieval = top_k_retrieval
-         self.max_context_tokens = max_context_tokens
-
-         # Model generation parameters
-         self.model_kwargs = model_kwargs
-
-     def load_text_file(self, file_path: str) -> List[str]:
-         """
-         Load a large text file and split into manageable chunks
-
-         Args:
-             file_path (str): Path to the text file
-
-         Returns:
-             List[str]: List of document chunks
-         """
-         with open(file_path, 'r', encoding='utf-8') as file:
-             # Read entire file
-             content = file.read()
-
-         # Split content into chunks (e.g., 10 lines per chunk)
-         lines = content.split('\n')
-         chunks = []
-         chunk_size = 10  # Adjust based on your file structure
-
-         for i in range(0, len(lines), chunk_size):
-             chunk = '\n'.join(lines[i:i+chunk_size])
-             chunks.append(chunk)
-
-         return chunks
-
-     def add_documents_from_directory(self, directory_path: str):
-         """
-         Add documents from all text files in a directory
-
-         Args:
-             directory_path (str): Path to directory containing text files
-         """
-         for filename in os.listdir(directory_path):
-             if filename.endswith('.txt'):
-                 file_path = os.path.join(directory_path, filename)
-                 document_chunks = self.load_text_file(file_path)
-
-                 for chunk in document_chunks:
-                     # Embed document chunk
-                     embedding = self.embedding_model.encode(chunk)
-
-                     # Add to index and document store
-                     self.index.add(np.array([embedding]))
-                     self.documents.append(chunk)
-                     self.document_embeddings.append(embedding)
-                     self.document_metadata.append({
-                         'filename': filename,
-                         'chunk_index': len(self.document_metadata)
-                     })
-
-     def count_tokens(self, text: str) -> int:
-         """
-         Count tokens in a given text
-
-         Args:
-             text (str): Input text
-
-         Returns:
-             int: Number of tokens
-         """
-         return len(self.tokenizer.encode(text))
-
-     def retrieve_and_truncate_context(self, query: str) -> str:
-         """
-         Retrieve relevant documents and truncate to fit token limit
-
-         Args:
-             query (str): Input query
-
-         Returns:
-             str: Concatenated context within token limit
-         """
-         # Retrieve relevant documents
-         query_embedding = self.embedding_model.encode(query)
-         distances, indices = self.index.search(
-             np.array([query_embedding]),
-             self.top_k_retrieval
-         )
-
-         # Collect whole chunks in ranked order until the token budget is reached
-         context = []
-         current_tokens = 0
-
-         for idx in (i for i in indices[0] if i >= 0):  # FAISS pads missing hits with -1
-             doc = self.documents[idx]
-             doc_tokens = self.count_tokens(doc)
-
-             # Check if adding this document would exceed token limit
-             if current_tokens + doc_tokens <= self.max_context_tokens:
-                 context.append(doc)
-                 current_tokens += doc_tokens
-             else:
-                 break
-
-         return "\n\n".join(context)
-
-     def generate_response(self, query: str) -> str:
-         """
-         Generate a response using retrieval-augmented generation
-
-         Args:
-             query (str): User's input query
-
-         Returns:
-             str: Generated response incorporating retrieved context
-         """
-         # Retrieve and truncate context
-         retrieved_context = self.retrieve_and_truncate_context(query)
-
-         # Construct context-aware prompt
-         full_prompt = f"""
-         Context Documents:
-         {retrieved_context}
-
-         Query: {query}
-
-         Generate a comprehensive response that:
-         1. Directly answers the query
-         2. Incorporates relevant information from the context documents
-         3. Provides clear and concise information
-         """
-
-         # Prepare API arguments
-         api_kwargs = self.model_client.convert_inputs_to_api_kwargs(
-             input=full_prompt,
-             model_kwargs=self.model_kwargs,
-             model_type=ModelType.LLM
-         )
-
-         # Call API and parse response (parse_chat_completion returns a GeneratorOutput)
-         response = self.model_client.call(
-             api_kwargs=api_kwargs,
-             model_type=ModelType.LLM
-         )
-         response_text = self.model_client.parse_chat_completion(response)
-
-         return response_text
-
-The ``run_rag_pipeline`` function demonstrates how to use the ``AdalflowRAGPipeline``. It initializes the pipeline,
-adds documents from a directory, and generates responses for a list of user queries. The function is generic
-and can accommodate various LLM API clients, such as GroqAPIClient or OpenAIClient, highlighting the pipeline's
-flexibility and modularity.
-
-
-.. code-block:: python
-
- def run_rag_pipeline(model_client, model_kwargs, documents, queries):
-
-     # Example usage of RAG pipeline
-     rag_pipeline = AdalflowRAGPipeline(
-         model_client=model_client,
-         model_kwargs=model_kwargs,
-         top_k_retrieval=1,  # Retrieve only the single most relevant chunk
-         max_context_tokens=800  # Limit context to 800 tokens
-     )
-
-     # Add documents from a directory of text files
-     rag_pipeline.add_documents_from_directory(documents)
-
-     # Generate responses
-     for query in queries:
-         print(f"\nQuery: {query}")
-         response = rag_pipeline.generate_response(query)
-         print(f"Response: {response}")
-
-
-This block provides an example of running the pipeline with different models and queries. It specifies:
-
-- The document directory containing the text files.
-- Example queries about topics such as the "Crystal Cavern" and "rare trees in Elmsworth."
-- Configuration for Groq and OpenAI model parameters, including the model name, temperature, and maximum number of generated tokens.
-
-.. code-block:: python
-
- documents = '../../tutorials/assets/documents'
-
- queries = [
- "What year was the Crystal Cavern discovered?",
- "What is the name of the rare tree in Elmsworth?",
- "What local legend claim that Lunaflits surrounds?"
- ]
-
- groq_model_kwargs = {
-     "model": "llama-3.2-1b-preview",
-     "temperature": 0.1,
-     "max_tokens": 800,
- }
-
- openai_model_kwargs = {
-     "model": "gpt-3.5-turbo",
-     "temperature": 0.1,
-     "max_tokens": 800,
- }
- # The example below shows that adalflow can be used in a generic manner with any API provider,
- # without provider-specific prompt construction or response parsing
- run_rag_pipeline(GroqAPIClient(), groq_model_kwargs, documents, queries)
- run_rag_pipeline(OpenAIClient(), openai_model_kwargs, documents, queries)
-
-The example emphasizes that ``AdalflowRAGPipeline`` can interact seamlessly with multiple API providers,
-enabling integration with diverse LLMs without modifying the core logic for prompt construction or
-response parsing.
-
-
-.. admonition:: API reference
- :class: highlight
-
- - :class:`utils.setup_env`
- - :class:`core.types.ModelType`
- - :class:`components.model_client.OpenAIClient`
- - :class:`components.model_client.GroqAPIClient`
diff --git a/docs/source/use_cases/rag_vanilla.rst b/docs/source/use_cases/rag_vanilla.rst
deleted file mode 100644
index 1a80bbbf..00000000
--- a/docs/source/use_cases/rag_vanilla.rst
+++ /dev/null
@@ -1,263 +0,0 @@
-.. raw:: html
-
-
-
-RAG Vanilla
-=============================
-
-Overview
---------
-
-The **RAG Vanilla** implementation is a Retrieval-Augmented Generation pipeline that integrates document
-retrieval with natural language generation. This approach allows users to retrieve contextually relevant
-documents from a knowledge base and generate informative responses. The code leverages components such as
-sentence embeddings, FAISS indexing, and a large language model (LLM) API client.
-
-
-**Imports**
-
-- **SentenceTransformer**: Used for creating dense vector embeddings for textual data.
-- **FAISS**: Provides efficient similarity search using vector indexing.
-- **GroqAPIClient and OpenAIClient**: Custom classes for interacting with different LLM providers.
-- **ModelType**: Enum for specifying the model type.
-
-.. code-block:: python
-
- import os
- from typing import List, Dict
- import numpy as np
- from sentence_transformers import SentenceTransformer
- from faiss import IndexFlatL2
-
- from adalflow.components.model_client import GroqAPIClient, OpenAIClient
- from adalflow.core.types import ModelType
- from adalflow.utils import setup_env
-
-
-**Pipeline Initialization**
-
-- **Model Client**: Abstracts the API calls to the chosen LLM provider.
-- **Embedding Model**: Defaults to ``all-MiniLM-L6-v2``, which generates 384-dimensional embeddings.
-- **Vector Dimension**: Dimensionality of the FAISS index; it must match the output size of the embedding model (384 for ``all-MiniLM-L6-v2``).
-- **Top K Retrieval**: Specifies the number of most relevant documents to retrieve.
-
-The ``add_documents()`` function encodes the documents into embeddings and stores them in the FAISS index. It
-uses the SentenceTransformer model to generate vector representations of the text. These embeddings are then
-added to the FAISS index for efficient similarity search and are also stored in a list for later retrieval.
-
-The ``retrieve_relevant_docs()`` function takes a query string as input and retrieves the top k documents
-that are most relevant to the query based on their similarity. The query is first encoded into an embedding,
-and then the FAISS index is used to perform a similarity search to identify the documents that are closest
-in meaning to the query.
-
-The ``generate_response()`` function constructs a context-aware prompt by combining the retrieved documents
-with the user's query. It then calls ``model_client`` to generate a response from the language model. Each
-query and its response are also appended to ``conversation_history``; this history is only a log and is not
-fed back into later prompts.
-
-
-.. code-block:: python
-
- class AdalflowRAGPipeline:
-     def __init__(self,
-                  model_client=None,
-                  model_kwargs=None,
-                  embedding_model='all-MiniLM-L6-v2',
-                  vector_dim=384,
-                  top_k_retrieval=1):
-         """
-         Initialize RAG Pipeline with embedding and retrieval components
-
-         Args:
-             embedding_model (str): Sentence transformer model for embeddings
-             vector_dim (int): Dimension of embedding vectors
-             top_k_retrieval (int): Number of documents to retrieve
-         """
-         # Initialize model client for generation
-         self.model_client = model_client
-
-         # Initialize embedding model
-         self.embedding_model = SentenceTransformer(embedding_model)
-
-         # Initialize FAISS index for vector similarity search
-         self.index = IndexFlatL2(vector_dim)
-
-         # Store document texts and their embeddings
-         self.documents = []
-         self.document_embeddings = []
-
-         # Retrieval parameters
-         self.top_k_retrieval = top_k_retrieval
-
-         # Conversation history and context
-         self.conversation_history = ""
-         self.model_kwargs = model_kwargs
-
-     def add_documents(self, documents: List[str]):
-         """
-         Add documents to the RAG pipeline's knowledge base
-
-         Args:
-             documents (List[str]): List of document texts to add
-         """
-         for doc in documents:
-             # Embed document
-             embedding = self.embedding_model.encode(doc)
-
-             # Add to index and document store
-             self.index.add(np.array([embedding]))
-             self.documents.append(doc)
-             self.document_embeddings.append(embedding)
-
-     def retrieve_relevant_docs(self, query: str) -> List[str]:
-         """
-         Retrieve most relevant documents for a given query
-
-         Args:
-             query (str): Input query to find relevant documents
-
-         Returns:
-             List[str]: Top k most relevant documents
-         """
-         # Embed query
-         query_embedding = self.embedding_model.encode(query)
-
-         # Perform similarity search
-         distances, indices = self.index.search(
-             np.array([query_embedding]),
-             self.top_k_retrieval
-         )
-
-         # Return top documents, skipping the -1 padding FAISS uses for missing hits
-         return [self.documents[i] for i in indices[0] if i >= 0]
-
-     def generate_response(self, query: str) -> str:
-         """
-         Generate a response using retrieval-augmented generation
-
-         Args:
-             query (str): User's input query
-
-         Returns:
-             str: Generated response incorporating retrieved context
-         """
-         # Retrieve relevant documents
-         retrieved_docs = self.retrieve_relevant_docs(query)
-
-         # Construct context-aware prompt
-         context = "\n\n".join([f"Context Document: {doc}" for doc in retrieved_docs])
-         full_prompt = f"""
-         Context:
-         {context}
-
-         Query: {query}
-
-         Generate a comprehensive and informative response that:
-         1. Uses the provided context documents
-         2. Directly answers the query
-         3. Incorporates relevant information from the context
-         """
-
-         # Prepare API arguments
-         api_kwargs = self.model_client.convert_inputs_to_api_kwargs(
-             input=full_prompt,
-             model_kwargs=self.model_kwargs,
-             model_type=ModelType.LLM
-         )
-
-         # Call API and parse response
-         response = self.model_client.call(
-             api_kwargs=api_kwargs,
-             model_type=ModelType.LLM
-         )
-         response_text = self.model_client.parse_chat_completion(response)
-
-         # Update conversation history
-         self.conversation_history += f"\nQuery: {query}\nResponse: {response_text}"
-
-         return response_text
-
-**Running the Pipeline**
-
-- **Pipeline Workflow**:
- 1. Initializes the ``AdalflowRAGPipeline``.
- 2. Adds documents to the knowledge base.
- 3. Processes each query to retrieve documents and generate responses.
-
-
-.. code-block:: python
-
- def run_rag_pipeline(model_client, model_kwargs, documents, queries):
-     rag_pipeline = AdalflowRAGPipeline(model_client=model_client, model_kwargs=model_kwargs)
-
-     rag_pipeline.add_documents(documents)
-
-     # Generate responses
-     for query in queries:
-         print(f"\nQuery: {query}")
-         response = rag_pipeline.generate_response(query)
-         print(f"Response: {response}")
-
-- **Documents**: Serve as the knowledge base for validation.
-- **Queries**: Designed to test retrieval and generation specific to document content.
-
-.. code-block:: python
-
- # Statements about ajithvcoder are included so that we can verify that the LLM answers from these lines only
- documents = [
-     "ajithvcoder is a good person whom the world knows as Ajith Kumar; ajithvcoder is the nickname that Ajith Kumar gave himself.",
-     "The Eiffel Tower is a famous landmark in Paris, built in 1889 for the World's Fair.",
-     "ajithvcoder likes Hyderabadi paneer dum biryani very much.",
-     "The Louvre Museum in Paris is the world's largest art museum, housing thousands of works of art.",
-     "ajithvcoder has an engineering degree; he graduated in May 2016."
- ]
-
- # The questions about ajithvcoder let us verify
- # that the LLM answers from the lines given above only
- queries = [
-     "Does Ajith Kumar have a nickname?",
-     "What is ajithvcoder's favourite food?",
-     "When did ajithvcoder graduate?"
- ]
-
-**API Integration**
-
-- **Generic API Client**: Demonstrates flexibility in using different LLM APIs like Groq and OpenAI without altering the core pipeline logic.
-
-.. code-block:: python
-
- groq_model_kwargs = {
-     "model": "llama-3.2-1b-preview",
-     "temperature": 0.1,
-     "max_tokens": 800,
- }
-
- openai_model_kwargs = {
-     "model": "gpt-3.5-turbo",
-     "temperature": 0.1,
-     "max_tokens": 800,
- }
-
- # The example below shows that adalflow can be used in a generic manner with any API provider,
- # without provider-specific prompt construction or response parsing
- model_client = GroqAPIClient()
- run_rag_pipeline(model_client, groq_model_kwargs, documents, queries)
- run_rag_pipeline(OpenAIClient(), openai_model_kwargs, documents, queries)
-
-
-.. admonition:: API reference
- :class: highlight
-
- - :class:`utils.setup_env`
- - :class:`core.types.ModelType`
- - :class:`components.model_client.OpenAIClient`
- - :class:`components.model_client.GroqAPIClient`
diff --git a/notebooks/tutorials/adalflow_rag_documents.ipynb b/notebooks/tutorials/adalflow_rag_documents.ipynb
deleted file mode 100644
index 373f6bae..00000000
--- a/notebooks/tutorials/adalflow_rag_documents.ipynb
+++ /dev/null
@@ -1,443 +0,0 @@
-{
- "cells": [
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- ""
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "# 🤗 Welcome to AdalFlow!\n",
- "## The PyTorch library to auto-optimize any LLM task pipelines\n",
- "\n",
- "Thanks for trying us out! We're here to provide you with the best LLM application development experience you can dream of 😊 If you have any questions or concerns, [come talk to us on Discord](https://discord.gg/ezzszrRZvT); we're always here to help! ⭐ Star us on GitHub ⭐\n",
- "\n",
- "\n",
- "# Quick Links\n",
- "\n",
- "Github repo: https://github.com/SylphAI-Inc/AdalFlow\n",
- "\n",
- "Full Tutorials: https://adalflow.sylph.ai/index.html#.\n",
- "\n",
- "Deep dive on each API: check out the [developer notes](https://adalflow.sylph.ai/tutorials/index.html).\n",
- "\n",
- "Common use cases along with the auto-optimization: check out [Use cases](https://adalflow.sylph.ai/use_cases/index.html).\n",
- "\n",
- "# Author\n",
- "This notebook was created by community contributor [Ajith](https://github.com/ajithvcoder/).\n",
- "\n",
- "# Outline\n",
- "\n",
- "This is a quick introduction to what AdalFlow is capable of. We will cover:\n",
- "\n",
- "* How to use AdalFlow for RAG over documents\n",
- "\n",
- "AdalFlow can be used in a generic manner with any API provider, without worrying much about prompts,\n",
- "model arguments, or result parsing.\n",
- "\n",
- "**Next: Try our [adalflow-text-splitter](https://colab.research.google.com/github.com/SylphAI-Inc/AdalFlow/blob/main/notebooks/tutorials/adalflow_text_splitter.ipynb)**\n",
- "\n",
- "\n",
- "# Installation\n",
- "\n",
- "1. Use `pip` to install the `adalflow` Python package. We will need `openai`, `groq`, and `faiss` (CPU version) from the extra packages.\n",
- "\n",
- " ```bash\n",
- " pip install torch --index-url https://download.pytorch.org/whl/cpu\n",
- " pip install sentence-transformers==3.3.1\n",
- " pip install adalflow[openai,groq,faiss-cpu]\n",
- " ```\n",
- "2. Set up the `openai` and `groq` API keys as environment variables."
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### Set Environment Variables\n",
- "\n",
- "Note: Enter your API keys in the cell below."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "Overwriting .env\n"
- ]
- }
- ],
- "source": [
- "%%writefile .env\n",
- "\n",
- "OPENAI_API_KEY=\"PASTE-OPENAI_API_KEY_HERE\"\n",
- "GROQ_API_KEY=\"PASTE-GROQ_API_KEY-HERE\""
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 2,
- "metadata": {},
- "outputs": [],
- "source": [
- "from adalflow.utils import setup_env\n",
- "\n",
- "# Load environment variables. Make sure OPENAI_API_KEY and GROQ_API_KEY are set in the .env file and that .env is in the current folder\n",
- "setup_env(\".env\")"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 3,
- "metadata": {},
- "outputs": [
- {
- "name": "stderr",
- "output_type": "stream",
- "text": [
- "/workspace/ajithdev/AdalFlow/.venv/lib/python3.12/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n",
- " from .autonotebook import tqdm as notebook_tqdm\n"
- ]
- }
- ],
- "source": [
- "import os\n",
- "import tiktoken\n",
- "from typing import List, Dict, Tuple\n",
- "import numpy as np\n",
- "from sentence_transformers import SentenceTransformer\n",
- "from faiss import IndexFlatL2\n",
- "\n",
- "from adalflow.components.model_client import GroqAPIClient, OpenAIClient\n",
- "from adalflow.core.types import ModelType\n",
- "from adalflow.utils import setup_env"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "`AdalflowRAGPipeline` is a class that implements a Retrieval-Augmented Generation (RAG) pipeline over documents with AdalFlow. It:\n",
- "\n",
- "- Loads large text files, embeds them in chunks, and retrieves the most relevant chunks.\n",
- "- Counts tokens and truncates the retrieved context to fit the LLM's token budget.\n",
- "- Generates responses grounded in the retrieved context."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 4,
- "metadata": {},
- "outputs": [],
- "source": [
- "class AdalflowRAGPipeline:\n",
- "    def __init__(self,\n",
- "                 model_client=None,\n",
- "                 model_kwargs=None,\n",
- "                 embedding_model='all-MiniLM-L6-v2',\n",
- "                 vector_dim=384,\n",
- "                 top_k_retrieval=3,\n",
- "                 max_context_tokens=800):\n",
- "        \"\"\"\n",
- "        Initialize RAG Pipeline for handling large text files\n",
- "\n",
- "        Args:\n",
- "            embedding_model (str): Sentence transformer model for embeddings\n",
- "            vector_dim (int): Dimension of embedding vectors\n",
- "            top_k_retrieval (int): Number of documents to retrieve\n",
- "            max_context_tokens (int): Maximum tokens to send to LLM\n",
- "        \"\"\"\n",
- "        # Initialize model client for generation\n",
- "        self.model_client = model_client\n",
- "\n",
- "        # Initialize tokenizer for precise token counting\n",
- "        self.tokenizer = tiktoken.get_encoding(\"cl100k_base\")\n",
- "\n",
- "        # Initialize embedding model\n",
- "        self.embedding_model = SentenceTransformer(embedding_model)\n",
- "\n",
- "        # Initialize FAISS index for vector similarity search\n",
- "        self.index = IndexFlatL2(vector_dim)\n",
- "\n",
- "        # Store document texts, embeddings, and metadata\n",
- "        self.documents = []\n",
- "        self.document_embeddings = []\n",
- "        self.document_metadata = []\n",
- "\n",
- "        # Retrieval and context management parameters\n",
- "        self.top_k_retrieval = top_k_retrieval\n",
- "        self.max_context_tokens = max_context_tokens\n",
- "\n",
- "        # Model generation parameters\n",
- "        self.model_kwargs = model_kwargs\n",
- "\n",
- "    def load_text_file(self, file_path: str) -> List[str]:\n",
- "        \"\"\"\n",
- "        Load a large text file and split into manageable chunks\n",
- "\n",
- "        Args:\n",
- "            file_path (str): Path to the text file\n",
- "\n",
- "        Returns:\n",
- "            List[str]: List of document chunks\n",
- "        \"\"\"\n",
- "        with open(file_path, 'r', encoding='utf-8') as file:\n",
- "            # Read entire file\n",
- "            content = file.read()\n",
- "\n",
- "        # Split content into chunks (e.g., 10 lines per chunk)\n",
- "        lines = content.split('\\n')\n",
- "        chunks = []\n",
- "        chunk_size = 10  # Adjust based on your file structure\n",
- "\n",
- "        for i in range(0, len(lines), chunk_size):\n",
- "            chunk = '\\n'.join(lines[i:i+chunk_size])\n",
- "            chunks.append(chunk)\n",
- "\n",
- "        return chunks\n",
- "\n",
- "    def add_documents_from_directory(self, directory_path: str):\n",
- "        \"\"\"\n",
- "        Add documents from all text files in a directory\n",
- "\n",
- "        Args:\n",
- "            directory_path (str): Path to directory containing text files\n",
- "        \"\"\"\n",
- "        for filename in os.listdir(directory_path):\n",
- "            if filename.endswith('.txt'):\n",
- "                file_path = os.path.join(directory_path, filename)\n",
- "                document_chunks = self.load_text_file(file_path)\n",
- "\n",
- "                for chunk in document_chunks:\n",
- "                    # Embed document chunk\n",
- "                    embedding = self.embedding_model.encode(chunk)\n",
- "\n",
- "                    # Add to index and document store\n",
- "                    self.index.add(np.array([embedding]))\n",
- "                    self.documents.append(chunk)\n",
- "                    self.document_embeddings.append(embedding)\n",
- "                    self.document_metadata.append({\n",
- "                        'filename': filename,\n",
- "                        'chunk_index': len(self.document_metadata)\n",
- "                    })\n",
- "\n",
- "    def count_tokens(self, text: str) -> int:\n",
- "        \"\"\"\n",
- "        Count tokens in a given text\n",
- "\n",
- "        Args:\n",
- "            text (str): Input text\n",
- "\n",
- "        Returns:\n",
- "            int: Number of tokens\n",
- "        \"\"\"\n",
- "        return len(self.tokenizer.encode(text))\n",
- "\n",
- "    def retrieve_and_truncate_context(self, query: str) -> str:\n",
- "        \"\"\"\n",
- "        Retrieve relevant documents and truncate to fit token limit\n",
- "\n",
- "        Args:\n",
- "            query (str): Input query\n",
- "\n",
- "        Returns:\n",
- "            str: Concatenated context within token limit\n",
- "        \"\"\"\n",
- "        # Retrieve relevant documents\n",
- "        query_embedding = self.embedding_model.encode(query)\n",
- "        distances, indices = self.index.search(\n",
- "            np.array([query_embedding]),\n",
- "            self.top_k_retrieval\n",
- "        )\n",
- "\n",
- "        # Collect whole chunks in ranked order until the token budget is reached\n",
- "        context = []\n",
- "        current_tokens = 0\n",
- "\n",
- "        for idx in (i for i in indices[0] if i >= 0):  # FAISS pads missing hits with -1\n",
- "            doc = self.documents[idx]\n",
- "            doc_tokens = self.count_tokens(doc)\n",
- "\n",
- "            # Check if adding this document would exceed token limit\n",
- "            if current_tokens + doc_tokens <= self.max_context_tokens:\n",
- "                context.append(doc)\n",
- "                current_tokens += doc_tokens\n",
- "            else:\n",
- "                break\n",
- "\n",
- "        return \"\\n\\n\".join(context)\n",
- "\n",
- "    def generate_response(self, query: str) -> str:\n",
- "        \"\"\"\n",
- "        Generate a response using retrieval-augmented generation\n",
- "\n",
- "        Args:\n",
- "            query (str): User's input query\n",
- "\n",
- "        Returns:\n",
- "            str: Generated response incorporating retrieved context\n",
- "        \"\"\"\n",
- "        # Retrieve and truncate context\n",
- "        retrieved_context = self.retrieve_and_truncate_context(query)\n",
- "\n",
- "        # Construct context-aware prompt\n",
- "        full_prompt = f\"\"\"\n",
- "        Context Documents:\n",
- "        {retrieved_context}\n",
- "\n",
- "        Query: {query}\n",
- "\n",
- "        Generate a comprehensive response that:\n",
- "        1. Directly answers the query\n",
- "        2. Incorporates relevant information from the context documents\n",
- "        3. Provides clear and concise information\n",
- "        \"\"\"\n",
- "\n",
- "        # Prepare API arguments\n",
- "        api_kwargs = self.model_client.convert_inputs_to_api_kwargs(\n",
- "            input=full_prompt,\n",
- "            model_kwargs=self.model_kwargs,\n",
- "            model_type=ModelType.LLM\n",
- "        )\n",
- "\n",
- "        # Call API and parse response (parse_chat_completion returns a GeneratorOutput)\n",
- "        response = self.model_client.call(\n",
- "            api_kwargs=api_kwargs,\n",
- "            model_type=ModelType.LLM\n",
- "        )\n",
- "        response_text = self.model_client.parse_chat_completion(response)\n",
- "\n",
- "        return response_text\n"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "`run_rag_pipeline` demonstrates how to use the AdalflowRAGPipeline to handle retrieval-augmented generation. It initializes the pipeline with specified retrieval and context token limits, loads documents from a directory, and processes a list of queries. For each query, the function retrieves relevant context, generates a response using the pipeline, and prints the results."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 7,
- "metadata": {},
- "outputs": [],
- "source": [
- "def run_rag_pipeline(model_client, model_kwargs, documents, queries):\n",
- "\n",
- "    # Example usage of RAG pipeline\n",
- "    rag_pipeline = AdalflowRAGPipeline(\n",
- "        model_client=model_client,\n",
- "        model_kwargs=model_kwargs,\n",
- "        top_k_retrieval=1,  # Retrieve only the single most relevant chunk\n",
- "        max_context_tokens=800  # Limit context to 800 tokens\n",
- "    )\n",
- "\n",
- "    # Add documents from a directory of text files\n",
- "    rag_pipeline.add_documents_from_directory(documents)\n",
- "\n",
- "    # Generate responses\n",
- "    for query in queries:\n",
- "        print(f\"\\nQuery: {query}\")\n",
- "        response = rag_pipeline.generate_response(query)\n",
- "        print(f\"Response: {response}\")\n"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 9,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "\n",
- "Query: What year was the Crystal Cavern discovered?\n",
- "Response: GeneratorOutput(id=None, data=None, error=None, usage=CompletionUsage(completion_tokens=14, prompt_tokens=203, total_tokens=217), raw_response='The Crystal Cavern was discovered in 1987 by divers.', metadata=None)\n",
- "\n",
- "Query: What is the name of the rare tree in Elmsworth?\n",
- "Response: GeneratorOutput(id=None, data=None, error=None, usage=CompletionUsage(completion_tokens=17, prompt_tokens=212, total_tokens=229), raw_response='The rare tree in Elmsworth is known as the \"Moonshade Willow\".', metadata=None)\n",
- "\n",
- "Query: What local legend claim that Lunaflits surrounds?\n",
- "Response: GeneratorOutput(id=None, data=None, error=None, usage=CompletionUsage(completion_tokens=19, prompt_tokens=206, total_tokens=225), raw_response='Local legend claims that Lunaflits are guardians of ancient treasure buried deep within the canyon.', metadata=None)\n",
- "\n",
- "Query: What year was the Crystal Cavern discovered?\n",
- "Response: GeneratorOutput(id=None, data=None, error=None, usage=CompletionUsage(completion_tokens=107, prompt_tokens=184, total_tokens=291), raw_response='The Crystal Cavern was discovered by divers in the year 1987 beneath the icy waters of Lake Aurora. The cavern is known for its shimmering quartz formations that refract sunlight into a spectrum of colors, and it is believed to have served as a sanctuary for an ancient civilization that revered the crystals as conduits to the spirit world. Artifacts recovered from the cavern are carved with intricate symbols, indicating a deep connection to celestial events. However, accessing the cavern is dangerous due to the freezing temperatures and strong currents of the lake.', metadata=None)\n",
- "\n",
- "Query: What is the name of the rare tree in Elmsworth?\n",
- "Response: GeneratorOutput(id=None, data=None, error=None, usage=CompletionUsage(completion_tokens=104, prompt_tokens=193, total_tokens=297), raw_response='The rare tree in Elmsworth is called the \"Moonshade Willow.\" It blooms once every seven years, emitting a soft glow from its blossoms. Villagers believe that meditating under its branches brings vivid dreams of the future. The tree\\'s bark contains a secret resin used in ancient healing rituals. Elders claim that the Moonshade Willow was a gift from a goddess to protect the village. Researchers have found that the tree can only thrive in Elmsworth\\'s unique soil, making it impossible to cultivate elsewhere.', metadata=None)\n",
- "\n",
- "Query: What local legend claim that Lunaflits surrounds?\n",
- "Response: GeneratorOutput(id=None, data=None, error=None, usage=CompletionUsage(completion_tokens=100, prompt_tokens=187, total_tokens=287), raw_response='Local legends claim that Lunaflits, the glowing insects found in the remote desert canyon, are believed to be guardians of ancient treasure buried deep within the canyon. These creatures emit a constant, soothing green light that illuminates the canyon at night, and their rhythmic light pulses form intricate patterns, suggesting a form of communication among them. The ethereal glow created by the Lunaflits and the rare moss reflecting their light have contributed to the mystical reputation of these insects as protectors of hidden riches.', metadata=None)\n"
- ]
- }
- ],
- "source": [
- "# setup_env()\n",
- "\n",
- "documents = '../../tutorials/assets/documents'\n",
- "\n",
- "queries = [\n",
- "    \"What year was the Crystal Cavern discovered?\",\n",
- "    \"What is the name of the rare tree in Elmsworth?\",\n",
- "    \"What local legend claim that Lunaflits surrounds?\"\n",
- "]\n",
- "\n",
- "groq_model_kwargs = {\n",
- "    \"model\": \"llama-3.2-1b-preview\",  # Groq-hosted Llama 3.2 1B preview model\n",
- "    \"temperature\": 0.1,\n",
- "    \"max_tokens\": 800,\n",
- "}\n",
- "\n",
- "openai_model_kwargs = {\n",
- "    \"model\": \"gpt-3.5-turbo\",\n",
- "    \"temperature\": 0.1,\n",
- "    \"max_tokens\": 800,\n",
- "}\n",
- "# The example below shows that adalflow can be used in a generic manner with any API provider,\n",
- "# without changing the prompt or the result parsing\n",
- "run_rag_pipeline(GroqAPIClient(), groq_model_kwargs, documents, queries)\n",
- "run_rag_pipeline(OpenAIClient(), openai_model_kwargs, documents, queries)"
- ]
- }
- ],
- "metadata": {
- "kernelspec": {
- "display_name": ".venv",
- "language": "python",
- "name": "python3"
- },
- "language_info": {
- "codemirror_mode": {
- "name": "ipython",
- "version": 3
- },
- "file_extension": ".py",
- "mimetype": "text/x-python",
- "name": "python",
- "nbconvert_exporter": "python",
- "pygments_lexer": "ipython3",
- "version": "3.12.7"
- }
- },
- "nbformat": 4,
- "nbformat_minor": 2
-}
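The `max_context_tokens` limit used by the notebook above works as a greedy budget: ranked chunks are kept until the next one would overflow. The sketch below restates that rule without external dependencies. `pack_context` and `whitespace_tokens` are hypothetical names introduced here, and the whitespace counter is only a stand-in for the `tiktoken` encoding the pipeline actually uses.

```python
from typing import Callable, List


def pack_context(
    chunks: List[str],
    max_tokens: int,
    count_tokens: Callable[[str], int],
) -> List[str]:
    """Greedily keep ranked chunks until the next one would exceed the budget."""
    selected: List[str] = []
    used = 0
    for chunk in chunks:  # chunks are assumed to be ordered best match first
        cost = count_tokens(chunk)
        if used + cost > max_tokens:
            break  # stop at the first chunk that does not fit
        selected.append(chunk)
        used += cost
    return selected


def whitespace_tokens(text: str) -> int:
    """Stand-in counter (assumption): the pipeline counts tiktoken tokens instead."""
    return len(text.split())


ranked = ["alpha beta gamma", "delta epsilon", "zeta eta theta iota"]
print(pack_context(ranked, max_tokens=5, count_tokens=whitespace_tokens))
# ['alpha beta gamma', 'delta epsilon']
```

Because the loop stops at the first overflow, a smaller chunk that ranks lower is never considered. That keeps the context in relevance order, at the cost of sometimes leaving budget unused.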
diff --git a/notebooks/tutorials/adalflow_rag_vanilla.ipynb b/notebooks/tutorials/adalflow_rag_vanilla.ipynb
deleted file mode 100644
index 34a53174..00000000
--- a/notebooks/tutorials/adalflow_rag_vanilla.ipynb
+++ /dev/null
@@ -1,376 +0,0 @@
-{
- "cells": [
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- ""
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "# 🤗 Welcome to AdalFlow!\n",
- "## The PyTorch library to auto-optimize any LLM task pipelines\n",
- "\n",
- "Thanks for trying us out! We're here to provide you with the best LLM application development experience you can dream of 😊 If you have any questions or concerns, [come talk to us on Discord](https://discord.gg/ezzszrRZvT); we're always here to help! ⭐ Star us on GitHub ⭐\n",
- "\n",
- "\n",
- "# Quick Links\n",
- "\n",
- "GitHub repo: https://github.com/SylphAI-Inc/AdalFlow\n",
- "\n",
- "Full Tutorials: https://adalflow.sylph.ai/index.html#.\n",
- "\n",
- "Deep dive on each API: check out the [developer notes](https://adalflow.sylph.ai/tutorials/index.html).\n",
- "\n",
- "Common use cases along with the auto-optimization: check out [Use cases](https://adalflow.sylph.ai/use_cases/index.html).\n",
- "\n",
- "# Author\n",
- "This notebook was created by community contributor [Ajith](https://github.com/ajithvcoder/).\n",
- "\n",
- "# Outline\n",
- "\n",
- "This is a quick introduction to what AdalFlow is capable of. We will cover:\n",
- "\n",
- "* How to use AdalFlow for RAG\n",
- "\n",
- "AdalFlow can be used in a generic manner with any API provider, without worrying much about prompts,\n",
- "model arguments, or parsing results.\n",
- "\n",
- "**Next: Try our [adalflow-rag-for-documents](https://colab.research.google.com/github.com/SylphAI-Inc/AdalFlow/blob/main/notebooks/tutorials/adalflow_rag_documents.ipynb)**\n",
- "\n",
- "\n",
- "# Installation\n",
- "\n",
- "1. Use `pip` to install the `adalflow` Python package. We will need `openai`, `groq`, and `faiss` (CPU version) from the extra packages.\n",
- "\n",
- " ```bash\n",
- " pip install torch --index-url https://download.pytorch.org/whl/cpu\n",
- " pip install sentence-transformers==3.3.1\n",
- " pip install adalflow[openai,groq,faiss-cpu]\n",
- " ```\n",
- "2. Set the `openai` and `groq` API keys as environment variables."
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### Set Environment Variables\n",
- "\n",
- "Note: Enter your API keys in the cell below."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "Overwriting .env\n"
- ]
- }
- ],
- "source": [
- "%%writefile .env\n",
- "\n",
- "OPENAI_API_KEY=\"PASTE-OPENAI_API_KEY_HERE\"\n",
- "GROQ_API_KEY=\"PASTE-GROQ_API_KEY-HERE\""
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 1,
- "metadata": {},
- "outputs": [],
- "source": [
- "from adalflow.utils import setup_env\n",
- "\n",
- "# Load environment variables. Make sure OPENAI_API_KEY and GROQ_API_KEY are set in the .env file and that .env is in the current folder\n",
- "setup_env(\".env\")"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 2,
- "metadata": {},
- "outputs": [
- {
- "name": "stderr",
- "output_type": "stream",
- "text": [
- "/workspace/ajithdev/AdalFlow/.venv/lib/python3.12/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n",
- " from .autonotebook import tqdm as notebook_tqdm\n"
- ]
- }
- ],
- "source": [
- "import os\n",
- "from typing import List, Dict\n",
- "import numpy as np\n",
- "from sentence_transformers import SentenceTransformer\n",
- "from faiss import IndexFlatL2\n",
- "\n",
- "from adalflow.components.model_client import GroqAPIClient, OpenAIClient\n",
- "from adalflow.core.types import ModelType\n",
- "from adalflow.utils import setup_env"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "`AdalflowRAGPipeline` is a class that implements a Retrieval-Augmented Generation (RAG) pipeline with AdalFlow. It integrates:\n",
- "\n",
- "- Embedding models (e.g., Sentence Transformers) for document and query embeddings.\n",
- "- FAISS for vector similarity search.\n",
- "- An LLM client to generate context-aware responses using retrieved documents."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 3,
- "metadata": {},
- "outputs": [],
- "source": [
- "class AdalflowRAGPipeline:\n",
- "    def __init__(self,\n",
- "                 model_client=None,\n",
- "                 model_kwargs=None,\n",
- "                 embedding_model='all-MiniLM-L6-v2',\n",
- "                 vector_dim=384,\n",
- "                 top_k_retrieval=1):\n",
- "        \"\"\"\n",
- "        Initialize RAG Pipeline with embedding and retrieval components\n",
- "\n",
- "        Args:\n",
- "            embedding_model (str): Sentence transformer model for embeddings\n",
- "            vector_dim (int): Dimension of embedding vectors\n",
- "            top_k_retrieval (int): Number of documents to retrieve\n",
- "        \"\"\"\n",
- "        # Initialize model client for generation\n",
- "        self.model_client = model_client\n",
- "\n",
- "        # Initialize embedding model\n",
- "        self.embedding_model = SentenceTransformer(embedding_model)\n",
- "\n",
- "        # Initialize FAISS index for vector similarity search\n",
- "        self.index = IndexFlatL2(vector_dim)\n",
- "\n",
- "        # Store document texts and their embeddings\n",
- "        self.documents = []\n",
- "        self.document_embeddings = []\n",
- "\n",
- "        # Retrieval parameters\n",
- "        self.top_k_retrieval = top_k_retrieval\n",
- "\n",
- "        # Conversation history and context\n",
- "        self.conversation_history = \"\"\n",
- "        self.model_kwargs = model_kwargs\n",
- "\n",
- "    def add_documents(self, documents: List[str]):\n",
- "        \"\"\"\n",
- "        Add documents to the RAG pipeline's knowledge base\n",
- "\n",
- "        Args:\n",
- "            documents (List[str]): List of document texts to add\n",
- "        \"\"\"\n",
- "        for doc in documents:\n",
- "            # Embed document\n",
- "            embedding = self.embedding_model.encode(doc)\n",
- "\n",
- "            # Add to index and document store\n",
- "            self.index.add(np.array([embedding]))\n",
- "            self.documents.append(doc)\n",
- "            self.document_embeddings.append(embedding)\n",
- "\n",
- "    def retrieve_relevant_docs(self, query: str) -> List[str]:\n",
- "        \"\"\"\n",
- "        Retrieve most relevant documents for a given query\n",
- "\n",
- "        Args:\n",
- "            query (str): Input query to find relevant documents\n",
- "\n",
- "        Returns:\n",
- "            List[str]: Top k most relevant documents\n",
- "        \"\"\"\n",
- "        # Embed query\n",
- "        query_embedding = self.embedding_model.encode(query)\n",
- "\n",
- "        # Perform similarity search\n",
- "        distances, indices = self.index.search(\n",
- "            np.array([query_embedding]),\n",
- "            self.top_k_retrieval\n",
- "        )\n",
- "\n",
- "        # Return top documents, skipping the -1 ids FAISS uses to pad short result lists\n",
- "        return [self.documents[i] for i in indices[0] if i >= 0]\n",
- "\n",
- "    def generate_response(self, query: str):\n",
- "        \"\"\"\n",
- "        Generate a response using retrieval-augmented generation\n",
- "\n",
- "        Args:\n",
- "            query (str): User's input query\n",
- "\n",
- "        Returns:\n",
- "            GeneratorOutput: Parsed model output (the recorded runs carry the text in raw_response)\n",
- "        \"\"\"\n",
- "        # Retrieve relevant documents\n",
- "        retrieved_docs = self.retrieve_relevant_docs(query)\n",
- "\n",
- "        # Construct context-aware prompt\n",
- "        context = \"\\n\\n\".join([f\"Context Document: {doc}\" for doc in retrieved_docs])\n",
- "        full_prompt = f\"\"\"\n",
- "        Context:\n",
- "        {context}\n",
- "\n",
- "        Query: {query}\n",
- "\n",
- "        Generate a comprehensive and informative response that:\n",
- "        1. Uses the provided context documents\n",
- "        2. Directly answers the query\n",
- "        3. Incorporates relevant information from the context\n",
- "        \"\"\"\n",
- "\n",
- "        # Prepare API arguments\n",
- "        api_kwargs = self.model_client.convert_inputs_to_api_kwargs(\n",
- "            input=full_prompt,\n",
- "            model_kwargs=self.model_kwargs,\n",
- "            model_type=ModelType.LLM\n",
- "        )\n",
- "\n",
- "        # Call API and parse response\n",
- "        response = self.model_client.call(\n",
- "            api_kwargs=api_kwargs,\n",
- "            model_type=ModelType.LLM\n",
- "        )\n",
- "        response_text = self.model_client.parse_chat_completion(response)\n",
- "\n",
- "        # Update conversation history\n",
- "        self.conversation_history += f\"\\nQuery: {query}\\nResponse: {response_text}\"\n",
- "\n",
- "        return response_text\n"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "The `run_rag_pipeline` function demonstrates how to use the AdalflowRAGPipeline for embedding documents, retrieving relevant context, and generating responses:"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 4,
- "metadata": {},
- "outputs": [],
- "source": [
- "def run_rag_pipeline(model_client, model_kwargs, documents, queries):\n",
- "    rag_pipeline = AdalflowRAGPipeline(model_client=model_client, model_kwargs=model_kwargs)\n",
- "\n",
- "    rag_pipeline.add_documents(documents)\n",
- "\n",
- "    # Generate responses\n",
- "    for query in queries:\n",
- "        print(f\"\\nQuery: {query}\")\n",
- "        response = rag_pipeline.generate_response(query)\n",
- "        print(f\"Response: {response}\")"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "\n",
- "Query: Does Ajith Kumar has any nick name ?\n",
- "Response: GeneratorOutput(id=None, data=None, error=None, usage=CompletionUsage(completion_tokens=78, prompt_tokens=122, total_tokens=200), raw_response='Based on the provided context documents, Ajith Kumar, also known as Ajithvcoder, has a nickname that he has given himself. According to the context, Ajithvcoder is his nickname that he has chosen for himself.\\n\\nTherefore, the answer to the query is:\\n\\nYes, Ajith Kumar has a nickname that he has given himself, which is Ajithvcoder.', metadata=None)\n",
- "\n",
- "Query: What is the ajithvcoder's favourite food?\n",
- "Response: GeneratorOutput(id=None, data=None, error=None, usage=CompletionUsage(completion_tokens=67, prompt_tokens=109, total_tokens=176), raw_response='Based on the provided context document, I can confidently answer the query as follows:\\n\\nAjithvcoder\\'s favourite food is Hyderabadi Panner Dum Briyani.\\n\\nThis answer is directly supported by the context document, which states: \"ajithvcoder likes Hyderabadi panner dum briyani much.\"', metadata=None)\n",
- "\n",
- "Query: When did ajithvcoder graduated ?\n",
- "Response: GeneratorOutput(id=None, data=None, error=None, usage=CompletionUsage(completion_tokens=57, prompt_tokens=107, total_tokens=164), raw_response=\"Based on the provided context documents, we can determine that Ajith V.Coder graduated on May 2016.\\n\\nHere's a comprehensive and informative response that directly answers the query:\\n\\nAjith V.Coder graduated on May 2016, which is mentioned in the context document.\", metadata=None)\n"
- ]
- }
- ],
- "source": [
- "# setup_env()\n",
- "\n",
- "# Statements about ajithvcoder are included so we can verify that the LLM answers from these lines only\n",
- "documents = [\n",
- "    \"ajithvcoder is a good person whom the world knows as Ajith Kumar, ajithvcoder is his nick name that AjithKumar gave himself\",\n",
- "    \"The Eiffel Tower is a famous landmark in Paris, built in 1889 for the World's Fair.\",\n",
- "    \"ajithvcoder likes Hyderabadi panner dum briyani much.\",\n",
- "    \"The Louvre Museum in Paris is the world's largest art museum, housing thousands of works of art.\",\n",
- "    \"ajithvcoder has a engineering degree and he graduated on May, 2016.\"\n",
- "]\n",
- "\n",
- "# Questions about ajithvcoder are included so we can verify\n",
- "# that the LLM answers from the lines given above only\n",
- "queries = [\n",
- "    \"Does Ajith Kumar has any nick name ?\",\n",
- "    \"What is the ajithvcoder's favourite food?\",\n",
- "    \"When did ajithvcoder graduated ?\"\n",
- "]\n",
- "\n",
- "groq_model_kwargs = {\n",
- "    \"model\": \"llama-3.2-1b-preview\",  # Groq-hosted Llama 3.2 1B preview model\n",
- "    \"temperature\": 0.1,\n",
- "    \"max_tokens\": 800,\n",
- "}\n",
- "\n",
- "openai_model_kwargs = {\n",
- "    \"model\": \"gpt-3.5-turbo\",\n",
- "    \"temperature\": 0.1,\n",
- "    \"max_tokens\": 800,\n",
- "}\n",
- "\n",
- "# The example below shows that adalflow can be used in a generic manner with any API provider,\n",
- "# without changing the prompt or the result parsing\n",
- "model_client = GroqAPIClient()\n",
- "run_rag_pipeline(model_client, groq_model_kwargs, documents, queries)\n",
- "run_rag_pipeline(OpenAIClient(), openai_model_kwargs, documents, queries)\n"
- ]
- }
- ],
- "metadata": {
- "kernelspec": {
- "display_name": ".venv",
- "language": "python",
- "name": "python3"
- },
- "language_info": {
- "codemirror_mode": {
- "name": "ipython",
- "version": 3
- },
- "file_extension": ".py",
- "mimetype": "text/x-python",
- "name": "python",
- "nbconvert_exporter": "python",
- "pygments_lexer": "ipython3",
- "version": "3.12.7"
- }
- },
- "nbformat": 4,
- "nbformat_minor": 2
-}
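The recorded output in the notebook above prints a whole `GeneratorOutput` object, with `data=None` and the answer stored in `raw_response`. A small helper can pull out just the text. The sketch below is an assumption-laden illustration: `FakeGeneratorOutput` is a hypothetical stand-in that mirrors only the fields visible in that output, and preferring `data` over `raw_response` is a design choice made here, not documented AdalFlow behavior.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class FakeGeneratorOutput:
    """Hypothetical stand-in exposing only fields seen in the recorded output."""

    data: Optional[Any] = None
    error: Optional[str] = None
    raw_response: Optional[str] = None


def response_text(output: Any) -> str:
    """Return the generated text, preferring parsed data over the raw response."""
    error = getattr(output, "error", None)
    if error:
        raise RuntimeError(f"generation failed: {error}")
    data = getattr(output, "data", None)
    if data is not None:
        return str(data)
    return getattr(output, "raw_response", None) or ""


out = FakeGeneratorOutput(raw_response="Ajith Kumar's nickname is ajithvcoder.")
print(response_text(out))
```

In `run_rag_pipeline`, printing `response_text(response)` instead of `response` would show only the answer while keeping token usage available on the original object.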
diff --git a/tutorials/adalflow_rag_documents.py b/tutorials/adalflow_rag_documents.py
deleted file mode 100644
index e1807b2d..00000000
--- a/tutorials/adalflow_rag_documents.py
+++ /dev/null
@@ -1,248 +0,0 @@
-import os
-import tiktoken
-from typing import List
-import numpy as np
-from sentence_transformers import SentenceTransformer
-from faiss import IndexFlatL2
-
-from adalflow.components.model_client import GroqAPIClient, OpenAIClient
-from adalflow.core.types import ModelType
-from adalflow.utils import setup_env
-
-"""
-pip install torch --index-url https://download.pytorch.org/whl/cpu
-pip install sentence-transformers==3.3.1
-pip install faiss-cpu==1.9.0.post1
-"""
-
-
-class AdalflowRAGPipeline:
-    def __init__(
-        self,
-        model_client=None,
-        model_kwargs=None,
-        embedding_model="all-MiniLM-L6-v2",
-        vector_dim=384,
-        top_k_retrieval=3,
-        max_context_tokens=800,
-    ):
-        """
-        Initialize RAG Pipeline for handling large text files
-
-        Args:
-            embedding_model (str): Sentence transformer model for embeddings
-            vector_dim (int): Dimension of embedding vectors
-            top_k_retrieval (int): Number of documents to retrieve
-            max_context_tokens (int): Maximum tokens to send to LLM
-        """
-        # Initialize model client for generation
-        self.model_client = model_client
-
-        # Initialize tokenizer for precise token counting
-        self.tokenizer = tiktoken.get_encoding("cl100k_base")
-
-        # Initialize embedding model
-        self.embedding_model = SentenceTransformer(embedding_model)
-
-        # Initialize FAISS index for vector similarity search
-        self.index = IndexFlatL2(vector_dim)
-
-        # Store document texts, embeddings, and metadata
-        self.documents = []
-        self.document_embeddings = []
-        self.document_metadata = []
-
-        # Retrieval and context management parameters
-        self.top_k_retrieval = top_k_retrieval
-        self.max_context_tokens = max_context_tokens
-
-        # Model generation parameters
-        self.model_kwargs = model_kwargs
-
-    def load_text_file(self, file_path: str) -> List[str]:
-        """
-        Load a large text file and split into manageable chunks
-
-        Args:
-            file_path (str): Path to the text file
-
-        Returns:
-            List[str]: List of document chunks
-        """
-        with open(file_path, "r", encoding="utf-8") as file:
-            # Read entire file
-            content = file.read()
-
-        # Split content into chunks (e.g., 10 lines per chunk)
-        lines = content.split("\n")
-        chunks = []
-        chunk_size = 10  # Adjust based on your file structure
-
-        for i in range(0, len(lines), chunk_size):
-            chunk = "\n".join(lines[i : i + chunk_size])
-            chunks.append(chunk)
-
-        return chunks
-
-    def add_documents_from_directory(self, directory_path: str):
-        """
-        Add documents from all text files in a directory
-
-        Args:
-            directory_path (str): Path to directory containing text files
-        """
-        for filename in os.listdir(directory_path):
-            if filename.endswith(".txt"):
-                file_path = os.path.join(directory_path, filename)
-                document_chunks = self.load_text_file(file_path)
-
-                for chunk in document_chunks:
-                    # Embed document chunk
-                    embedding = self.embedding_model.encode(chunk)
-
-                    # Add to index and document store
-                    self.index.add(np.array([embedding]))
-                    self.documents.append(chunk)
-                    self.document_embeddings.append(embedding)
-                    self.document_metadata.append(
-                        {
-                            "filename": filename,
-                            "chunk_index": len(self.document_metadata),
-                        }
-                    )
-
-    def count_tokens(self, text: str) -> int:
-        """
-        Count tokens in a given text
-
-        Args:
-            text (str): Input text
-
-        Returns:
-            int: Number of tokens
-        """
-        return len(self.tokenizer.encode(text))
-
-    def retrieve_and_truncate_context(self, query: str) -> str:
-        """
-        Retrieve relevant documents and truncate to fit token limit
-
-        Args:
-            query (str): Input query
-
-        Returns:
-            str: Concatenated context within token limit
-        """
-        # Retrieve relevant documents
-        query_embedding = self.embedding_model.encode(query)
-        distances, indices = self.index.search(
-            np.array([query_embedding]), self.top_k_retrieval
-        )
-
-        # Collect and truncate context
-        context = []
-        current_tokens = 0
-
-        for idx in indices[0][indices[0] >= 0]:  # drop the -1 ids FAISS uses as padding
-            doc = self.documents[idx]
-            doc_tokens = self.count_tokens(doc)
-
-            # Check if adding this document would exceed token limit
-            if current_tokens + doc_tokens <= self.max_context_tokens:
-                context.append(doc)
-                current_tokens += doc_tokens
-            else:
-                break
-
-        return "\n\n".join(context)
-
-    def generate_response(self, query: str):
-        """
-        Generate a response using retrieval-augmented generation
-
-        Args:
-            query (str): User's input query
-
-        Returns:
-            GeneratorOutput: Parsed model output (the recorded runs carry the text in raw_response)
-        """
-        # Retrieve and truncate context
-        retrieved_context = self.retrieve_and_truncate_context(query)
-
-        # Construct context-aware prompt
-        full_prompt = f"""
-        Context Documents:
-        {retrieved_context}
-
-        Query: {query}
-
-        Generate a comprehensive response that:
-        1. Directly answers the query
-        2. Incorporates relevant information from the context documents
-        3. Provides clear and concise information
-        """
-
-        # Prepare API arguments
-        api_kwargs = self.model_client.convert_inputs_to_api_kwargs(
-            input=full_prompt, model_kwargs=self.model_kwargs, model_type=ModelType.LLM
-        )
-
-        # Call API and parse response
-        response = self.model_client.call(
-            api_kwargs=api_kwargs, model_type=ModelType.LLM
-        )
-        response_text = self.model_client.parse_chat_completion(response)
-
-        return response_text
-
-
-def run_rag_pipeline(model_client, model_kwargs, documents, queries):
-
-    # Example usage of RAG pipeline
-    rag_pipeline = AdalflowRAGPipeline(
-        model_client=model_client,
-        model_kwargs=model_kwargs,
-        top_k_retrieval=2,  # Retrieve the 2 most relevant chunks
-        max_context_tokens=800,  # Limit context to 800 tokens
-    )
-
-    # Add documents from a directory of text files
-    rag_pipeline.add_documents_from_directory(documents)
-
-    # Generate responses
-    for query in queries:
-        print(f"\nQuery: {query}")
-        response = rag_pipeline.generate_response(query)
-        print(f"Response: {response}")
-
-
-def main():
-    setup_env()
-
-    documents = "./tutorials/assets/documents"
-
-    queries = [
-        "What year was the Crystal Cavern discovered?",
-        "What is the name of the rare tree in Elmsworth?",
-        "What local legend claim that Lunaflits surrounds?",
-    ]
-
-    groq_model_kwargs = {
-        "model": "llama-3.2-1b-preview",  # Groq-hosted Llama 3.2 1B preview model
-        "temperature": 0.1,
-        "max_tokens": 800,
-    }
-
-    openai_model_kwargs = {
-        "model": "gpt-3.5-turbo",
-        "temperature": 0.1,
-        "max_tokens": 800,
-    }
-    # The example below shows that adalflow can be used in a generic manner with any API provider,
-    # without changing the prompt or the result parsing
-    run_rag_pipeline(GroqAPIClient(), groq_model_kwargs, documents, queries)
-    run_rag_pipeline(OpenAIClient(), openai_model_kwargs, documents, queries)
-
-
-if __name__ == "__main__":
-    main()
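The line-based chunking in `load_text_file` above can be tried on its own, without the embedding model or FAISS. The following is a minimal sketch under stated assumptions: `chunk_lines` is a hypothetical free-function version of that loop, and skipping whitespace-only chunks is an addition the original method does not make.

```python
from typing import List


def chunk_lines(text: str, chunk_size: int = 10) -> List[str]:
    """Split text into chunks of at most `chunk_size` lines, skipping blank chunks."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    lines = text.split("\n")
    chunks = []
    for start in range(0, len(lines), chunk_size):
        chunk = "\n".join(lines[start:start + chunk_size])
        if chunk.strip():  # assumption: empty chunks are not worth embedding
            chunks.append(chunk)
    return chunks


sample = "\n".join(f"line {n}" for n in range(1, 26))  # 25 lines of text
chunks = chunk_lines(sample, chunk_size=10)
print(len(chunks))                 # 3 chunks: 10, 10 and 5 lines
print(chunks[-1].count("\n") + 1)  # 5 lines in the last chunk
```

With the default of 10 lines per chunk, each six-line file in `tutorials/assets/documents` ends up as a single chunk, so retrieval in the example effectively selects whole documents.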
diff --git a/tutorials/adalflow_rag_vanilla.py b/tutorials/adalflow_rag_vanilla.py
deleted file mode 100644
index 36af7997..00000000
--- a/tutorials/adalflow_rag_vanilla.py
+++ /dev/null
@@ -1,188 +0,0 @@
-from typing import List
-import numpy as np
-from sentence_transformers import SentenceTransformer
-from faiss import IndexFlatL2
-
-from adalflow.components.model_client import GroqAPIClient, OpenAIClient
-from adalflow.core.types import ModelType
-from adalflow.utils import setup_env
-
-"""
-pip install torch --index-url https://download.pytorch.org/whl/cpu
-pip install sentence-transformers==3.3.1
-pip install faiss-cpu==1.9.0.post1
-"""
-
-
-class AdalflowRAGPipeline:
-    def __init__(
-        self,
-        model_client=None,
-        model_kwargs=None,
-        embedding_model="all-MiniLM-L6-v2",
-        vector_dim=384,
-        top_k_retrieval=1,
-    ):
-        """
-        Initialize RAG Pipeline with embedding and retrieval components
-
-        Args:
-            embedding_model (str): Sentence transformer model for embeddings
-            vector_dim (int): Dimension of embedding vectors
-            top_k_retrieval (int): Number of documents to retrieve
-        """
-        # Initialize model client for generation
-        self.model_client = model_client
-
-        # Initialize embedding model
-        self.embedding_model = SentenceTransformer(embedding_model)
-
-        # Initialize FAISS index for vector similarity search
-        self.index = IndexFlatL2(vector_dim)
-
-        # Store document texts and their embeddings
-        self.documents = []
-        self.document_embeddings = []
-
-        # Retrieval parameters
-        self.top_k_retrieval = top_k_retrieval
-
-        # Conversation history and context
-        self.conversation_history = ""
-        self.model_kwargs = model_kwargs
-
-    def add_documents(self, documents: List[str]):
-        """
-        Add documents to the RAG pipeline's knowledge base
-
-        Args:
-            documents (List[str]): List of document texts to add
-        """
-        for doc in documents:
-            # Embed document
-            embedding = self.embedding_model.encode(doc)
-
-            # Add to index and document store
-            self.index.add(np.array([embedding]))
-            self.documents.append(doc)
-            self.document_embeddings.append(embedding)
-
-    def retrieve_relevant_docs(self, query: str) -> List[str]:
-        """
-        Retrieve most relevant documents for a given query
-
-        Args:
-            query (str): Input query to find relevant documents
-
-        Returns:
-            List[str]: Top k most relevant documents
-        """
-        # Embed query
-        query_embedding = self.embedding_model.encode(query)
-
-        # Perform similarity search
-        distances, indices = self.index.search(
-            np.array([query_embedding]), self.top_k_retrieval
-        )
-
-        # Return top documents, skipping the -1 ids FAISS uses to pad short result lists
-        return [self.documents[i] for i in indices[0] if i >= 0]
-
-    def generate_response(self, query: str):
-        """
-        Generate a response using retrieval-augmented generation
-
-        Args:
-            query (str): User's input query
-
-        Returns:
-            GeneratorOutput: Parsed model output (the recorded runs carry the text in raw_response)
-        """
-        # Retrieve relevant documents
-        retrieved_docs = self.retrieve_relevant_docs(query)
-
-        # Construct context-aware prompt
-        context = "\n\n".join([f"Context Document: {doc}" for doc in retrieved_docs])
-        full_prompt = f"""
-        Context:
-        {context}
-
-        Query: {query}
-
-        Generate a comprehensive and informative response that:
-        1. Uses the provided context documents
-        2. Directly answers the query
-        3. Incorporates relevant information from the context
-        """
-
-        # Prepare API arguments
-        api_kwargs = self.model_client.convert_inputs_to_api_kwargs(
-            input=full_prompt, model_kwargs=self.model_kwargs, model_type=ModelType.LLM
-        )
-
-        # Call API and parse response
-        response = self.model_client.call(
-            api_kwargs=api_kwargs, model_type=ModelType.LLM
-        )
-        response_text = self.model_client.parse_chat_completion(response)
-
-        # Update conversation history
-        self.conversation_history += f"\nQuery: {query}\nResponse: {response_text}"
-
-        return response_text
-
-
-def run_rag_pipeline(model_client, model_kwargs, documents, queries):
-    rag_pipeline = AdalflowRAGPipeline(
-        model_client=model_client, model_kwargs=model_kwargs
-    )
-
-    rag_pipeline.add_documents(documents)
-
-    # Generate responses
-    for query in queries:
-        print(f"\nQuery: {query}")
-        response = rag_pipeline.generate_response(query)
-        print(f"Response: {response}")
-
-
-def main():
-    setup_env()
-
-    # Statements about ajithvcoder are included so we can verify that the LLM answers from these lines only
-    documents = [
-        "ajithvcoder is a good person whom the world knows as Ajith Kumar, ajithvcoder is his nick name that AjithKumar gave himself",
-        "The Eiffel Tower is a famous landmark in Paris, built in 1889 for the World's Fair.",
-        "ajithvcoder likes Hyderabadi panner dum briyani much.",
-        "The Louvre Museum in Paris is the world's largest art museum, housing thousands of works of art.",
-        "ajithvcoder has a engineering degree and he graduated on May, 2016.",
-    ]
-
-    # Questions about ajithvcoder are included so we can verify
-    # that the LLM answers from the lines given above only
-    queries = [
-        "Does Ajith Kumar has any nick name ?",
-        "What is the ajithvcoder's favourite food?",
-        "When did ajithvcoder graduated ?",
-    ]
-
-    groq_model_kwargs = {
-        "model": "llama-3.2-1b-preview",  # Groq-hosted Llama 3.2 1B preview model
-        "temperature": 0.1,
-        "max_tokens": 800,
-    }
-
-    openai_model_kwargs = {
-        "model": "gpt-3.5-turbo",
-        "temperature": 0.1,
-        "max_tokens": 800,
-    }
-
-    # The example below shows that adalflow can be used in a generic manner with any API provider,
-    # without changing the prompt or the result parsing
-    run_rag_pipeline(GroqAPIClient(), groq_model_kwargs, documents, queries)
-    run_rag_pipeline(OpenAIClient(), openai_model_kwargs, documents, queries)
-
-
-if __name__ == "__main__":
-    main()
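One edge case in `retrieve_relevant_docs` above deserves a note: when `top_k_retrieval` is larger than the number of indexed vectors, FAISS fills the missing result slots with `-1`, and Python reads list index `-1` as the last document. The sketch below shows a library-free guard. `select_documents` is a hypothetical helper, and the result row is simulated rather than produced by a real FAISS search.

```python
from typing import List, Sequence


def select_documents(documents: Sequence[str], indices: Sequence[int]) -> List[str]:
    """Map search result ids to documents, dropping padding and out-of-range ids."""
    return [documents[i] for i in indices if 0 <= i < len(documents)]


docs = ["first document", "second document"]
# Simulated result row for k=3 over a two-vector index (assumption: not a real FAISS call)
result_row = [1, 0, -1]
print(select_documents(docs, result_row))
# ['second document', 'first document']
```

Without the range check, the `-1` entry would silently return `"second document"` a second time, duplicating context in the prompt.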
diff --git a/tutorials/assets/documents/The Bioluminescent Guardians of the Desert Canyon.txt b/tutorials/assets/documents/The Bioluminescent Guardians of the Desert Canyon.txt
deleted file mode 100644
index 324f4b41..00000000
--- a/tutorials/assets/documents/The Bioluminescent Guardians of the Desert Canyon.txt
+++ /dev/null
@@ -1,6 +0,0 @@
-In a remote desert canyon, scientists discovered a colony of glowing insects called "Lunaflits." These
-creatures produce bioluminescence to attract mates and ward off predators. Unlike fireflies, Lunaflits
-emit a constant, soothing green light that illuminates the canyon at night. The canyon walls are covered
-with a rare moss that reflects their light, creating an ethereal glow. Researchers have found that Lunaflits
-communicate through rhythmic light pulses, forming intricate patterns. Local legends claim these insects
-are guardians of ancient treasure buried deep within the canyon.
diff --git a/tutorials/assets/documents/The Enigmatic Crystal Cavern of Lake Aurora.txt b/tutorials/assets/documents/The Enigmatic Crystal Cavern of Lake Aurora.txt
deleted file mode 100644
index 0b4929ab..00000000
--- a/tutorials/assets/documents/The Enigmatic Crystal Cavern of Lake Aurora.txt
+++ /dev/null
@@ -1,6 +0,0 @@
-Hidden beneath the icy waters of Lake Aurora lies the Crystal Cavern, a natural wonder discovered by divers
-in 1987. The cavern is adorned with shimmering quartz formations that refract sunlight into a spectrum of
-colors. It is said that the cavern once served as a sanctuary for an ancient civilization that revered the
-crystals as conduits to the spirit world. Explorers have recovered artifacts carved with intricate symbols,
-suggesting a deep connection to celestial events. However, accessing the cavern is perilous due to the lake's
-freezing temperatures and strong currents.
diff --git a/tutorials/assets/documents/The Legend of the Moonshade Willow.txt b/tutorials/assets/documents/The Legend of the Moonshade Willow.txt
deleted file mode 100644
index 36e342b3..00000000
--- a/tutorials/assets/documents/The Legend of the Moonshade Willow.txt
+++ /dev/null
@@ -1,6 +0,0 @@
-In the mystical village of Elmsworth, a rare tree known as the "Moonshade Willow" blooms once every seven
-years. Its blossoms emit a soft glow, and villagers believe that meditating under its branches brings vivid
-dreams of the future. The tree's bark is said to contain a secret resin used in ancient healing rituals. Elders
-claim the Moonshade Willow was a gift from a goddess to protect the village. Despite its sacred status,
-researchers have discovered that the tree thrives only in Elmsworth's unique soil, making it impossible to
-cultivate elsewhere.