Commit

Merge pull request #56 from SylphAI-Inc/li
Li
liyin2015 authored Jun 29, 2024
2 parents 6d53843 + 75ab0ea commit b7fa9ec
Showing 5 changed files with 158 additions and 87 deletions.
44 changes: 41 additions & 3 deletions developer_notes/generator.ipynb
@@ -74,10 +74,48 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": []
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"GeneratorOutput(data='LightRAG is a light-based Real-time Anomaly Generator, which is a special type of anomaly detection system. It uses a combination of visual and statistical techniques to detect unusual patterns or outliers in a dataset in real-time, often for purposes such as identifying security threats, detecting fraud, or monitoring system performance. Would you like to know more about its applications or how it works?', error=None, usage=None, raw_response='LightRAG is a light-based Real-time Anomaly Generator, which is a special type of anomaly detection system. It uses a combination of visual and statistical techniques to detect unusual patterns or outliers in a dataset in real-time, often for purposes such as identifying security threats, detecting fraud, or monitoring system performance. Would you like to know more about its applications or how it works?')\n"
]
}
],
"source": [
"from lightrag.core import Component, Generator, Prompt\n",
"from lightrag.components.model_client import GroqAPIClient\n",
"from lightrag.utils import setup_env\n",
"\n",
"\n",
"class SimpleQA(Component):\n",
" def __init__(self):\n",
" super().__init__()\n",
" template = r\"\"\"<SYS>\n",
" You are a helpful assistant.\n",
" </SYS>\n",
" User: {{input_str}}\n",
" You:\n",
" \"\"\"\n",
" self.generator = Generator(\n",
" model_client=GroqAPIClient(), model_kwargs={\"model\": \"llama3-8b-8192\"}, template=template\n",
" )\n",
"\n",
" def call(self, query):\n",
" return self.generator({\"input_str\": query})\n",
"\n",
" async def acall(self, query):\n",
" return await self.generator.acall({\"input_str\": query})\n",
"\n",
"\n",
"qa = SimpleQA()\n",
"answer = qa(\"What is LightRAG?\")\n",
"\n",
"print(answer)"
]
}
],
"metadata": {
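The cell output above shows that calling ``qa`` returns a ``GeneratorOutput`` object rather than a plain string. Below is a minimal, illustrative sketch (not part of this commit) of how its fields could be inspected, assuming only the attributes visible in the printed output: ``data``, ``error``, ``usage``, and ``raw_response``.

# Illustrative only: `answer` is the GeneratorOutput printed in the cell above.
if answer.error is None:
    print(answer.data)  # the generated text
else:
    print("generation failed:", answer.error)
print(answer.raw_response)  # the raw text returned by the model client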
30 changes: 30 additions & 0 deletions developer_notes/generator_note.py
@@ -0,0 +1,30 @@
from lightrag.core import Component, Generator
from lightrag.components.model_client import GroqAPIClient
from lightrag.utils import setup_env # noqa


class SimpleQA(Component):
def __init__(self):
super().__init__()
template = r"""<SYS>
You are a helpful assistant.
</SYS>
User: {{input_str}}
You:
"""
self.generator = Generator(
model_client=GroqAPIClient(),
model_kwargs={"model": "llama3-8b-8192"},
template=template,
)

def call(self, query):
return self.generator({"input_str": query})

async def acall(self, query):
return await self.generator.acall({"input_str": query})


qa = SimpleQA()
answer = qa("What is LightRAG?")
print(qa)
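The committed script only exercises the synchronous ``call`` path. Below is a minimal, illustrative sketch (not part of this commit) of how the ``acall`` method defined above could be driven, using only the standard-library ``asyncio`` and the ``SimpleQA`` class from this file.

import asyncio

# Illustrative only: reuses the SimpleQA component defined in generator_note.py above.
async def main():
    qa = SimpleQA()
    answer = await qa.acall("What is LightRAG?")  # async counterpart of qa(query)
    print(answer)

asyncio.run(main())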
39 changes: 21 additions & 18 deletions docs/source/developer_notes/index.rst
@@ -9,9 +9,11 @@ Developer Notes
Learn the LightRAG design philosophy and the `why` and `how-to` (customize and integrate) behind each core part of the LightRAG library.
These are our tutorials before you move ahead to build use cases (LLM applications) end to end.

.. note::
.. raw::
You can read this interchangeably with :ref:`Use Cases <use_cases>`.
.. note::
You can read this interchangeably with :ref:`Use Cases <use_cases>`.
@@ -26,26 +28,26 @@ This is our tutorials before you move ahead to build use cases (LLM application
.. :width: 200px
The LightRAG library focuses on providing building blocks for developers to **build** and **optimize** the `task pipeline`.
We have a clear design philosophy:
We have a clear design philosophy: :doc:`lightrag_design_philosophy`.



.. toctree::
:maxdepth: 1
:caption: Introduction
.. :maxdepth: 1
.. :hidden:
lightrag_design_philosophy
.. lightrag_design_philosophy
llm_intro
.. llm_intro
Building
=============================
Building
-------------------
Base classes
---------------
~~~~~~~~~~~~~~~~~~~~~~
Code path: ``lightrag.core``.

.. list-table::
@@ -73,9 +75,10 @@
base_data_class

RAG Essentials
-------------------
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
RAG components
~~~~~~~~~~~~~~~~~~~~~~
^^^^^^^^^^^^^^^^^^^


Code path: ``lightrag.core``. For abstract classes:

@@ -100,8 +103,8 @@ Code path: ``lightrag.core``. For abstract classes:
* - :doc:`retriever`
- The base class for all retrievers, which retrieve relevant documents from a given database to add **context** to the generator.

Data, Processing, and storage
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Data Pipeline and Storage
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Data processing includes transformers, pipelines, and storage. Code path: ``lightrag.components.data_process``, ``lightrag.core.db``, and ``lightrag.database``.
Components work on a sequence of ``Document`` and return a sequence of ``Document``.
@@ -139,7 +142,7 @@ Components work on a sequence of ``Document`` and return a sequence of ``Documen


Agent Essentials
-----------------------------
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The agent in ``components.agent`` pairs an LLM with reasoning, planning, and tool use so it can interact with environments and accomplish tasks.

.. list-table::
@@ -186,7 +189,7 @@ Agent in ``components.agent`` is LLM great with reasoning, planning, and using t
Optimizing
=============================
-------------------

Datasets and Evaluation

@@ -213,7 +216,7 @@ Optimizer & Trainer


Logging & Tracing & Configurations
=============================
------------------------------------
Code path: ``lightrag.utils``.

.. list-table::
130 changes: 65 additions & 65 deletions docs/source/index.rst
@@ -1,17 +1,13 @@
.. LightRAG documentation master file, created by
sphinx-quickstart on Thu May 9 15:45:29 2024.
You can adapt this file completely to your liking, but it should at least
contain the root `toctree` directive.
=======================
LightRAG Home
Introduction
=======================


LightRAG is the "PyTorch" library for building large langage model(LLM) applications. It is super light, modular and robust like "PyTorch", and offers essential components for `Retriever`-`Agent`-`Generator` (RAG).
LightRAG is the "PyTorch" library for building large langage model(LLM) applications. We help developers on both building and optimimizing `Retriever`-`Agent`-`Generator` (RAG) pipelines.
It is light, modular, and robust.

The coding experience is similar to PyTorch. Here is a side-by-side comparison of writing a PyTorch module and a LightRAG component:

.. grid:: 2
.. grid:: 1
:gutter: 1

.. grid-item-card:: PyTorch
@@ -43,84 +39,90 @@ You have a similar coding experience as PyTorch. Here is a side to side comparis

.. code-block:: python
from core.component import Component, Generator
from components.model_client import OpenAIClient
from lightrag.core import Component, Generator
from lightrag.components.model_client import GroqAPIClient
from lightrag.utils import setup_env
class SimpleQA(Component):
def __init__(self):
super().__init__()
template = r"""<SYS>
You are a helpful assistant.
</SYS>
User: {{input_str}}
You:
"""
self.generator = Generator(
model_client=OpenAIClient(),
model_kwargs={'model': 'gpt-3.5-turbo'}
model_client=GroqAPIClient(),
model_kwargs={"model": "llama3-8b-8192"},
template=template,
)
def call(self, query):
return self.generator.call({'input_str': query})
return self.generator({"input_str": query})
async def acall(self, query):
return await self.generator.acall({'input_str': query})
qa = SimpleQA()
print(qa)
**Why LightRAG?**


1. **Clarity and Simplicity**

We understand that developers building real-world Large Language Model (LLM) applications are the real heroes. Just like AI researchers and engineers who build models on top of PyTorch, developers require **Maximum Flexibility and Customizability**: Each developer has unique data needs to build their own models/components, experiment with In-context Learning (ICL) or model finetuning, and deploy the LLM applications to production. This means the library must provide fundamental lower-level building blocks and strive for clarity and simplicity:
return await self.generator.acall({"input_str": query})
- We maintain no more than two levels of subclasses.
- Each core abstract class is designed to be robust and flexible.
- We use 10X less code than other libraries to achieve 10X more robustness and flexibility.
qa = SimpleQA()
answer = qa("What is LightRAG?")
2. **Control and Transparency**

Coming from a deep AI research background, we understand that the more control and transparency developers have over their prompts, the better. By default:

- LightRAG simplifies what developers need to send to LLM proprietary APIs to just two messages each time: a `system message` and a `user message`. This minimizes reliance on and manipulation by API providers.
- LightRAG provides advanced tooling for developers to build `agents`, `tools/function calls`, etc., without relying on any proprietary API provider's 'advanced' features such as `OpenAI` assistant, tools, and JSON format.

3. **Suited for Both Researchers and Production Engineers**
On top of ease of use, we particularly optimize the configurability of components for researchers to build their solutions and to benchmark existing solutions.
Like PyTorch, which has united researchers and production teams, it enables a smooth transition from research to production.
With researchers building on LightRAG, production engineers can easily take over the method and test and iterate on their production data.
Researchers will want their code to be adapted into more products too.
Why LightRAG?


Clarity and Simplicity
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

**LightRAG vs other LLM libraries:**

We understand that developers building real-world Large Language Model (LLM) applications are the real heroes. Just like AI researchers and engineers who build models on top of PyTorch, developers require **Maximum Flexibility and Customizability**: Each developer has unique data needs to build their own models/components, experiment with In-context Learning (ICL) or model finetuning, and deploy the LLM applications to production. This means the library must provide fundamental lower-level building blocks and strive for clarity and simplicity:

**The LightRAG library is structured as follows:**
- We maintain no more than two levels of subclasses.
- Each core abstract class is designed to be robust and flexible.
- We use 10X less code than other libraries to achieve 10X more robustness and flexibility.

#TODO: One diagram to make people understand lightrag faster
Here is the printed out structure of ``qa``:

* `core` - Base abstractions, core functions, and core components like `Generator` and `Embedder` to support more advanced components.
* `components` - Components that are built on top of `core`. Users will install the relevant dependencies on their own for some components.
.. code-block::
SimpleQA(
(generator): Generator(
model_kwargs={'model': 'llama3-8b-8192'},
(prompt): Prompt(
template: <SYS>
You are a helpful assistant.
</SYS>
User: {{input_str}}
You:
, prompt_variables: ['input_str']
)
(model_client): GroqAPIClient()
)
)
**LightRAG documentation is divided into two parts:**
Control and Transparency
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* **Developer Documentation**: This documentation explains how LightRAG is designed in more depth and is especially useful for developers who want to contribute to LightRAG.
Coming from a deep AI research background, we understand that the more control and transparency developers have over their prompts, the better. By default:

* **User Documentation**: This documentation is for users who want to use LightRAG to build their applications.
- LightRAG simplifies what developers need to send to LLM proprietary APIs to just two messages each time: a `system message` and a `user message`. This minimizes reliance on and manipulation by API providers.
- LightRAG provides advanced tooling for developers to build `agents`, `tools/function calls`, etc., without relying on any proprietary API provider's 'advanced' features such as `OpenAI` assistant, tools, and JSON format.

We encourage all users to at least skim through the developer documentation. Unlike "PyTorch", where a normal user does not have to customize a building block for a neural network,
LLM applications have a much bigger scope and vary even more across product environments, so it is much more common for developers to customize components on their own.
It is the future of LLM applications
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

On top of ease of use, we particularly optimize the configurability of components for researchers to build their solutions and to benchmark existing solutions.
Like PyTorch, which has united researchers and production teams, it enables a smooth transition from research to production.
With researchers building on LightRAG, production engineers can easily take over the method and test and iterate on their production data.
Researchers will want their code to be adapted into more products too.


.. toctree::
:glob:
:maxdepth: 1
:caption: New Users

:hidden:

get_started/index

@@ -130,35 +132,32 @@ LLM applications have much bigger scope and varies even more to different produc
.. toctree::
:glob:
:maxdepth: 1
:caption: Tutorials - How each part works
:hidden:

developer_notes/index




.. :caption: Tutorials - How each part works
.. :hidden:


.. toctree::
:maxdepth: 1
:caption: Use Cases - How different parts are used to build various LLM applications
:hidden:

tutorials/index


.. toctree::
:glob:
:maxdepth: 1
:caption: API Reference
:hidden:

apis/index

.. todo::

.. toctree::
:maxdepth: 1
:caption: Benchmarks
.. :caption: Benchmarks

Manually add documents for the code in benchmarks
.. Manually add documents for the code in benchmarks


.. :glob:
Expand All @@ -172,5 +171,6 @@ LLM applications have much bigger scope and varies even more to different produc
:glob:
:maxdepth: 1
:caption: For Contributors
:hidden:

contributor/index
