1 parent
af86c1a
commit 722e523
Showing 17 changed files with 1,286 additions and 42 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,25 +1,54 @@ | ||
# RAG knowledge base script | ||
# Knowledge Base Ingestion | ||
|
||
The RAG (Retrieval-Augmented Generation) Knowledge Base Script is designed to automate the process of ingesting documents into a knowledge base, converting them into a vectorized format, and storing this information on the Galadriel chain. This script simplifies the process of creating a decentralized, blockchain-based knowledge base that can be queried and utilized by AI models for enhanced information retrieval and decision-making processes. | ||
|
||
## Features | ||
|
||
- **Document Ingestion:** Automatically ingest multiple documents from a specified directory. | ||
- **Vectorization:** Convert textual information into a vector format suitable for AI models. | ||
- **Blockchain Integration:** Seamlessly integrates with the Galadriel L1 chain, leveraging its Oracle system for decentralized storage and retrieval. | ||
- **IPFS Support:** Uses IPFS (InterPlanetary File System) for secure and distributed document storage. | ||
|
||
## Prerequisites | ||
|
||
Before you begin, ensure you have the following: | ||
- Python 3.11 or later installed on your system. | ||
- A funded wallet and its corresponding private key. | ||
- An API key for `nft.storage` to facilitate document uploading to IPFS. You can obtain this key by registering at [nft.storage](https://nft.storage). | ||
|
||
## Setup | ||
|
||
```shell | ||
python -m venv venv | ||
source venv/bin/activate | ||
pip install -r requirements.txt | ||
``` | ||
|
||
## How to use | ||
|
||
```shell | ||
% python add_knowledge_base.py -d data | ||
[Loading 5 files from data.] | ||
Processing Files: 100%|███████████████████████| 5/5 [00:02<00:00, 1.68file/s] | ||
Generated 5 documents from 5 files. | ||
Uploading documents, please wait... | ||
Uploaded collection to IPFS. | ||
Requesting indexing, please wait... | ||
Waiting for indexing to complete... | ||
Collection indexed, index CID bafkreihgkrl7udanqbvkcig46xgp3dzj2mo5qidyvxmj7gga4vt5mdcu5m. | ||
Use CID `bafkreic2ft2wzozti3kpyilyjk4f5peirzdia7phmvicva6bbdwtjkx5ny` in your contract to query the indexed collection. | ||
``` | ||
To set up your environment for running the RAG Knowledge Base Script, follow these steps: | ||
|
||
1. Clone the repository to your local machine. | ||
2. Create a virtual environment for Python dependencies: | ||
```shell | ||
python -m venv venv | ||
source venv/bin/activate | ||
``` | ||
3. Install the required Python packages: | ||
```shell | ||
pip install -r requirements.txt | ||
``` | ||
4. Create a `.env` file in the root directory of the script and add your `nft.storage` API key and wallet private key as follows: | ||
```plaintext | ||
PRIVATE_KEY=your_wallet_private_key | ||
NFT_STORAGE_API_KEY=your_api_key_here | ||
``` | ||
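As a quick sanity check before running the script, the `.env` file from step 4 can be validated with a few lines of Python. This is a minimal sketch, not part of `add_knowledge_base.py`: the helper names `parse_env` and `missing_keys` are hypothetical, and the parser assumes plain `KEY=value` lines without quoting.

```python
from pathlib import Path

# Keys named in the setup instructions above.
REQUIRED_KEYS = ("PRIVATE_KEY", "NFT_STORAGE_API_KEY")


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(values: dict[str, str]) -> list[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]


if __name__ == "__main__":
    env_path = Path(".env")
    values = parse_env(env_path.read_text()) if env_path.exists() else {}
    missing = missing_keys(values)
    print("Missing keys:", missing if missing else "none")
```

A value containing a space, such as `your_wallet_private key`, would still parse here, so the check only confirms that both keys are present and non-empty.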
|
||
## How to Use | ||
|
||
1. Place your document files in a designated directory. The script can process multiple files in a batch. | ||
2. Run the script with the necessary arguments. For example, to ingest documents from the `galadriel_docs` directory, set a chunk size of 1500, and specify an oracle fee of 200: | ||
``` | ||
% python add_knowledge_base.py -d galadriel_docs -s 1500 -o 200 | ||
[Loading 13 files from galadriel_docs.] | ||
Processing Files: 100%|███████████████████████████████████████████████████████████████████████████████████████| 13/13 [00:02<00:00, 4.70file/s] | ||
Generated 47 documents from 13 files. | ||
Uploading documents to IPFS, please wait...done. | ||
Requesting indexing, please wait...done. | ||
Knowledge base indexed, index CID `bafybeib36x56l7hgu4k47msj3bpxf4rfwlyu4xwt4m3jptlvo5litaar4q`. | ||
Use CID `bafkreifrqfc7apfnvd2legjygs2woutrfartm7dabmdvdfs7fy2veehhli` in your contract to query the indexed knowledge base. | ||
``` | ||
3. Follow the command-line instructions as the script processes the documents, uploads them to IPFS, and requests indexing on the Galadriel L1 chain. | ||
4. Upon completion, the script will provide you with a CID (Content Identifier) that can be used in your smart contracts or applications to query the indexed collection. |
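
The sample output above reports more documents than files (47 from 13), which is what a chunk size produces: each file is split into several pieces. The sketch below illustrates the idea only. It assumes the chunk size counts characters and that chunks do not overlap; the source does not say how `add_knowledge_base.py` actually splits text, and `chunk_text` is a hypothetical helper.

```python
def chunk_text(text: str, chunk_size: int) -> list[str]:
    """Split text into consecutive, non-overlapping chunks of at most chunk_size characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


# Two made-up files: one longer than the chunk size, one shorter.
files = {"a.md": "x" * 4000, "b.md": "y" * 1200}

# Assumed unit: characters, mirroring the `-s 1500` example.
documents = [chunk for body in files.values() for chunk in chunk_text(body, 1500)]
print(f"Generated {len(documents)} documents from {len(files)} files.")
```

With these inputs the 4000-character file yields three chunks and the 1200-character file yields one, so a smaller chunk size gives more, shorter documents to index.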
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,26 @@ | ||
# What is Galadriel? | ||
|
||
Galadriel is the first L1 for AI. | ||
|
||
**Ethereum enabled writing smart contracts to build dApps. Similarly, Galadriel enables developers to build AI apps & agents like smart contracts -- decentralized and on-chain.** We support a range of AI usage: from simple LLM features in existing dApps to highly capable AI agents like on-chain AI hedge funds, in-game AI NPCs and AI-generated NFTs. | ||
|
||
Galadriel is built on a parallel-execution EVM stack, which enables high throughput and low latency while providing a familiar experience to Solidity developers. It brings AI inference on-chain in a low-cost, low-latency manner through teeML (Trusted Execution Environment Machine Learning), which allows querying open- and closed-source LLMs in a verifiable way. | ||
|
||
A purpose-built L1 is necessary to enable decentralized AI apps that leverage both the native capabilities of AI and Web3 rails. Here's a high-level overview of the Galadriel L1 stack: | ||
|
||
<img src="/images/stack.jpg"/> | ||
|
||
|
||
Note that not all of the functionality in the stack is live yet, as Galadriel is in an early devnet stage. | ||
|
||
# What problem does Galadriel solve? | ||
|
||
**Today, all AI applications you use are centralized. You lack ownership and don't have any say in their governance.** | ||
|
||
First, as a developer building AI apps, the only option is to deploy them on a centralized stack. During OpenAI's Dev Day in November 2023, many developers felt rug-pulled as OpenAI bundled their products into Custom GPTs and the Assistants API. Using any centralized AI provider carries platform risk: the developer competes against the platform and risks having their app shut down or taxed into oblivion. In 2015, Ethereum gave developers a way to build without this platform risk. Galadriel aims to do the same for AI apps, so that developers are the masters of their own destiny. | ||
|
||
Second, while AI is becoming the biggest wealth creator in history, if it remains owned by just a few Web2 monopolies, as it is today, the most likely outcome is extreme wealth disparity because of their extractive economic model. It is already happening: OpenAI's last valuation was $80B, but all of the gains have flowed to fewer than 1,000 people, namely the team and investors. AI should be owned by everyone, not by a few large companies. | ||
|
||
Third, the fate of 8 billion people is in the hands of fewer than 100 people: the boards of OpenAI, Microsoft, Google, and a few other companies. Would you trust your life, or your family's, to Sam Altman? This is the broken safety model we have today: these companies say we can trust them to govern superintelligence once they invent it. Meanwhile, the man who promises to bring the world safe AGI can't even govern his own company, as the November 2023 board upheaval showed. History has repeatedly shown that security through obscurity never works. Web3 offers the safest infrastructure humanity has invented so far; the proof is in the pudding, as Bitcoin secures $1.3T and Ethereum $478B+ as of March 2024. Galadriel aims to bring large-scale deployment of AI onto Web3 rails, where it can be governed in a transparent, democratic way. | ||
|
||
**Galadriel's mission is to enable safe, citizen-owned AI by bringing devs the Ethereum of AI.** |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
export const oracleAddress = '0xe75D9bED71F98595aF51afd291703a7D4f35FB04'; | ||
export const nonEnclaveOracleAddress = '0xe75D9bED71F98595aF51afd291703a7D4f35FB04'; | ||
|
||
export const chatgptAddress = '0xcA734B83956Ac2cF835a1d243e08A2157d72FF9e'; | ||
export const dalleAddress = '0x9FA41Fcde39DaE56b0ABc211C5C5cF0D356459E8'; | ||
export const vitailikAddress = '0x9CC0b34c727f60b80dBB8550B7D53Fb21497596B'; | ||
export const agentAddress = '0xB3220DcFd5e6f951633aCcE3a40974F870680846'; |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
Currently the only block explorer on Galadriel is [explorer.galadriel.com](https://explorer.galadriel.com), provided by the Galadriel team. | ||
|
||
<Frame> | ||
<img src="/images/block-explorer.png"/> | ||
</Frame> | ||
|
||
Note: the explorer may be a few minutes behind the chain, and has been reported to occasionally not show contracts that have in fact been deployed. We are working to improve this. | ||
|
||
The block explorer allows you to do the following and more. | ||
|
||
- **View transactions**: See individual transactions included in each block. | ||
- **Explore blocks**: Browse and inspect individual blocks, including details like block height, size, and timestamp. | ||
- **Search**: Search for specific transactions, blocks, addresses, or tokens using their unique identifiers. | ||
- **See network statistics**: Access real-time and historical data on the network. | ||
- **Decode raw data**: Interpret raw transaction or block data. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,23 @@ | ||
To make any transactions on our devnet, you need devnet tokens. These are used only for the purpose of development and have no real value. | ||
|
||
Currently, the only way to receive them is via Discord. This prevents abuse and makes sure every developer has access to devnet tokens. | ||
|
||
The faucet currently dispenses 1 GAL, allowing for about 150-200 calls to the oracle. | ||
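
For budgeting, the figures above imply a rough per-call cost. The snippet below simply divides the faucet amount by the quoted call range; it is back-of-the-envelope arithmetic on the numbers in this page, not an official fee schedule, and actual fees may vary per call.

```python
# Figures quoted above: 1 GAL covers roughly 150-200 oracle calls.
FAUCET_AMOUNT_GAL = 1.0
CALLS_LOW, CALLS_HIGH = 150, 200

# Fewer calls per GAL means each call costs more, and vice versa.
cost_high = FAUCET_AMOUNT_GAL / CALLS_LOW
cost_low = FAUCET_AMOUNT_GAL / CALLS_HIGH

print(f"Approximate cost per oracle call: {cost_low:.4f} to {cost_high:.4f} GAL")
```

That works out to roughly 0.005 to 0.0067 GAL per oracle call.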
|
||
### To get devnet tokens: | ||
|
||
<Steps> | ||
<Step title="Join Discord"> | ||
Join our [Discord server](https://discord.com/invite/bHnFgSTKrP) and go to the `#devnet-faucet` channel. | ||
</Step> | ||
<Step title="Post request"> | ||
Post a message in the `#devnet-faucet` channel following this exact format, replacing `<address>` with your EVM wallet address. | ||
|
||
``` | ||
!faucet <address> | ||
``` | ||
</Step> | ||
</Steps> | ||
|
||
In a few seconds, you should receive the devnet tokens in your wallet. | ||
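
Because the faucet expects an exact message format, it can help to check the address before posting. The following is a minimal sketch with hypothetical helper names (`is_evm_address`, `faucet_message`); it only checks the general shape of an EVM address (`0x` followed by 40 hexadecimal characters) and does not verify an EIP-55 checksum or how the Discord bot itself validates input.

```python
import re

# 0x prefix followed by exactly 40 hex characters (20 bytes).
EVM_ADDRESS_RE = re.compile(r"0x[0-9a-fA-F]{40}")


def is_evm_address(address: str) -> bool:
    """Return True if the string has the shape of an EVM address (format check only)."""
    return EVM_ADDRESS_RE.fullmatch(address) is not None


def faucet_message(address: str) -> str:
    """Build the faucet request in the format shown above."""
    if not is_evm_address(address):
        raise ValueError(f"not a valid EVM address: {address!r}")
    return f"!faucet {address}"


if __name__ == "__main__":
    # Placeholder address for illustration only.
    print(faucet_message("0x" + "ab" * 20))
```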
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
Here's a list of the main features supported by the Galadriel oracle. We are constantly adding to this list, and suggestions are welcome; please join our Discord to discuss! | ||
|
||
### LLMs | ||
|
||
| Provider | Description | | ||
| ------------------------------------------------- | ----------------------------------------------------- | | ||
| [OpenAI](https://platform.openai.com/docs/models) | Models supported: GPT-4-turbo, GPT-3.5-turbo. | | ||
| [Groq](https://console.groq.com/docs/models) | Models supported: Llama2-70B, Mixtral-8x7B, Gemma-7B. | | ||
|
||
|
||
### Tools | ||
|
||
| Tool | Description | | ||
| ---------------- | ------------------------------------------------------------------------- | | ||
| Image generation | Generate images from text, using [DALL-E 3](https://openai.com/dall-e-3). | | ||
| Google search | Search the web using Google, via SERP API. | | ||
|
||
|
||
|