forked from gulcin/pgvector-rag-app
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
The app now uses aidb instead of pgvector
1 parent b1cbbff, commit ead5a06
Showing 8 changed files with 85 additions and 118 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,18 +1,18 @@ | ||
# pgvector-rag | ||
An application to demonstrate how can you make a RAG using pgvector and PostgreSQL | ||
# aidb-rag | ||
An application to demonstrate how can you make a RAG using EDB's aidb and PostgreSQL | ||
|
||
## Requirements | ||
- Python3 | ||
- PostgreSQL | ||
- pgvector | ||
- aidb | ||
|
||
## Install | ||
|
||
Clone the repository | ||
|
||
``` | ||
git clone git@github.com:gulcin/pgvector-rag.git | ||
cd pgvector-rag | ||
git clone git@github.com:gulcin/aidb-rag-app.git | ||
cd aidb-rag-app | ||
``` | ||
|
||
Install Dependencies | ||
|
@@ -31,10 +31,14 @@ cp .env-example .env | |
|
||
## Run | ||
|
||
First run your `aidb` extension by following the step by step installation guide: https://www.enterprisedb.com/docs/edb-postgres-ai/ai-ml/install-tech-preview/ | ||
|
||
Make sure your aidb extension is ready to accept connections. Then you can continue as follows: | ||
|
||
``` | ||
python app.py --help | ||
usage: app.py [-h] {create-db,import-data,chat} ... | ||
usage: app.py [-h] {create-db,import-data,chat} {data_source} | ||
Application Description | ||
|
@@ -49,34 +53,3 @@ Subcommands: | |
chat Use chat feature | ||
``` | ||
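Review note: the updated usage line suggests that the subcommands now take a positional `data_source` argument. The parser code itself is not part of the visible hunks, so the following is only a minimal sketch of how such a CLI could be wired with `argparse`. The function name `build_parser`, the help strings for `create-db` and `import-data`, and the choice to attach `data_source` only to `import-data` are assumptions, not the author's implementation.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the three subcommands named in the README."""
    parser = argparse.ArgumentParser(
        prog="app.py", description="Application Description"
    )
    # dest="command" stores the chosen subcommand name on the namespace.
    subparsers = parser.add_subparsers(
        title="Subcommands", dest="command", required=True
    )

    # Help text below is hypothetical; only the "chat" text appears in the README.
    subparsers.add_parser("create-db", help="Create the database objects")

    import_parser = subparsers.add_parser("import-data", help="Import a PDF file")
    # Assumption: data_source is a path to the PDF, read as args.data_source.
    import_parser.add_argument("data_source", help="Path to the PDF to import")

    subparsers.add_parser("chat", help="Use chat feature")
    return parser


# Example: parse an explicit argument list instead of sys.argv.
args = build_parser().parse_args(["import-data", "report.pdf"])
```

Parsing an explicit list, as in the last line, makes the parser easy to exercise without touching the real command line.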
|
||
## Run UI | ||
|
||
We use Streamlit for creating a simple Graphical User Interface for our pgvector-rag app. | ||
|
||
To be able to run Streamlit please do the following: | ||
|
||
``` | ||
pip install streamlit | ||
``` | ||
|
||
**Add keys/secrets to Streamlit secrets** | ||
|
||
If you need to store secrets that Streamlit app will use, you can do this by creating | ||
`.streamlit/secrets.toml` file under Streamlit directory and adding lines like following: | ||
|
||
``` | ||
# .streamlit/secrets.toml | ||
OPENAI_API_KEY = "YOUR_API_KEY" | ||
``` | ||
**Run Streamlit app for generating UI** | ||
|
||
``` | ||
streamlit run chatgptui.py | ||
``` | ||
You can create as many apps you'd like and place them under Streamlit directory, | ||
edit the keys if needed and run them like described above. | ||
|
||
|
||
|
||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,30 +1,22 @@ | ||
import numpy as np | ||
|
||
from db import get_connection | ||
from embedding import generate_embeddings, read_pdf_file | ||
|
||
|
||
def import_data(args, model, device, tokenizer): | ||
def import_data(args): | ||
data = read_pdf_file(args.data_source) | ||
|
||
embeddings = [ | ||
generate_embeddings(tokenizer=tokenizer, model=model, device=device, text=line) | ||
for line in data | ||
] | ||
|
||
conn = get_connection() | ||
cursor = conn.cursor() | ||
|
||
# Store each embedding in the database | ||
for i, (doc_fragment, embedding) in enumerate(embeddings): | ||
for i, (doc_fragment) in enumerate(data): | ||
cursor.execute( | ||
"INSERT INTO embeddings (id, doc_fragment, embeddings) VALUES (%s, %s, %s)", | ||
(i, doc_fragment, embedding[0]), | ||
"INSERT INTO documents (id, doc_fragment) VALUES (%s, %s)", | ||
(i, doc_fragment), | ||
) | ||
conn.commit() | ||
|
||
generate_embeddings() | ||
print( | ||
"import-data command executed. Data source: {}".format( | ||
args.data_source | ||
) | ||
) | ||
|
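Review note: after this change, `import_data` only stores the raw fragments and leaves embedding generation to a separate `generate_embeddings()` call. The insert step could also be batched. The sketch below is one possible shape, assuming a DB-API connection such as the one returned by `get_connection()`; the helper names `build_rows` and `insert_fragments` are hypothetical, and committing once after the batch is a design choice of this sketch, not something the diff confirms.

```python
from typing import Iterable, List, Tuple

# Same statement as in the diff; %s is the psycopg2 placeholder style.
INSERT_SQL = "INSERT INTO documents (id, doc_fragment) VALUES (%s, %s)"


def build_rows(fragments: Iterable[str]) -> List[Tuple[int, str]]:
    """Pair each fragment with its position, mirroring enumerate(data)."""
    return [(i, fragment) for i, fragment in enumerate(fragments)]


def insert_fragments(conn, fragments: Iterable[str]) -> int:
    """Insert all fragments in one batch and commit once.

    `conn` is assumed to be a DB-API connection (for example psycopg2).
    Returns the number of rows sent to the database.
    """
    rows = build_rows(fragments)
    cursor = conn.cursor()
    try:
        cursor.executemany(INSERT_SQL, rows)
        conn.commit()
    finally:
        # Close the cursor even if the insert fails.
        cursor.close()
    return len(rows)
```

Inside `import_data`, this could be called as `insert_fragments(get_connection(), read_pdf_file(args.data_source))` before `generate_embeddings()` runs.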
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -3,7 +3,6 @@ psycopg2 | |
transformers | ||
torch | ||
black | ||
pgvector | ||
PyPDF2 | ||
bitsandbytes | ||
accelerate |