Recommendation Engine

The primary objective of this recommendation engine is to demonstrate semantic search using the aidb extension, showing how easy the extension is to implement and how it abstracts complexity without compromising functionality.

[Image: Sample Chat Console Output]

Catalog Using PostgreSQL & aidb Extension

The objective of this experiment is to leverage the CLIP model in conjunction with PostgreSQL, using the aidb extension to execute transformation and semantic search automatically within the database. The setup uses a dataset of catalog images and also supports reverse image search.

aidb is the EDB Postgres AI database extension and must be installed. If it is not installed, follow the step-by-step installation guide at: https://www.enterprisedb.com/docs/edb-postgres-ai/ai-ml/install-tech-preview/

Instead of storing the images directly in the database, we store them in a public S3 Bucket. The actual data stored in the database consists of image embeddings, which are generated by the CLIP model and encapsulated in 512-dimensional vectors as required by the model. This approach enables rapid search capabilities on a standard laptop.
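To make the idea of searching over embeddings concrete, here is a minimal sketch of ranking catalog items by similarity to a query vector. It is only an illustration: the function names, the file names, and the toy 4-dimensional vectors are hypothetical (real CLIP embeddings here have 512 dimensions), and cosine similarity is assumed as the metric. In the actual demo, this work is done inside the database by the aidb extension.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def top_k(query, catalog, k=3):
    """Return the k (name, score) pairs most similar to the query."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in catalog.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy data: 4-dimensional stand-ins for 512-dimensional embeddings.
catalog = {
    "red_shoes.jpg": [0.9, 0.1, 0.0, 0.1],
    "black_dress.jpg": [0.0, 0.8, 0.6, 0.0],
    "red_sandals.jpg": [0.8, 0.2, 0.1, 0.2],
}
query = [1.0, 0.0, 0.0, 0.1]

for name, score in top_k(query, catalog, k=2):
    print(f"{name}: {score:.3f}")
```

Because only fixed-size vectors are compared, the images themselves never need to be loaded at query time, which is what keeps the search fast on modest hardware.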

The demo covers not only reverse image search but also text-to-image search, that is, searching the catalog with text as input.

Sample Dataset

Download the dataset from https://www.kaggle.com/datasets/paramaggarwal/fashion-product-images-small/download?datasetVersionNumber=1 and unzip it into a folder such as dataset/images

Install requirements

Run pip install from the EDB Python directory: pip install -r requirements.txt

Python environment: the Python environment accessible to PostgreSQL must have the necessary libraries installed.

Run

You can run the aidb recommendation app by running the Python script below, which initializes the database and loads the data. Alternatively, you can run the aidb queries directly in the PostgreSQL terminal.

To run as a Python script

The images must be stored in an S3 bucket before you run the Python script. The S3 endpoint is optional; leave it blank if the S3 bucket is not public. Pass the retriever name and the S3 bucket name as arguments, as shown below:

% python code/connect_encode.py retriever_name s3_bucket_name s3_endpoint

Example for a public S3 bucket:

% python code/connect_encode.py recommendation_engine public-ai-team http://s3.eu-central-1.amazonaws.com
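For illustration, a script could read these three positional arguments roughly as follows. This is a sketch under assumptions, not the actual contents of code/connect_encode.py: the parse_args helper is hypothetical, and it treats a missing or empty s3_endpoint as "no endpoint".

```python
import argparse


def parse_args(argv):
    """Parse retriever name, bucket name, and an optional S3 endpoint."""
    parser = argparse.ArgumentParser(
        description="Initialize the database and load the image data."
    )
    parser.add_argument("retriever_name")
    parser.add_argument("s3_bucket_name")
    # nargs="?" makes the endpoint optional; it defaults to an empty string.
    parser.add_argument("s3_endpoint", nargs="?", default="")
    args = parser.parse_args(argv)
    # Normalize an omitted or empty ('') endpoint to None.
    args.s3_endpoint = args.s3_endpoint.strip() or None
    return args


example = parse_args(
    ["recommendation_engine", "public-ai-team", "http://s3.eu-central-1.amazonaws.com"]
)
print(example.retriever_name, example.s3_bucket_name, example.s3_endpoint)
```

In a real script the list passed to parse_args would be sys.argv[1:].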

Example queries to run within the PostgreSQL terminal

postgres=# SELECT aidb.create_s3_retriever(
                'recommendation_engine',
                'public', 
                'clip-vit-base-patch32',
                'img',
                'public-ai-team',
                '',
                'http://s3.eu-central-1.amazonaws.com'
            );

postgres=# SELECT aidb.refresh_retriever('recommendation_engine');
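If you prefer to issue the same two statements from Python, one possible approach is to build a parameterized statement and hand it to a database driver. The sketch below only builds the statement text and its parameters; build_aidb_call and the allow-list are hypothetical helpers, and the %s placeholder style is the one accepted by drivers such as psycopg2. The argument values are copied in order from the queries above, without assuming what each position means.

```python
# Only these two function names (taken from the queries above) are accepted.
ALLOWED_FUNCTIONS = {"create_s3_retriever", "refresh_retriever"}


def build_aidb_call(function_name, *args):
    """Return (sql, params) for a call such as aidb.refresh_retriever(...)."""
    if function_name not in ALLOWED_FUNCTIONS:
        raise ValueError(f"unexpected function name: {function_name!r}")
    # One placeholder per argument; values are passed separately, never inlined.
    placeholders = ", ".join(["%s"] * len(args))
    return f"SELECT aidb.{function_name}({placeholders});", args


create_sql, create_params = build_aidb_call(
    "create_s3_retriever",
    "recommendation_engine",
    "public",
    "clip-vit-base-patch32",
    "img",
    "public-ai-team",
    "",
    "http://s3.eu-central-1.amazonaws.com",
)
refresh_sql, refresh_params = build_aidb_call("refresh_retriever", "recommendation_engine")

print(create_sql, create_params)
print(refresh_sql, refresh_params)
```

With a DB-API cursor, each pair would then be executed as cursor.execute(sql, params).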

Similarity search using the Streamlit application: catalog search and free-text search on the catalog.

Update the database connection with the required port, username, and password in the create_db_connection function and the DATABASE_URL variable.
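As a rough guide to what a DATABASE_URL value looks like, the sketch below assembles a PostgreSQL connection URL from its parts. The build_database_url helper and the sample credentials are hypothetical; the actual variable in the app may be formatted differently, so adapt it to what you find in the code.

```python
from urllib.parse import quote


def build_database_url(user, password, host, port, dbname):
    """Assemble a postgresql:// URL, percent-encoding the credentials."""
    # safe="" also encodes characters such as '@', ':' and '/' that would
    # otherwise be misread as URL delimiters.
    user_enc = quote(user, safe="")
    password_enc = quote(password, safe="")
    return f"postgresql://{user_enc}:{password_enc}@{host}:{port}/{dbname}"


# Placeholder values only; replace with your own settings.
DATABASE_URL = build_database_url("postgres", "p@ss", "localhost", 5432, "postgres")
print(DATABASE_URL)
```

Percent-encoding matters mainly when the password contains special characters; a plain alphanumeric password passes through unchanged.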

To run with aidb, use the command below. s3_endpoint is optional, but Streamlit doesn't natively support command-line arguments in the same way as typical Python scripts, so enter a single-quoted empty string '' if the S3 bucket is not public. Otherwise, run the script as in the example below:

% streamlit run code/app_search_aidb.py retriever_name s3_bucket_name s3_endpoint

Example for a public S3 bucket:

% streamlit run code/app_search_aidb.py recommendation_engine public-ai-team http://s3.eu-central-1.amazonaws.com

Example search texts: red shoes, red women shoes, black dress, ...