A Python web app that searches for images matching a given text
First, create a virtual environment to keep package installation local to this directory
python3 -m venv venv
Enable it. The command here is for a normal Unix shell; there are other activation scripts for other shells (for instance, the fish shell)
source venv/bin/activate
Install the Python packages we need
python3 -m pip install -r requirements.txt
Note: Sometimes we've seen the Python clip.load function fail to download the CLIP model, presumably due to the source server being busy. The code here will use a local copy of the model if it's available. To make that local copy:
mkdir models
curl https://openaipublic.azureedge.net/clip/models/40d365715913c9da98579312b702a82c18be219cc2a73407c4526f58eba950af/ViT-B-32.pt --output models/ViT-B-32.pt
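The local-copy fallback described in the note can be sketched roughly as follows. This is a minimal illustration, not necessarily how the scripts in this repository do it: the helper names choose_model_source and load_clip_model are hypothetical, and it assumes the model file is where the curl command above puts it. It relies on clip.load accepting either a model name (which triggers a download) or a path to a checkpoint file.

```python
from pathlib import Path

# Assumption: the local copy lives where the curl command above saves it.
LOCAL_MODEL = Path("models") / "ViT-B-32.pt"
# The model name that clip.load would otherwise download.
REMOTE_MODEL_NAME = "ViT-B/32"


def choose_model_source(local_model: Path = LOCAL_MODEL) -> str:
    """Return the local checkpoint path if it exists, else the model name."""
    if local_model.is_file():
        return str(local_model)
    return REMOTE_MODEL_NAME


def load_clip_model(device: str = "cpu"):
    """Load CLIP, preferring the local checkpoint over a download.

    Returns the (model, preprocess) pair that clip.load produces.
    """
    import clip  # imported lazily so choose_model_source works without it

    return clip.load(choose_model_source(), device=device)
```

With the file in place, load_clip_model() never needs to contact the download server.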
Create your PostgreSQL® database. An Aiven for PostgreSQL service will do very well - see the Create a service section in the Aiven documentation.
Copy the template environment file
cp .env_example .env
Then edit the .env file to insert the credentials needed to connect to the database.
Note: If you're using an Aiven for PostgreSQL service, use the Service URI value from the service Overview in the Aiven console. The resulting line in .env should look something like:
PG_SERVICE_URI=postgres://<user>:<password>@<host>:<port>/defaultdb?sslmode=require
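If a connection fails, it can help to check that the URI has the expected shape. The following is a small sketch using only the Python standard library; the function name summarise_service_uri is hypothetical, and it assumes the value has already been exported into the process environment as PG_SERVICE_URI (it does not read the .env file itself).

```python
import os
from urllib.parse import parse_qs, urlsplit


def summarise_service_uri(uri: str) -> dict:
    """Split a PostgreSQL service URI into parts, without exposing the password."""
    parts = urlsplit(uri)
    query = parse_qs(parts.query)
    return {
        "scheme": parts.scheme,
        "user": parts.username,
        "host": parts.hostname,
        "port": parts.port,
        "database": parts.path.lstrip("/"),
        "sslmode": query.get("sslmode", [None])[0],
        "has_password": parts.password is not None,
    }


if __name__ == "__main__":
    uri = os.environ.get("PG_SERVICE_URI")
    if uri is None:
        print("PG_SERVICE_URI is not set")
    else:
        print(summarise_service_uri(uri))
```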
Enable pgvector and set up the table we need in the database
./create_table.py
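As a rough idea of what this step involves, here is a sketch of the SQL such a script might run. The table name pictures and its column names are assumptions made for illustration, not taken from create_table.py. The 512 is the size of the embeddings the ViT-B/32 CLIP model produces. The conn argument is assumed to be a DB-API style connection, such as one from psycopg.

```python
# Assumption: ViT-B/32 embeddings, which have 512 dimensions.
EMBEDDING_DIMENSIONS = 512

# Hypothetical table and column names, for illustration only.
SETUP_STATEMENTS = [
    "CREATE EXTENSION IF NOT EXISTS vector",
    "CREATE TABLE IF NOT EXISTS pictures ("
    "filename text PRIMARY KEY, "
    f"embedding vector({EMBEDDING_DIMENSIONS}))",
]


def set_up_database(conn) -> None:
    """Enable pgvector and create the table, using a DB-API style connection."""
    cur = conn.cursor()
    for statement in SETUP_STATEMENTS:
        cur.execute(statement)
    cur.close()
    conn.commit()
```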
Calculate the embeddings for the pictures in the photos directory, and upload them to the database
./process_images.py
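The general shape of this step can be sketched as below. This is an illustration under assumptions rather than the script's actual code: the function names, the pictures table and the choice to normalise the embedding are all hypothetical. It assumes model and preprocess are the pair returned by clip.load, and it passes the embedding to PostgreSQL as pgvector's text form, for instance [0.1,0.2,0.3].

```python
from typing import Iterable, List

# Hypothetical table and column names, for illustration only.
INSERT_SQL = "INSERT INTO pictures (filename, embedding) VALUES (%s, %s::vector)"


def to_pgvector_literal(values: Iterable[float]) -> str:
    """Format numbers in the text form pgvector accepts, e.g. [1.0,0.5]."""
    return "[" + ",".join(str(float(v)) for v in values) + "]"


def embed_image(path, model, preprocess, device: str = "cpu") -> List[float]:
    """Calculate a CLIP embedding for one image file."""
    import torch  # imported lazily so the helper above works without it
    from PIL import Image

    image = preprocess(Image.open(path)).unsqueeze(0).to(device)
    with torch.no_grad():
        features = model.encode_image(image)
    # Assumption: normalise to unit length, which suits cosine comparisons.
    features = features / features.norm(dim=-1, keepdim=True)
    return features[0].tolist()


def upload_embedding(conn, filename: str, embedding: Iterable[float]) -> None:
    """Store one filename and its embedding via a DB-API style connection."""
    cur = conn.cursor()
    cur.execute(INSERT_SQL, (filename, to_pgvector_literal(embedding)))
    cur.close()
    conn.commit()
```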
You can run find_images.py to check that everything is working. It looks for images matching the text "man jumping" and reports their filenames
./find_images.py
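A text search of this kind can be sketched as follows. Again this is an illustration under assumptions, not the script itself: the function names, the pictures table and the default of four results are hypothetical. It uses pgvector's <=> operator, which orders rows by cosine distance, and assumes model is the CLIP model returned by clip.load.

```python
from typing import Iterable, List

# Hypothetical table and column names; <=> is pgvector's cosine distance.
SEARCH_SQL = (
    "SELECT filename FROM pictures "
    "ORDER BY embedding <=> %s::vector LIMIT %s"
)


def embed_text(text: str, model, device: str = "cpu") -> List[float]:
    """Calculate a CLIP embedding for a search phrase."""
    import clip  # imported lazily so find_images works without it
    import torch

    tokens = clip.tokenize([text]).to(device)
    with torch.no_grad():
        features = model.encode_text(tokens)
    return features[0].tolist()


def find_images(conn, embedding: Iterable[float], limit: int = 4) -> List[str]:
    """Return the filenames whose embeddings are closest to the given one."""
    literal = "[" + ",".join(str(float(v)) for v in embedding) + "]"
    cur = conn.cursor()
    cur.execute(SEARCH_SQL, (literal, limit))
    rows = cur.fetchall()
    cur.close()
    return [row[0] for row in rows]
```

For example, find_images(conn, embed_text("man jumping", model)) would return the closest filenames under these assumptions.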
Run the web app locally using fastapi
fastapi dev app.py
Go to http://127.0.0.1:8000 in a web browser, and request a search.
Possible ideas include:
- cat
- man jumping
- outer space
The images in the photos directory are the same as those used in Workshop: Searching for images with vector search - OpenSearch and CLIP model. They came from Unsplash and have been reduced in size to fit within GitHub's file size limits for a repository.
- The Workshop: Searching for images with vector search - OpenSearch and CLIP model, which does (essentially) the same thing, but using OpenSearch and Jupyter notebooks.
- Building a movie recommendation system with Tensorflow and PGVector, which searches text and produces a web app using JavaScript.
For help understanding how to use HTMX:
- Using HTMX with FastAPI
- For help understanding how I wanted to use forms, Updating Other Content from the HTMX documentation (I went for option 1, as suggested).
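As I read it, option 1 in that HTMX page is "expand the target": the form and the content it updates sit inside one element, and the server's response replaces that whole element. A minimal sketch of the idea follows. The function name, the /search and /photos paths and the search_text field name are all hypothetical and not taken from app.py. A FastAPI endpoint could return this string as an HTML response.

```python
from html import escape
from typing import Sequence


def render_search_area(search_text: str = "", filenames: Sequence[str] = ()) -> str:
    """Return the form plus its results as one replaceable HTML fragment."""
    results = "".join(
        f'<li><img src="/photos/{escape(name)}" alt="{escape(name)}"></li>'
        for name in filenames
    )
    return (
        '<div id="search-area">'
        # The form targets its own enclosing div, so each response
        # replaces both the form and the list of results.
        '<form hx-post="/search" hx-target="#search-area" hx-swap="outerHTML">'
        f'<input type="text" name="search_text" value="{escape(search_text)}">'
        '<button type="submit">Search</button>'
        "</form>"
        f"<ul>{results}</ul>"
        "</div>"
    )
```

Because the response includes the form itself, the search text the user typed can be echoed back in the input field without any extra JavaScript.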