Using Vertex Matching Engine to solve vector similarity search problems

muratuysal/vector-similarity-w-vertex-ai

 
 

vector-similarity-w-vertex-ai

An end-to-end visual similarity solution using Vertex Matching Engine.

TODOs:

  • (1) Multi-modal features e.g., image and text (product title, description) for embeddings

Getting value out of unstructured data with embeddings

  • Neural deep retrieval (NDR) is a popular technique for representing the relationships between multiple entities, and indexing these entities for efficient retrieval

  • With applications across a variety of use cases such as multimodal search, item matching, ad targeting, customer segmentation, recommendations, and more, it's a valuable capability many organizations have prioritized.

See why-ann-index.md for a refresher on NDR and ANN indexes


Repo Objectives

  1. Use a pretrained deep learning model to extract feature vectors (embeddings) from each image in a retail product catalog

  2. Store embedding vectors in a scalable approximate nearest neighbor (ANN) index, e.g., Vertex Matching Engine, where each image's embedding vectors are indexed by product ID

  3. For a given query image, call model.predict(x) with the same pretrained model used in (1) to extract the feature vectors (embeddings) from the query image

  4. Using the computed feature vectors from (3), query the Matching Engine Index to find the k nearest neighbors
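The query step in (4) can be illustrated with an exact, brute-force search. This is only a minimal sketch of what an ANN index approximates, not the Matching Engine API: the function name `top_k_neighbors`, the toy 2-dimensional vectors, the `sku-*` IDs, and the choice of cosine similarity are all assumptions made for illustration.

```python
import numpy as np

def top_k_neighbors(query, index_vectors, ids, k=3):
    """Exact cosine-similarity search; an ANN index approximates this result."""
    # Normalize so that the dot product equals cosine similarity.
    q = query / np.linalg.norm(query)
    m = index_vectors / np.linalg.norm(index_vectors, axis=1, keepdims=True)
    scores = m @ q
    # Sort by descending similarity and keep the k best matches.
    order = np.argsort(-scores)[:k]
    return [(ids[i], float(scores[i])) for i in order]

# Toy catalog: hypothetical product IDs with 2-dimensional "embeddings".
catalog_ids = ["sku-1", "sku-2", "sku-3", "sku-4"]
catalog = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]])

neighbors = top_k_neighbors(np.array([1.0, 0.05]), catalog, catalog_ids, k=2)
print(neighbors)
```

Brute-force search compares the query against every stored vector, so its cost grows linearly with catalog size. An ANN index trades a small amount of recall for much faster lookups on large catalogs.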


Deployment pipelines

Load / pre-process images

  • Decode the image file
  • Reshape per model specs
  • Convert tensor to float & add axis for expected model input (e.g., 1 x 224 x 224 x 3)
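The pre-processing steps above can be sketched with NumPy alone. This sketch assumes the image has already been decoded into an H x W x 3 `uint8` array (for example by TensorFlow or Pillow), uses a simple nearest-neighbor resize, and scales pixel values to [0, 1]. The function name `preprocess` and the [0, 1] scaling are assumptions; check the expected input range of the model you actually use.

```python
import numpy as np

def preprocess(image_uint8, size=224):
    """Resize an H x W x 3 uint8 image and return a 1 x size x size x 3 float32 batch."""
    h, w, _ = image_uint8.shape
    # Nearest-neighbor resize: pick one source row/column per target row/column.
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    resized = image_uint8[rows][:, cols]
    # Convert to float and scale to [0, 1] (assumed model input range).
    scaled = resized.astype(np.float32) / 255.0
    # Add the leading batch axis expected by the model.
    return scaled[np.newaxis, ...]

# Random pixels stand in for a decoded 300 x 400 product image.
decoded = np.random.default_rng(0).integers(0, 256, size=(300, 400, 3), dtype=np.uint8)
batch = preprocess(decoded)
print(batch.shape, batch.dtype)
```

In a TensorFlow pipeline the same steps would typically be done with the library's own decode and resize ops, which also offer better interpolation than the nearest-neighbor shortcut used here.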

Extract Feature Vectors

  • Load pre-trained image model (TF Hub)
  • Loop through images & calculate feature vectors (embeddings)
  • Save vectors to file in Cloud Storage
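Saving the vectors could look like the sketch below, which serializes one JSON record per line, keyed by product ID. The field names `id` and `embedding` follow the JSON input format for Matching Engine as best recalled and should be verified against the current documentation. The helper name `to_jsonl` and the sample IDs are hypothetical, and uploading the result to Cloud Storage is not shown.

```python
import json

def to_jsonl(embeddings_by_id):
    """Serialize {product_id: vector} into newline-delimited JSON records."""
    lines = []
    for product_id, vector in embeddings_by_id.items():
        # Field names are an assumption about the index input format.
        record = {"id": str(product_id), "embedding": [float(v) for v in vector]}
        lines.append(json.dumps(record))
    return "\n".join(lines) + "\n"

# Hypothetical product IDs with tiny 3-dimensional embeddings.
sample = {"sku-1": [0.1, 0.2, 0.3], "sku-2": [0.4, 0.5, 0.6]}
payload = to_jsonl(sample)
print(payload)
```

Writing one record per line keeps the file streamable, so a large catalog can be processed in batches without holding every vector in memory at once.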

Build Matching Engine Index

Feature Extraction

  • Pre-trained models trained on larger datasets can be a good starting point.
  • If the original dataset is large and general enough, the spatial hierarchy of features learned by the pretrained network can effectively act as a generic model of the visual world.
  • These features can be useful even if the image classes are completely different between the original and target datasets.
  • Feature extraction consists of taking the convolutional base of a previously trained network, running new data through it, and training a new classifier on top of the output.
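The idea of a frozen base plus a new classifier can be sketched without a deep learning framework. Everything here is a stand-in chosen for illustration: a fixed random projection with ReLU plays the role of the pretrained convolutional base, and a logistic-regression head is trained on its output. A real pipeline would load a pretrained model (e.g., from TF Hub) instead. The names `frozen_base` and `train_classifier` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a pretrained convolutional base: a fixed random
# projection followed by ReLU. Its weights are never updated below.
BASE_WEIGHTS = rng.normal(size=(64, 16)) / np.sqrt(64)

def frozen_base(inputs):
    """Map raw inputs to feature vectors using the frozen base."""
    return np.maximum(inputs @ BASE_WEIGHTS, 0.0)

def train_classifier(features, labels, learning_rate=0.1, epochs=200):
    """Fit a logistic-regression head on top of precomputed features."""
    weights = np.zeros(features.shape[1])
    bias = 0.0
    losses = []
    for _ in range(epochs):
        probs = 1.0 / (1.0 + np.exp(-(features @ weights + bias)))
        loss = -np.mean(labels * np.log(probs + 1e-9)
                        + (1 - labels) * np.log(1 - probs + 1e-9))
        losses.append(float(loss))
        # Gradient step on the head only; the base stays frozen.
        error = probs - labels
        weights -= learning_rate * features.T @ error / len(labels)
        bias -= learning_rate * error.mean()
    return weights, bias, losses

# Synthetic data: 200 samples with a binary label tied to the first input.
inputs = rng.normal(size=(200, 64))
labels = (inputs[:, 0] > 0).astype(float)

features = frozen_base(inputs)          # run new data through the base once
weights, bias, losses = train_classifier(features, labels)
```

Because the base is frozen, features are computed once and reused across every training epoch, which is what makes feature extraction cheap compared with fine-tuning the whole network. The same precomputed features are what this repo stores as embeddings.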
