onnxInference: Inference with ONNX Models for AI Applications

Getting to understand efficient inference with ONNX models for practical applications and pipelines

  • onnxHelpers/onnxBenchmark.py = script to convert PyTorch models to ONNX, quantize fp32 ONNX models to int8, and run benchmark inference on AMD Ryzen AI processors

Custom AI Recall Pipeline:

  • Implemented a custom AI Recall feature, similar to the Microsoft Windows AI Recall feature, running locally: the Phi-3 Vision model describes/analyses screenshots, and the Phi-3 Mini model renames the screenshots based on the image descriptions generated by the vision model.

  • The filenames and descriptions (after chunking) are stored in a simple database for Retrieval-Augmented Generation (RAG). Given a user query, the descriptions most similar to the query, along with the associated screenshot filenames, are retrieved. The Phi-3 models have been tested on the CPU.

  • Example Run 1:

    (screenshot: airecall_1)

  • Best Result:

    (screenshot: best_result)

  • Once the descriptions are added to the database, subsequent retrievals are quick (the test screenshots and database are saved in results/aiRecall/snapshots; these screenshots are not very diverse)

    (screenshot: ai_recall_2)

Stable Diffusion Pipeline:

  • Scripts to run a Stable Diffusion pipeline, currently running on DirectML-supported devices

