BLIP is a pre-training framework for unified vision-language understanding and generation that achieves state-of-the-art results on a wide range of vision-language tasks. This tutorial demonstrates how to use BLIP for visual question answering and image captioning.
The complete pipeline of this demo is shown below:
The following image shows an example of the input image and generated caption:
The following image shows an example of the input image, question, and answer generated by the model.

This folder contains two notebooks that show how to convert and optimize the model with OpenVINO:
The first notebook consists of the following parts:
- Instantiate a BLIP model.
- Convert the BLIP model to OpenVINO IR.
- Run visual question answering and image captioning with OpenVINO (a conversion and inference sketch follows this list).
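As a rough illustration of these steps, the sketch below loads a BLIP VQA checkpoint from Hugging Face, converts its vision encoder to OpenVINO IR, and runs it on a sample image. The checkpoint name, the image path, and the choice to convert only the vision encoder are assumptions made for brevity; the notebook itself converts the full set of BLIP submodels (vision model, text encoder, and text decoder).

```python
import openvino as ov
from PIL import Image
from transformers import BlipProcessor, BlipForQuestionAnswering

# Load the PyTorch BLIP model and its processor
# (checkpoint name is an assumption; the notebook may use a different one).
processor = BlipProcessor.from_pretrained("Salesforce/blip-vqa-base")
model = BlipForQuestionAnswering.from_pretrained("Salesforce/blip-vqa-base")
model.eval()

# Prepare example inputs for tracing ("demo.jpg" is a placeholder image path).
image = Image.open("demo.jpg").convert("RGB")
inputs = processor(images=image, text="What is in the picture?", return_tensors="pt")

# Convert the vision encoder to OpenVINO IR and save it to disk.
vision_ir = ov.convert_model(model.vision_model, example_input=inputs["pixel_values"])
ov.save_model(vision_ir, "blip_vision_model.xml")

# Compile and run the converted vision encoder to obtain image embeddings.
core = ov.Core()
compiled_vision = core.compile_model(vision_ir, "CPU")
image_embeds = compiled_vision(inputs["pixel_values"].numpy())[0]
print(image_embeds.shape)
```

The text encoder and text decoder are converted in the same way; the image embeddings produced here are then fed to them to generate answers or captions.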
The second notebook consists of the following parts:
- Download and preprocess the dataset for quantization.
- Quantize the converted vision and text encoder OpenVINO models from the first notebook with NNCF (see the sketch after this list).
- Compress the weights of the OpenVINO text decoder model from the first notebook with NNCF.
- Check the optimized model results using the same input data as in the first notebook.
- Compare the file sizes of the converted and optimized models.
- Compare the performance of the converted and optimized models.
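The sketch below illustrates the NNCF workflow under simplified assumptions: it quantizes the converted vision encoder with a small random calibration set and applies weight-only compression to the text decoder. The IR file names are assumptions (they depend on what the first notebook saved), and a real run should calibrate on the preprocessed dataset rather than random tensors; the text encoder is quantized the same way using its own text inputs.

```python
import numpy as np
import nncf
import openvino as ov

core = ov.Core()

# Load FP32 IR files produced by the first notebook (file names are assumptions).
vision_model = core.read_model("blip_vision_model.xml")
text_decoder = core.read_model("blip_text_decoder_with_past.xml")

# Build a calibration dataset. Random tensors are used here only to keep the
# sketch self-contained; the notebook feeds preprocessed images from a real dataset.
# The 1x3x384x384 shape is an assumption and must match the converted model's input.
calibration_samples = [
    np.random.rand(1, 3, 384, 384).astype(np.float32) for _ in range(10)
]
calibration_dataset = nncf.Dataset(calibration_samples, lambda sample: sample)

# 8-bit post-training quantization of the vision encoder.
quantized_vision = nncf.quantize(vision_model, calibration_dataset)
ov.save_model(quantized_vision, "blip_vision_model_int8.xml")

# Weight-only compression of the text decoder.
compressed_decoder = nncf.compress_weights(text_decoder)
ov.save_model(compressed_decoder, "blip_text_decoder_compressed.xml")
```

Weight-only compression is used for the decoder because it needs no calibration data and keeps the generation quality close to the original, while full 8-bit quantization of the encoders gives the larger latency win.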
NNCF performs optimization within the OpenVINO IR, so you must run the first notebook before running the second one.
This is a self-contained example that relies solely on its own code.
We recommend running the notebook in a virtual environment. You only need a Jupyter server to start.
For details, please refer to the Installation Guide.