Task | Inference result demo |
---|---|
Zero-shot subject-driven generation | |
Controlled subject-driven generation (Canny-edge) | |
Controlled subject-driven generation (Scribble) | |
BLIP-Diffusion is a text-to-image diffusion model with built-in support for multimodal subject-and-text conditioning. BLIP-Diffusion enables zero-shot subject-driven generation and efficient fine-tuning for customized subjects with up to a 20x speedup. In addition, BLIP-Diffusion can be flexibly combined with ControlNet and prompt-to-prompt to enable novel subject-driven generation and editing applications.
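As an illustration of the task, below is a minimal sketch of zero-shot subject-driven generation using the `BlipDiffusionPipeline` from Hugging Face diffusers. The checkpoint name (`Salesforce/blipdiffusion`), the subject image path, and the prompt are assumptions taken from the diffusers documentation rather than from this notebook, which may load and run the model through a different API.

```python
# Minimal sketch: zero-shot subject-driven generation with diffusers.
# Checkpoint, image path and prompt are illustrative assumptions.
from diffusers.pipelines import BlipDiffusionPipeline
from diffusers.utils import load_image

pipe = BlipDiffusionPipeline.from_pretrained("Salesforce/blipdiffusion")

subject_image = load_image("dog.jpg")  # hypothetical reference image of the subject
source_subject = "dog"                 # category of the subject in the reference image
target_subject = "dog"                 # category of the subject to generate
prompt = "swimming underwater"

images = pipe(
    prompt,
    subject_image,
    source_subject,
    target_subject,
    guidance_scale=7.5,
    num_inference_steps=25,
    neg_prompt="lowres, cropped, worst quality, low quality",
    height=512,
    width=512,
).images
images[0].save("zero_shot_result.png")
```

The controlled variants (Canny-edge and Scribble) follow the same pattern with the `BlipDiffusionControlNetPipeline`, which additionally takes a condition image derived from an edge map or a scribble.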
The tutorial consists of the following steps:
- Prerequisites
- Load the model
- Infer the original model
    - Zero-Shot subject-driven generation
    - Controlled subject-driven generation (Canny-edge)
    - Controlled subject-driven generation (Scribble)
- Convert the model to OpenVINO Intermediate Representation (IR) (see the sketch after this list)
    - QFormer
    - Text encoder
    - ControlNet
    - UNet
    - Variational Autoencoder (VAE)
- Select inference device
- Inference
    - Zero-Shot subject-driven generation
    - Controlled subject-driven generation (Canny-edge)
    - Controlled subject-driven generation (Scribble)
- Interactive inference
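As a preview of the conversion, device-selection, and inference steps listed above, here is a minimal sketch that converts one submodel (the UNet) to OpenVINO IR and compiles it on a chosen device. The file name, example input shapes, and device string are illustrative assumptions; the notebook applies the same pattern to each listed submodel with its own example inputs.

```python
# Minimal sketch: convert one submodel (UNet) to OpenVINO IR and compile it.
# Example input shapes, file name and device string are assumptions.
from pathlib import Path

import torch
import openvino as ov
from diffusers.pipelines import BlipDiffusionPipeline

pipe = BlipDiffusionPipeline.from_pretrained("Salesforce/blipdiffusion")
UNET_IR_PATH = Path("blip_diffusion_unet.xml")

if not UNET_IR_PATH.exists():
    # Trace the PyTorch UNet with example inputs and save it as OpenVINO IR.
    example_input = {
        "sample": torch.randn(2, 4, 64, 64),              # noisy latents
        "timestep": torch.tensor(1),                      # diffusion timestep
        "encoder_hidden_states": torch.randn(2, 77, 768), # text/subject embeddings
    }
    with torch.no_grad():
        ov_unet = ov.convert_model(pipe.unet, example_input=example_input)
    ov.save_model(ov_unet, UNET_IR_PATH)

# Select an inference device and compile the IR model on it.
core = ov.Core()
print("Available devices:", core.available_devices)  # e.g. ['CPU', 'GPU']
compiled_unet = core.compile_model(UNET_IR_PATH, device_name="CPU")
```

The compiled submodels are typically wrapped so that they can replace their PyTorch counterparts inside the pipeline, letting the original inference code run unchanged on OpenVINO.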
This is a self-contained example that relies solely on its own code.
We recommend running the notebook in a virtual environment. You only need a Jupyter server to start.
For details, please refer to the Installation Guide.