
# Schedulers for Stable Diffusion Inference

## 1. Introduction

MindOne provides multiple scheduler functions for the diffusion process.

A scheduler takes the output of a pretrained model, the sample that the diffusion process is iterating on, and a timestep, and returns a denoised sample. Schedulers define the method for iteratively adding noise to an image, or for updating a sample based on model outputs (removing noise). A scheduler is typically defined by a noise schedule and an update rule for solving the underlying differential equation.
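To make the "update rule" concrete, here is a minimal numpy sketch of one deterministic DDIM-style step (eta = 0). This is purely illustrative, not MindOne's actual implementation; the function and argument names are made up:

```python
import numpy as np

def ddim_step(x_t, eps_pred, alpha_bar_t, alpha_bar_prev):
    """One deterministic DDIM update: predict x0, then step toward t-1.

    x_t: current noisy sample; eps_pred: the model's noise prediction;
    alpha_bar_*: cumulative products of the noise schedule at t and t-1.
    """
    # Predict the clean sample implied by the noise prediction.
    x0_pred = (x_t - np.sqrt(1.0 - alpha_bar_t) * eps_pred) / np.sqrt(alpha_bar_t)
    # Re-noise the prediction to the previous timestep's noise level (eta = 0).
    return np.sqrt(alpha_bar_prev) * x0_pred + np.sqrt(1.0 - alpha_bar_prev) * eps_pred
```

With a perfect noise prediction, stepping to `alpha_bar_prev = 1.0` recovers the clean sample exactly, which is a handy sanity check on the update rule.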

## 2. Summary of Schedulers

MindOne implements 5 different schedulers in addition to the DDPM scheduler. The following table summarizes them:

| Scheduler | Reference |
| --- | --- |
| DDPM | Denoising Diffusion Probabilistic Models |
| DDIM | Denoising Diffusion Implicit Models |
| PLMS | Pseudo Numerical Methods for Diffusion Models on Manifolds |
| DPM-Solver | DPM-Solver: A Fast ODE Solver for Diffusion Probabilistic Model Sampling in Around 10 Steps |
| DPM-Solver++ | DPM-Solver++: Fast Solver for Guided Sampling of Diffusion Probabilistic Models |
| UniPC | UniPC: A Unified Predictor-Corrector Framework for Fast Sampling of Diffusion Models |

## 3. Inference with SD2.0

### 3.1 Quick Start

Normally, you can test the stable diffusion model with the default DPM-Solver++ scheduler using the following command (refer to Stable Diffusion 2.0 Inference).

```shell
# Text-to-image generation with SD2.0
python text_to_image.py --prompt "A wolf in winter" --version 2.0
```

You can obtain diverse results for the given prompt. Here are 5 examples:

*(Images: DPM-Solver++ samples #1–#5)*

### 3.2 Inference with Different Schedulers

Since quantitative evaluation is usually insufficient to determine which scheduler is best, it is often recommended to simply try them out and compare the results visually. In this chapter, we demonstrate how to generate images with different schedulers and show a visual comparison of the outputs, along with the time required by each scheduler for reference.

#### 3.2.1 Commands of Inference with Different Schedulers

You can test the stable diffusion model with different schedulers using the following commands. `${scheduler}` can be one of ddim, plms, dpm_solver, and uni_pc (note: the default scheduler is DPM-Solver++). The optimal hyperparameters vary between schedulers. For example, the optimal `${sampling_steps}` is 50 for PLMS and DDIM, and 20 for UniPC, DPM-Solver, and DPM-Solver++.

```shell
# Scheduler PLMS: set ${scheduler} to plms and ${sampling_steps} to 50
# Scheduler DDIM: set ${scheduler} to ddim and ${sampling_steps} to 50
# Scheduler DPM-Solver: set ${scheduler} to dpm_solver and ${sampling_steps} to 20
# Scheduler DPM-Solver++: set ${sampling_steps} to 20 (note: no ${scheduler} argument is needed, as it is the default)
# Scheduler UniPC: set ${scheduler} to uni_pc and ${sampling_steps} to 20
# ${prompt} defaults to "A Van Gogh style oil painting of sunflower".
python text_to_image.py \
    --prompt ${prompt} \
    --config configs/v2-inference.yaml \
    --version 2.0 \
    --output_path ./output/ \
    --seed 42 \
    --${scheduler} \
    --n_iter 8 \
    --n_samples 1 \
    --W 512 \
    --H 512 \
    --sampling_steps ${sampling_steps}
```

Note that if you set too many sampling steps for the UniPC scheduler, the program reports a warning such as `The selected sampling timesteps are not appropriate for UniPC sampler`. The default `${prompt}` is "A Van Gogh style oil painting of sunflower".
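The scheduler-to-steps pairing above can be sketched as a small helper that assembles the command line for each scheduler. This is a hypothetical convenience snippet, not part of MindOne; note that DPM-Solver++ is the default, so no scheduler flag is passed for it:

```python
# Optimal sampling steps per scheduler, as listed above.
OPTIMAL_STEPS = {
    "plms": 50,
    "ddim": 50,
    "dpm_solver": 20,
    "dpm_solver_pp": 20,  # default scheduler: no flag is passed
    "uni_pc": 20,
}

def build_command(scheduler, prompt="A Van Gogh style oil painting of sunflower"):
    """Assemble the text_to_image.py invocation for a given scheduler."""
    args = [
        "python", "text_to_image.py",
        "--prompt", prompt,
        "--config", "configs/v2-inference.yaml",
        "--version", "2.0",
        "--seed", "42",
    ]
    if scheduler != "dpm_solver_pp":  # DPM-Solver++ is the default
        args.append(f"--{scheduler}")
    args += ["--n_iter", "8", "--n_samples", "1",
             "--sampling_steps", str(OPTIMAL_STEPS[scheduler])]
    return args
```

For example, `build_command("plms")` yields a command with `--plms` and `--sampling_steps 50`, while `build_command("dpm_solver_pp")` omits the scheduler flag and uses 20 steps.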

#### 3.2.2 Visual Comparison

Now, let's execute the command above to generate images with different values of `${prompt}` and `${scheduler}`. All images have the default resolution of 512x512.

```shell
# Set ${scheduler} to plms, ddim, dpm_solver, and uni_pc in turn (with their respective optimal ${sampling_steps})
# Note: no ${scheduler} argument is needed when using dpm_solver_pp (the default)
python text_to_image.py --prompt "A Van Gogh style oil painting of sunflower" --config configs/v2-inference.yaml --version 2.0 --seed 42 --${scheduler} --n_iter 8 --n_samples 1 --sampling_steps ${sampling_steps}
```

*(Images: PLMS, DDIM, DPM-Solver, DPM-Solver++, UniPC)*

```shell
# Set ${scheduler} to plms, ddim, dpm_solver, and uni_pc in turn (with their respective optimal ${sampling_steps})
# Note: no ${scheduler} argument is needed when using dpm_solver_pp (the default)
python text_to_image.py --prompt "A photo of an astronaut riding a horse on mars" --config configs/v2-inference.yaml --version 2.0 --seed 42 --${scheduler} --n_iter 8 --n_samples 1 --sampling_steps ${sampling_steps}
```

*(Images: PLMS, DDIM, DPM-Solver, DPM-Solver++, UniPC)*

```shell
# Set ${scheduler} to plms, ddim, dpm_solver, and uni_pc in turn (with their respective optimal ${sampling_steps})
# Note: no ${scheduler} argument is needed when using dpm_solver_pp (the default)
python text_to_image.py --prompt "A high tech solarpunk utopia in the Amazon rainforest" --config configs/v2-inference.yaml --version 2.0 --seed 42 --${scheduler} --n_iter 8 --n_samples 1 --sampling_steps ${sampling_steps}
```

*(Images: PLMS, DDIM, DPM-Solver, DPM-Solver++, UniPC)*

```shell
# Set ${scheduler} to plms, ddim, dpm_solver, and uni_pc in turn (with their respective optimal ${sampling_steps})
# Note: no ${scheduler} argument is needed when using dpm_solver_pp (the default)
python text_to_image.py --prompt "The beautiful night view of the city has various buildings, traffic flow, and lights" --config configs/v2-inference.yaml --version 2.0 --seed 42 --${scheduler} --n_iter 8 --n_samples 1 --sampling_steps ${sampling_steps}
```

*(Images: PLMS, DDIM, DPM-Solver, DPM-Solver++, UniPC)*

```shell
# Set ${scheduler} to plms, ddim, dpm_solver, and uni_pc in turn (with their respective optimal ${sampling_steps})
# Note: no ${scheduler} argument is needed when using dpm_solver_pp (the default)
python text_to_image.py --prompt "A pikachu fine dining with a view to the Eiffel Tower" --config configs/v2-inference.yaml --version 2.0 --seed 42 --${scheduler} --n_iter 8 --n_samples 1 --sampling_steps ${sampling_steps}
```

*(Images: PLMS, DDIM, DPM-Solver, DPM-Solver++, UniPC)*

#### 3.2.3 Time Comparison

The following table compares the time required by different schedulers (with their optimal sampling steps) to generate images, measured by executing the command above on Ascend 910. The image resolution is 512x512, `n_iter` is 8, and `n_samples` is 1 (we report the average over the last 7 iterations). Note that if `n_samples` is increased (e.g., to 8), the time required per sample decreases.

| scheduler | sampling_steps | time (seconds/image) |
| --- | --- | --- |
| ddim | 50 | 16.48 |
| plms | 50 | 16.77 |
| dpm_solver | 20 | 12.72 |
| dpm_solver_pp | 20 | 13.43 |
| uni_pc | 20 | 14.97 |
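Averaging over the last 7 of the 8 iterations presumably discards the first iteration's one-off warm-up overhead (e.g., graph compilation). The bookkeeping is simple; the timings below are illustrative, not measured values:

```python
def per_image_time(iteration_times, warmup=1):
    """Average per-iteration time, excluding the first `warmup` iterations."""
    steady = iteration_times[warmup:]
    return sum(steady) / len(steady)

# Illustrative: a slow first iteration followed by 7 steady iterations.
times = [60.0, 16.5, 16.4, 16.6, 16.5, 16.4, 16.5, 16.5]
avg = per_image_time(times)  # average of the last 7 only
```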

## 4. Inference (based on LoRA)

These schedulers can also be used for stable diffusion model + LoRA inference (see LoRA for more information). Users can specify schedulers in the same manner as described in chapter Inference with Different Schedulers. For further details, please refer to Use LoRA for Stable Diffusion Finetune. In this chapter, we provide visual, qualitative, and quantitative comparisons with Diffusers using different schedulers based on LoRA finetuning.

### 4.1 Visual Comparison

Based on the LoRA models trained on the pokemon and chinese_art datasets (see LoRA for more information), we run inference with different schedulers. The base model is Stable Diffusion 2.0.

- pokemon dataset:

  ```shell
  # Set ${scheduler} to plms, ddim, dpm_solver, and uni_pc in turn (with their respective optimal ${sampling_steps})
  # Note: no ${scheduler} argument is needed when using dpm_solver_pp (the default)
  # Set ${prompt} to "a drawing of a blue and white cat with big eyes"
  bash scripts/run_test_to_image_v2_lora.sh
  ```

  *(Images: PLMS, DDIM, DPM-Solver, DPM-Solver++, UniPC)*

  ```shell
  # Set ${scheduler} to plms, ddim, dpm_solver, and uni_pc in turn (with their respective optimal ${sampling_steps})
  # Note: no ${scheduler} argument is needed when using dpm_solver_pp (the default)
  # Set ${prompt} to "a cartoon of a black and white pokemon"
  bash scripts/run_test_to_image_v2_lora.sh
  ```

  *(Images: PLMS, DDIM, DPM-Solver, DPM-Solver++, UniPC)*

- chinese_art dataset:

  ```shell
  # Set ${scheduler} to plms, ddim, dpm_solver, and uni_pc in turn (with their respective optimal ${sampling_steps})
  # Note: no ${scheduler} argument is needed when using dpm_solver_pp (the default)
  # Set ${prompt} to "a painting of a group of people sitting on a hill with trees in the background and a stream of water"
  bash scripts/run_test_to_image_v2_lora.sh
  ```

  *(Images: PLMS, DDIM, DPM-Solver, DPM-Solver++, UniPC)*

  ```shell
  # Set ${scheduler} to plms, ddim, dpm_solver, and uni_pc in turn (with their respective optimal ${sampling_steps})
  # Note: no ${scheduler} argument is needed when using dpm_solver_pp (the default)
  # Set ${prompt} to "a drawing of a village with a boat and a house in the background with a red ribbon on the bottom of the picture"
  bash scripts/run_test_to_image_v2_lora.sh
  ```

  *(Images: PLMS, DDIM, DPM-Solver, DPM-Solver++, UniPC)*

### 4.2 Qualitative Comparison with Diffusers

We also show some text-to-image generation samples for the LoRA models trained by MindOne and Diffusers. The base model is Stable Diffusion 2.0.

- pokemon dataset:

  `${prompt}` = "a drawing of a black and gray dragon"

  *(Image grid: MindOne vs. Diffusers; columns PLMS, DDIM, DPM-Solver++, UniPC)*

  `${prompt}` = "a cartoon panda with a leaf in its mouth"

  *(Image grid: MindOne vs. Diffusers; columns PLMS, DDIM, DPM-Solver++, UniPC)*

- chinese_art dataset:

  `${prompt}` = "a painting of a landscape with a mountain in the background and a river running through it with a few people on it"

  *(Image grid: MindOne vs. Diffusers; columns PLMS, DDIM, DPM-Solver++, UniPC)*

### 4.3 Quantitative Comparison with Diffusers

Here are the evaluation results for our implementation.

| Pretrained Model | Dataset | Finetune Method | Sampling Algorithm | FID (MindOne) ↓ | FID (Diffusers) ↓ |
| --- | --- | --- | --- | --- | --- |
| stable_diffusion_2.0_base | pokemon_blip | LoRA | PLMS (scale: 9, steps: 50) | 103 | 105 |
| stable_diffusion_2.0_base | pokemon_blip | LoRA | DDIM (scale: 9, steps: 50) | 101 | 109 |
| stable_diffusion_2.0_base | pokemon_blip | LoRA | DPM-Solver++ (scale: 9, steps: 20) | 98 | 107 |
| stable_diffusion_2.0_base | pokemon_blip | LoRA | UniPC (scale: 9, steps: 20) | 104 | 107 |
| stable_diffusion_2.0_base | chinese_art_blip | LoRA | PLMS (scale: 9, steps: 50) | 279 | 260 |
| stable_diffusion_2.0_base | chinese_art_blip | LoRA | DDIM (scale: 9, steps: 50) | 277 | 250 |
| stable_diffusion_2.0_base | chinese_art_blip | LoRA | DPM-Solver++ (scale: 9, steps: 20) | 265 | 254 |
| stable_diffusion_2.0_base | chinese_art_blip | LoRA | UniPC (scale: 9, steps: 20) | 288 | 254 |

## 5. Inference with SD1.5

These schedulers are also suitable for SD1.5 (see Stable Diffusion 1.5 for more detail). Users can specify schedulers in the same manner as described in chapter Inference with Different Schedulers, switching from SD2.0 to SD1.5 via the --version (-v) argument. In this chapter, we provide a visual comparison of text-to-image generation with different schedulers based on SD1.5.

### 5.1 Visual Comparison

```shell
# Set ${scheduler} to plms, ddim, dpm_solver, and uni_pc in turn (with their respective optimal ${sampling_steps})
# Note: no ${scheduler} argument is needed when using dpm_solver_pp (the default)
# Note: for SD1.5, ${version} is 1.5 and the config file is "configs/v1-inference.yaml"
python text_to_image.py --prompt "A wolf in winter" --config configs/v1-inference.yaml --version 1.5 --seed 42 --${scheduler} --n_iter 8 --n_samples 1 --sampling_steps ${sampling_steps}
```

*(Images: PLMS, DDIM, DPM-Solver, DPM-Solver++, UniPC)*