MindOne contains multiple scheduler functions for the diffusion process.
The scheduler functions take the output of a pre-trained model, the sample that the diffusion process is iterating on, and a timestep, and return a denoised sample. Schedulers define the method for iteratively adding noise to an image or for updating a sample based on model outputs (removing noise). A scheduler is usually defined by a noise schedule and an update rule for solving the underlying differential equation.
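To make this interface concrete, here is a minimal sketch of one deterministic DDIM update (eta = 0) in plain Python. It is not MindOne's implementation: the function name `ddim_step` and the arguments `alpha_bar_t` and `alpha_bar_prev` (cumulative products of the noise schedule at the current and the next, less noisy, timestep) are assumptions made for illustration, and the model is assumed to predict the noise.

```python
import math


def ddim_step(model_output, sample, alpha_bar_t, alpha_bar_prev):
    """One deterministic DDIM update (eta = 0) for a noise-predicting model.

    model_output: predicted noise, same length as `sample`
    sample: current noisy sample x_t as a flat list of floats
    alpha_bar_t, alpha_bar_prev: cumulative noise-schedule products at the
        current timestep and at the next (less noisy) timestep
    """
    sqrt_a_t = math.sqrt(alpha_bar_t)
    sqrt_one_minus_a_t = math.sqrt(1.0 - alpha_bar_t)
    sqrt_a_prev = math.sqrt(alpha_bar_prev)
    sqrt_one_minus_a_prev = math.sqrt(1.0 - alpha_bar_prev)

    prev_sample = []
    for x_t, eps in zip(sample, model_output):
        # Estimate the clean sample x_0 from x_t and the predicted noise.
        x0_pred = (x_t - sqrt_one_minus_a_t * eps) / sqrt_a_t
        # Re-noise that estimate to the noise level of the next timestep.
        prev_sample.append(sqrt_a_prev * x0_pred + sqrt_one_minus_a_prev * eps)
    return prev_sample
```

A sampler calls such a step once per timestep, feeding each result back in as the next `sample`. Higher-order schedulers such as DPM-Solver++ and UniPC keep the same inputs and outputs but use a more elaborate update rule.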
2. Summary of Schedulers
MindOne implements 5 different schedulers in addition to the DDPM scheduler. The following table summarizes these schedulers:
Normally, you can test the stable diffusion model using the following command with the default DPM-Solver++ scheduler (Refer to Stable Diffusion 2.0-Inference).
# Text to image generation with SD2.0
python text_to_image.py --prompt "A wolf in winter" --version 2.0
You can obtain diverse results according to the given prompt. Here are 5 examples:
DPM-Solver++ #1
DPM-Solver++ #2
DPM-Solver++ #3
DPM-Solver++ #4
DPM-Solver++ #5
3.2 Inference with Different Schedulers
As quantitative evaluation for different schedulers is usually insufficient to determine which is the best, it is often recommended to simply try them out and visually compare the results.
In this chapter, we demonstrate how to generate images using different schedulers, and we provide a visual comparison of the images and the time required by each scheduler for reference.
3.2.1 Commands of Inference with Different Schedulers
You can test the stable diffusion model with different schedulers using the following commands.
${scheduler} can be one of ddim, plms, dpm_solver, or uni_pc (note: the default scheduler is DPM-Solver++).
The optimal hyperparameters can vary between schedulers. For example, the optimal ${sampling_steps} is 50 for PLMS and DDIM, and 20 for UniPC, DPM-Solver, and DPM-Solver++.
# Scheduler PLMS: set ${scheduler} to plms and ${sampling_steps} to 50
# Scheduler DDIM: set ${scheduler} to ddim and ${sampling_steps} to 50
# Scheduler DPM-Solver: set ${scheduler} to dpm_solver and ${sampling_steps} to 20
# Scheduler DPM-Solver++: set ${sampling_steps} to 20 (Note: in this case, no need to pass the ${scheduler} argument)
# Scheduler UniPC: set ${scheduler} to uni_pc and ${sampling_steps} to 20
# ${prompt} defaults to "A Van Gogh style oil painting of sunflower".
python text_to_image.py \
--prompt "${prompt}" \
--config configs/v2-inference.yaml \
--version 2.0 \
--output_path ./output/ \
--seed 42 \
--${scheduler} \
--n_iter 8 \
--n_samples 1 \
--W 512 \
--H 512 \
--sampling_steps ${sampling_steps}
Note that if you set a large number of sampling steps for the UniPC scheduler, the program will report a warning such as "The selected sampling timesteps are not appropriate for UniPC sampler". The default ${prompt} is "A Van Gogh style oil painting of sunflower".
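The scheduler-to-steps pairing described above can be captured in a small helper. The sketch below only assembles the argument list for `text_to_image.py` from the flags shown in the command above. The helper name `build_command` and the `OPTIMAL_STEPS` table are illustrative and not part of MindOne, and the key `dpm_solver_pp` stands for the default scheduler, for which no flag is passed.

```python
import shlex

# Recommended sampling steps per scheduler, as quoted in the text above.
OPTIMAL_STEPS = {
    "plms": 50,
    "ddim": 50,
    "dpm_solver": 20,
    "dpm_solver_pp": 20,  # default scheduler: no scheduler flag is passed
    "uni_pc": 20,
}


def build_command(scheduler,
                  prompt="A Van Gogh style oil painting of sunflower",
                  version="2.0",
                  config="configs/v2-inference.yaml"):
    """Return the argv list for one text_to_image.py run."""
    if scheduler not in OPTIMAL_STEPS:
        raise ValueError(f"unknown scheduler: {scheduler}")
    args = [
        "python", "text_to_image.py",
        "--prompt", prompt,
        "--config", config,
        "--version", version,
        "--output_path", "./output/",
        "--seed", "42",
    ]
    if scheduler != "dpm_solver_pp":
        args.append(f"--{scheduler}")
    args += [
        "--n_iter", "8",
        "--n_samples", "1",
        "--W", "512",
        "--H", "512",
        "--sampling_steps", str(OPTIMAL_STEPS[scheduler]),
    ]
    return args


# Print one shell-ready command per scheduler.
for name in OPTIMAL_STEPS:
    print(shlex.join(build_command(name)))
```

Passing the returned list to `subprocess.run` avoids shell quoting problems with prompts that contain spaces.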
3.2.2 Visual Comparison
Now, let's execute the command above to generate some images with different ${prompt} and different ${scheduler}. All these images have a default resolution of 512x512.
# Set ${scheduler} to plms, ddim, dpm_solver and uni_pc in turn (Note: also use their respective optimal ${sampling_steps})
# Note: no need to pass ${scheduler} while using dpm_solver_pp
python text_to_image.py --prompt "A Van Gogh style oil painting of sunflower" --config configs/v2-inference.yaml --version 2.0 --seed 42 --${scheduler} --n_iter 8 --n_samples 1 --sampling_steps ${sampling_steps}
PLMS
DDIM
DPM-Solver
DPM-Solver++
UniPC
# Set ${scheduler} to plms, ddim, dpm_solver and uni_pc in turn (Note: also use their respective optimal ${sampling_steps})
# Note: no need to pass ${scheduler} while using dpm_solver_pp
python text_to_image.py --prompt "A photo of an astronaut riding a horse on mars" --config configs/v2-inference.yaml --version 2.0 --seed 42 --${scheduler} --n_iter 8 --n_samples 1 --sampling_steps ${sampling_steps}
PLMS
DDIM
DPM-Solver
DPM-Solver++
UniPC
# Set ${scheduler} to plms, ddim, dpm_solver and uni_pc in turn (Note: also use their respective optimal ${sampling_steps})
# Note: no need to pass ${scheduler} while using dpm_solver_pp
python text_to_image.py --prompt "A high tech solarpunk utopia in the Amazon rainforest" --config configs/v2-inference.yaml --version 2.0 --seed 42 --${scheduler} --n_iter 8 --n_samples 1 --sampling_steps ${sampling_steps}
PLMS
DDIM
DPM-Solver
DPM-Solver++
UniPC
# Set ${scheduler} to plms, ddim, dpm_solver and uni_pc in turn (Note: also use their respective optimal ${sampling_steps})
# Note: no need to pass ${scheduler} while using dpm_solver_pp
python text_to_image.py --prompt "The beautiful night view of the city has various buildings, traffic flow, and lights" --config configs/v2-inference.yaml --version 2.0 --seed 42 --${scheduler} --n_iter 8 --n_samples 1 --sampling_steps ${sampling_steps}
PLMS
DDIM
DPM-Solver
DPM-Solver++
UniPC
# Set ${scheduler} to plms, ddim, dpm_solver and uni_pc in turn (Note: also use their respective optimal ${sampling_steps})
# Note: no need to pass ${scheduler} while using dpm_solver_pp
python text_to_image.py --prompt "A pikachu fine dining with a view to the Eiffel Tower" --config configs/v2-inference.yaml --version 2.0 --seed 42 --${scheduler} --n_iter 8 --n_samples 1 --sampling_steps ${sampling_steps}
PLMS
DDIM
DPM-Solver
DPM-Solver++
UniPC
3.2.3 Time Comparison
The following table compares the time required by the different schedulers (with their optimal sampling steps) to generate images by executing the command above on Ascend 910. This means that the image resolution is 512x512, n_iter is 8, and n_samples is 1 (the reported time is the average over the last 7 iterations).
Please note that if n_samples is increased (e.g., to 8), the time required for each sample will decrease.
| scheduler | sampling_steps | time (seconds/image) |
| --- | --- | --- |
| ddim | 50 | 16.48 |
| plms | 50 | 16.77 |
| dpm_solver | 20 | 12.72 |
| dpm_solver_pp | 20 | 13.43 |
| uni_pc | 20 | 14.97 |
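The averaging rule used for these measurements (skip the first iteration, average the remaining ones) can be sketched as follows. The helper name `mean_time_excluding_warmup` is illustrative and not part of MindOne. The second part divides the reported seconds per image by the sampling steps. The resulting figure still includes any fixed per-image cost, so it should be read as a rough indicator rather than a pure per-step cost.

```python
def mean_time_excluding_warmup(iteration_times, warmup=1):
    """Average per-iteration times after dropping the first `warmup` entries."""
    timed = iteration_times[warmup:]
    if not timed:
        raise ValueError("need more iterations than warm-up runs")
    return sum(timed) / len(timed)


# (sampling_steps, seconds per image) taken from the table above.
TABLE = {
    "ddim": (50, 16.48),
    "plms": (50, 16.77),
    "dpm_solver": (20, 12.72),
    "dpm_solver_pp": (20, 13.43),
    "uni_pc": (20, 14.97),
}

for name, (steps, seconds) in TABLE.items():
    print(f"{name}: {seconds / steps:.3f} s per sampling step")
```

With n_iter set to 8, `mean_time_excluding_warmup` would receive the 8 measured iteration times and average the last 7.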
4. Inference (based on LoRA)
These different schedulers can also be used for Stable Diffusion + LoRA inference (see LoRA for more information).
Users can specify schedulers in the same manner as described in chapter Inference with Different Schedulers.
For additional details, please refer to Use LoRA for Stable Diffusion Finetune.
In this chapter, we provide visual, qualitative, and quantitative comparisons with Diffusers using different schedulers, based on LoRA fine-tuning.
4.1 Visual Comparison
Based on the LoRA models trained on pokemon and chinese_art datasets (see LoRA for more information), we test them using different schedulers. The base model is Stable Diffusion 2.0.
pokemon dataset:
# Set ${scheduler} to plms, ddim, dpm_solver and uni_pc in turn (Note: also use their respective optimal ${sampling_steps})
# Note: no need to pass ${scheduler} while using dpm_solver_pp
# Set ${prompt} to "a drawing of a blue and white cat with big eyes"
bash scripts/run_test_to_image_v2_lora.sh
PLMS
DDIM
DPM-Solver
DPM-Solver++
UniPC
# Set ${scheduler} to plms, ddim, dpm_solver and uni_pc in turn (Note: also use their respective optimal ${sampling_steps})
# Note: no need to pass ${scheduler} while using dpm_solver_pp
# Set ${prompt} to "a cartoon of a black and white pokemon"
bash scripts/run_test_to_image_v2_lora.sh
PLMS
DDIM
DPM-Solver
DPM-Solver++
UniPC
chinese_art dataset:
# Set ${scheduler} to plms, ddim, dpm_solver and uni_pc in turn (Note: also use their respective optimal ${sampling_steps})
# Note: no need to pass ${scheduler} while using dpm_solver_pp
# Set ${prompt} to "a painting of a group of people sitting on a hill with trees in the background and a stream of water"
bash scripts/run_test_to_image_v2_lora.sh
PLMS
DDIM
DPM-Solver
DPM-Solver++
UniPC
# Set ${scheduler} to plms, ddim, dpm_solver and uni_pc in turn (Note: also use their respective optimal ${sampling_steps})
# Note: no need to pass ${scheduler} while using dpm_solver_pp
# Set ${prompt} to "a drawing of a village with a boat and a house in the background with a red ribbon on the bottom of the picture"
bash scripts/run_test_to_image_v2_lora.sh
PLMS
DDIM
DPM-Solver
DPM-Solver++
UniPC
4.2 Qualitative Comparison with Diffusers
We also show some text-to-image generation samples for the LoRA models trained by MindOne and Diffusers. The base model is Stable Diffusion 2.0.
pokemon dataset:
${prompt}="a drawing of a black and gray dragon"
[Image grid: rows MindOne and Diffusers; columns PLMS, DDIM, DPM-Solver++, UniPC]
${prompt}="a cartoon panda with a leaf in its mouth"
[Image grid: rows MindOne and Diffusers; columns PLMS, DDIM, DPM-Solver++, UniPC]
chinese_art dataset:
${prompt}="a painting of a landscape with a mountain in the background and a river running through it with a few people on it"
[Image grid: rows MindOne and Diffusers; columns PLMS, DDIM, DPM-Solver++, UniPC]
4.3 Quantitative Comparison with Diffusers
Here are the evaluation results for our implementation.
| Pretrained Model | Dataset | Finetune Method | Sampling Algorithm | FID (MindOne) ↓ | FID (Diffusers) ↓ |
| --- | --- | --- | --- | --- | --- |
| stable_diffusion_2.0_base | pokemon_blip | LoRA | PLMS (scale: 9, steps: 50) | 103 | 105 |
| stable_diffusion_2.0_base | pokemon_blip | LoRA | DDIM (scale: 9, steps: 50) | 101 | 109 |
| stable_diffusion_2.0_base | pokemon_blip | LoRA | DPM-Solver++ (scale: 9, steps: 20) | 98 | 107 |
| stable_diffusion_2.0_base | pokemon_blip | LoRA | UniPC (scale: 9, steps: 20) | 104 | 107 |
| stable_diffusion_2.0_base | chinese_art_blip | LoRA | PLMS (scale: 9, steps: 50) | 279 | 260 |
| stable_diffusion_2.0_base | chinese_art_blip | LoRA | DDIM (scale: 9, steps: 50) | 277 | 250 |
| stable_diffusion_2.0_base | chinese_art_blip | LoRA | DPM-Solver++ (scale: 9, steps: 20) | 265 | 254 |
| stable_diffusion_2.0_base | chinese_art_blip | LoRA | UniPC (scale: 9, steps: 20) | 288 | 254 |
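For readers unfamiliar with the metric: FID is the Fréchet distance between two Gaussians fitted to image features of the reference set and of the generated set, and lower values are better. The sketch below is a deliberately simplified, standard-library-only version that treats the feature dimensions as independent (diagonal covariance). It is not the evaluation code behind the numbers above. Real FID uses Inception features, a full covariance matrix, and a matrix square root, and the function name `diagonal_frechet_distance` and the toy feature lists are assumptions for illustration.

```python
import math
from statistics import fmean, pvariance


def diagonal_frechet_distance(features_a, features_b):
    """Fréchet distance between two feature sets under a diagonal-covariance
    simplification: sum over dimensions of
    (mean_a - mean_b)^2 + var_a + var_b - 2 * sqrt(var_a * var_b).
    """
    dims = len(features_a[0])
    total = 0.0
    for d in range(dims):
        col_a = [row[d] for row in features_a]
        col_b = [row[d] for row in features_b]
        mean_a, mean_b = fmean(col_a), fmean(col_b)
        var_a, var_b = pvariance(col_a), pvariance(col_b)
        total += (mean_a - mean_b) ** 2
        total += var_a + var_b - 2.0 * math.sqrt(var_a * var_b)
    return total


# Toy 2-D "features": the second set is the first shifted by 1 in dimension 0.
reference = [[0.0, 1.0], [2.0, 3.0]]
generated = [[1.0, 1.0], [3.0, 3.0]]
print(diagonal_frechet_distance(reference, generated))
```

The distance is zero for identical feature statistics and grows with both the shift in means and the mismatch in spread, which is why a lower FID indicates generated images whose feature statistics are closer to the reference set.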
5. Inference with SD1.5
These different schedulers are also suitable for SD1.5 (see Stable Diffusion 1.5 for more detail).
Users can specify schedulers in the same manner as described in chapter Inference with Different Schedulers, except that SD is switched from 2.0 to 1.5 by setting the --version (-v) argument. In this chapter, we provide a visual comparison of text-to-image generation using different schedulers based on SD1.5.
5.1 Visual Comparison
# Set ${scheduler} to plms, ddim, dpm_solver and uni_pc in turn (Note: also use their respective optimal ${sampling_steps})
# Note: no need to pass ${scheduler} while using dpm_solver_pp
# Note: here, the ${version} is 1.5 and config file is "configs/v1-inference.yaml"
python text_to_image.py --prompt "A wolf in winter" --config configs/v1-inference.yaml --version 1.5 --seed 42 --${scheduler} --n_iter 8 --n_samples 1 --sampling_steps ${sampling_steps}