Describe the feature request
How should the cfg_scale input be handled when exporting to ONNX, given that it can be either a tensor or None?
Describe scenario use case
vid: a torch tensor of shape [91200, 33].
timestep: a tensor holding a single float value, e.g. [988.] or [1000.], on device 'npu:1' with dtype torch.bfloat16.
cfg_scale: either a tensor such as [2.] or None.
import torch

vid = torch.randn(91200, 33)
timestep = torch.tensor([988.])
cfg_scale = torch.tensor([2.])  # may also be None

dynamic_axes = {
    'vid': {0: 'batch_size'},       # the first dimension of vid is dynamic
    'timestep': {0: 'batch_size'},  # the first dimension of timestep is dynamic
    'cfg_scale': ???,               # what goes here when cfg_scale may be None?
    'output': {0: 'batch_size'}     # the first dimension of the output is dynamic
}

torch.onnx.export(
    model,
    (vid, timestep, cfg_scale),
    "your_model.onnx",
    opset_version=17,
    do_constant_folding=True,
    input_names=['vid', 'timestep', 'cfg_scale'],
    output_names=['output'],
    dynamic_axes=dynamic_axes,
)
Given the above, how should cfg_scale be declared in dynamic_axes and passed to torch.onnx.export when it may be None?
Stable Diffusion applies classifier-free guidance only when scale > 1.0; the only change when it is enabled is the batch size of the text embeddings, and the UNet ONNX model itself stays the same. If you want to convert the guidance code (like https://github.com/microsoft/onnxruntime/blob/6d9636f07cccdb6e4ac453087ad54c3bc9854d50/onnxruntime/python/tools/transformers/models/stable_diffusion/pipeline_stable_diffusion.py#L458-L460) into the ONNX model, you can generate two models: one for cfg_scale > 1.0 and another for cfg_scale <= 1.0. Another approach is to use the ONNX If operator: https://onnx.ai/onnx/operators/onnx__If.html.
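A minimal sketch of the two-model route, reusing model, vid, and timestep from the question above. The wrapper classes and the chunk-based cond/uncond split are illustrative assumptions about where the guidance blend happens, not the asker's actual model:

import torch

class GuidedWrapper(torch.nn.Module):
    # Hypothetical wrapper: applies classifier-free guidance around a backbone
    # that is assumed to return stacked unconditional/conditional predictions.
    def __init__(self, backbone):
        super().__init__()
        self.backbone = backbone

    def forward(self, vid, timestep, cfg_scale):
        uncond, cond = self.backbone(vid, timestep).chunk(2, dim=0)
        return uncond + cfg_scale * (cond - uncond)

class PlainWrapper(torch.nn.Module):
    # No-guidance path for cfg_scale <= 1.0: cfg_scale is not an input at all.
    def __init__(self, backbone):
        super().__init__()
        self.backbone = backbone

    def forward(self, vid, timestep):
        return self.backbone(vid, timestep)

# Export each wrapper to its own file; the runtime picks one based on cfg_scale.
torch.onnx.export(GuidedWrapper(model), (vid, timestep, torch.tensor([2.])),
                  "model_cfg.onnx",
                  input_names=['vid', 'timestep', 'cfg_scale'],
                  output_names=['output'],
                  dynamic_axes={'vid': {0: 'batch_size'},
                                'output': {0: 'batch_size'}},
                  opset_version=17)
torch.onnx.export(PlainWrapper(model), (vid, timestep),
                  "model_nocfg.onnx",
                  input_names=['vid', 'timestep'],
                  output_names=['output'],
                  dynamic_axes={'vid': {0: 'batch_size'},
                                'output': {0: 'batch_size'}},
                  opset_version=17)

Note that cfg_scale in the guided model is always a one-element tensor, so it needs no dynamic_axes entry at all; the None case is covered by the second model.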
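And a self-contained sketch of the If-operator route, built with onnx.helper on a toy computation: Mul and Identity stand in for the real guided and unguided subgraphs, which in a real graph would be the two exported branches reading vid from the outer scope.

import onnx
from onnx import helper, TensorProto

vid_info = helper.make_tensor_value_info('vid', TensorProto.FLOAT, [None, 33])
out_info = helper.make_tensor_value_info('output', TensorProto.FLOAT, [None, 33])
use_cfg = helper.make_tensor_value_info('use_cfg', TensorProto.BOOL, [])

# then-branch: scale vid (placeholder for the guided subgraph).
# 'vid' is not a subgraph input; If branches capture outer-scope tensors.
scale = helper.make_tensor('scale', TensorProto.FLOAT, [1], [2.0])
then_out = helper.make_tensor_value_info('then_out', TensorProto.FLOAT, [None, 33])
then_branch = helper.make_graph(
    [helper.make_node('Mul', ['vid', 'scale'], ['then_out'])],
    'then_branch', [], [then_out], initializer=[scale])

# else-branch: pass vid through (placeholder for the unguided subgraph).
else_out = helper.make_tensor_value_info('else_out', TensorProto.FLOAT, [None, 33])
else_branch = helper.make_graph(
    [helper.make_node('Identity', ['vid'], ['else_out'])],
    'else_branch', [], [else_out])

if_node = helper.make_node('If', ['use_cfg'], ['output'],
                           then_branch=then_branch, else_branch=else_branch)

graph = helper.make_graph([if_node], 'cfg_if', [use_cfg, vid_info], [out_info])
model = helper.make_model(graph, opset_imports=[helper.make_opsetid('', 17)])
onnx.checker.check_model(model)
onnx.save(model, 'cfg_if.onnx')

With this layout, the caller passes a scalar boolean use_cfg (e.g. computed as cfg_scale is not None and cfg_scale > 1.0) instead of making cfg_scale itself optional, so a single model file covers both cases.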