[Performance] Running image preprocessing model in onnx takes significant more time #19329
Comments
One of the reasons for the performance penalty might be the presence of an if-else op in the graph. By default, exporting the model with …
The model was exported only with torch.onnx.export from the native torch model class.
The If op comes from the resize transform, and removing it increases execution speed to 0.15s per 72 images. How can I optimize this while keeping the resize layer?
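A quick way to confirm which ops ended up in the exported graph is to count them directly with the onnx package. A minimal sketch; the model path is an assumption:

```python
import onnx
from collections import Counter

# count op types in the exported graph to see whether an If node is present
model = onnx.load("processor.onnx")  # path is an assumption
print(Counter(node.op_type for node in model.graph.node))

# list the If nodes explicitly, if any
print([node.name for node in model.graph.node if node.op_type == "If"])
```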
There are tools in the onnxruntime_extensions package to do most of what you want by directly editing the ONNX model. If you were to export the base model, you could use something like this:

```python
import onnx
from onnxruntime_extensions.tools.pre_post_processing import *

onnx_opset = 17  # use opset 18 if you want Resize to antialias

model_path = "pytorch.mobilenet_v2_float.onnx"
model = onnx.load(model_path)

inputs = [create_named_value("image_tensor", onnx.TensorProto.UINT8, [3, "h", "w"])]

pipeline = PrePostProcessor(inputs, onnx_opset)
pipeline.add_pre_processing(
    [
        Resize(224, layout="CHW"),  # uses BILINEAR currently
        ImageBytesToFloat(),  # convert to float in range 0..1 by dividing the uint8 values by 255
        Normalize([(0.5, 0.5), (0.5, 0.5), (0.5, 0.5)]),  # (mean, stddev) for each channel
        Unsqueeze([0]),  # add batch dim, CHW --> 1CHW
    ]
)

new_model = pipeline.run(model)
output_path = model_path.replace(".onnx", ".withpreprocessing.onnx")
onnx.save_model(new_model, output_path)
```

One issue is that the Resize implementation in that tool currently defaults to bilinear, as we haven't had a use case that differed. That could be made configurable. You should be able to edit this line in the python in the package, changing 'linear' to 'cubic', to achieve that. Or add this hack to the script that adds the pre-processing, before saving the updated model:

```python
# hack to change linear to cubic
for n in new_model.graph.node:
    if n.op_type == "Resize":
        for attr in n.attribute:
            if attr.name == "mode":
                attr.s = "cubic".encode("UTF-8")
```

You can also do the conversion from jpg/png to bytes inside the ONNX model if you take a dependency on the onnxruntime_extensions package. That uses OpenCV to do the conversion. See the example usage info for an overview and these docs for info on the individual pre/post processing steps.
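If the converted model includes any ort-extensions custom operators (e.g. the jpg/png decoding step), the extensions library needs to be registered with the session before the model is loaded. A minimal sketch, assuming the model path from the script above and a CHW uint8 input:

```python
import numpy as np
import onnxruntime as ort
from onnxruntime_extensions import get_library_path

# register the ort-extensions custom op library with the session
so = ort.SessionOptions()
so.register_custom_ops_library(get_library_path())

sess = ort.InferenceSession("pytorch.mobilenet_v2_float.withpreprocessing.onnx", so)

# dummy CHW uint8 image; any h/w works since the Resize step handles the scaling
image = np.random.randint(0, 256, size=(3, 480, 640), dtype=np.uint8)
outputs = sess.run(None, {"image_tensor": image})
```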
I checked that using bilinear interpolation in preprocessing does not change model accuracy at all, so that is not an issue. However, I failed to use the model after adding the ort-extensions pre-processing steps to it with your code. It looks OK to me in Netron, and I checked it with onnx.checker. The original model has a fixed input/output shape: [1, 3, 224, 224] -> [1, 28]. When I run the model with the following code I get an error:
Code:
In the code for modifying the model with ort-extensions I changed the "inputs" name from "image_tensor" to "image"; it seems the Resize layer has a fixed input name and the model input should match it. I tried different policies in Resize as well.
Created an issue in onnxruntime-extensions for this, as it is more specific.
Describe the issue
I have a torch vision transformer model and a torch preprocessing model. I converted both of them to ONNX format separately. The issue relates to the preprocessing model; its conversion code is given below.
After conversion, I compared the inference of the model pairs (torch processor + torch transformer, onnx processor + onnx transformer) on 72 input images. The processor model processes the images sequentially, and both inferences were run on CPU. For onnx, the total time of the InferenceSession runs was measured; for torch, the total time of the forward passes.
Torch preprocessing time: 0.21s
Torch classification on transformer time: 2.92s
Onnx preprocessing time: 6.29s
Onnx classification on transformer time: 1.08s
The preprocessing step in onnx takes 30x the time of the torch preprocessing step. I think that after conversion the model may contain operators that cause performance issues or that run slowly due to onnxruntime specifics. Can you help me with that?
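For reference, a minimal sketch of the measurement described above; the session and variable names are assumptions, not the original benchmark code:

```python
import time

# total InferenceSession time over the 72 images, processed one at a time
start = time.perf_counter()
for image in images:  # `images`: list of numpy arrays; name is an assumption
    preprocessed = preprocess_sess.run(None, {"input": image})[0]
print(f"Onnx preprocessing time: {time.perf_counter() - start:.2f}s")
```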
To reproduce
Torch preprocessing model:
Conversion to onnx for preprocessing model:
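A minimal sketch of what such an export could look like; `Preprocessor` is a hypothetical stand-in for the torch preprocessing module, and the input shape, names, and opset are assumptions:

```python
import torch

# `Preprocessor` is a hypothetical stand-in for the torch preprocessing module
model = Preprocessor().eval()

# example raw CHW uint8 image with batch dim; shape is an assumption
dummy_image = torch.randint(0, 256, (1, 3, 480, 640), dtype=torch.uint8)

torch.onnx.export(
    model,
    dummy_image,
    "processor.onnx",
    input_names=["image"],
    output_names=["preprocessed"],
    dynamic_axes={"image": {2: "h", 3: "w"}},  # allow variable input resolution
    opset_version=17,
)
```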
Attaching onnx preprocessing model
processor.zip
Urgency
Runtime optimization for models is important for a successful demonstration to clients.
Platform
Windows
OS Version
22H2
ONNX Runtime Installation
Released Package
ONNX Runtime Version or Commit ID
1.16.3
ONNX Runtime API
Python
Architecture
X64
Execution Provider
Default CPU
Execution Provider Library Version
No response
Model File
Added in the "To reproduce" section
Is this a quantized model?
No