Describe the issue
Hi there. I tried to quantize the mars model used in deepsort tracking. Using the example under quantization/image_classification/cpu in the onnxruntime-inference-examples repo, I was able to quantize my mars model. The model size has been reduced after quantization, but the inference speed of the quantized model has not increased; it is very close to that of the unquantized model. What could be the problem here? The steps I followed are below.

First, the mars model used in the deepsort repo is a TensorFlow .pb model. I converted that model to ONNX format using the tf2onnx utility. I then applied static quantization to this ONNX model as described in the example under quantization/image_classification/cpu in the onnxruntime-inference-examples repo. I successfully obtained a quantized ONNX model that is smaller in size, but the issue is that the inference speed has not increased. Any help is appreciated.

To reproduce
Take the mars-small128.pb model from the deepsort repo and convert it to ONNX using the tf2onnx utility. Then quantize this ONNX model with ONNX Runtime static quantization. While quantizing, make the appropriate changes to the preprocessing of the calibration dataset images, as this model's input shape is ['unk__281', 128, 64, 3] and its input datatype is uint8, not float32. The model's input datatype and shape differ from the standard shapes and types used for CNNs, so adjust the preprocessing function and input data to match them. After quantizing you will get a quantized model of reduced size, but again with no increase in inference speed. A minimal sketch of the conversion and quantization steps is included below.
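For reference, here is a minimal sketch of those steps, following the static-quantization API used in the example. The graph tensor names (images:0 / features:0), the calibration image folder, the batch size, and the resize handling are assumptions; verify the real input/output names of the converted model (e.g. with Netron) before running it:

```python
# Step 1: convert the frozen TensorFlow graph to ONNX with tf2onnx.
# The tensor names below are assumptions -- check them in the actual graph:
#   python -m tf2onnx.convert --graphdef mars-small128.pb \
#       --inputs images:0 --outputs features:0 --output mars-small128.onnx

# Step 2: static quantization with a calibration reader that feeds
# uint8 NHWC batches shaped [N, 128, 64, 3], as described above.
import glob

import cv2
import numpy as np
from onnxruntime.quantization import (CalibrationDataReader, QuantFormat,
                                      QuantType, quantize_static)


class MarsCalibrationDataReader(CalibrationDataReader):
    """Yields calibration batches matching the model's uint8 input."""

    def __init__(self, image_dir, input_name="images", batch_size=16):
        # image_dir and input_name are placeholders; use the input name
        # reported by the converted ONNX model.
        paths = sorted(glob.glob(f"{image_dir}/*.jpg"))
        batches = []
        for i in range(0, len(paths), batch_size):
            imgs = []
            for p in paths[i:i + batch_size]:
                img = cv2.imread(p)               # HWC, BGR, uint8
                img = cv2.resize(img, (64, 128))  # dsize=(width, height) -> 128x64x3
                imgs.append(img)
            if imgs:
                # Keep uint8: this model expects uint8 input, not float32.
                batches.append(np.stack(imgs).astype(np.uint8))
        self._iter = iter([{input_name: batch} for batch in batches])

    def get_next(self):
        return next(self._iter, None)


if __name__ == "__main__":
    reader = MarsCalibrationDataReader("calibration_images")
    quantize_static(
        "mars-small128.onnx",           # float model from tf2onnx
        "mars-small128.quant.onnx",     # quantized output
        reader,
        quant_format=QuantFormat.QDQ,   # same format as in the example
        activation_type=QuantType.QUInt8,
        weight_type=QuantType.QInt8,
        per_channel=False,
    )
```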
Urgency
Urgent, as I am using this model for tracking in my project and I am approaching the delivery date.
Platform
Linux
OS Version
colab
ONNX Runtime Installation
Released Package
ONNX Runtime Version or Commit ID
1.16
ONNX Runtime API
Python
Architecture
X64
Execution Provider
Default CPU
Execution Provider Library Version
No response
Model File
tracking_models.zip
Is this a quantized model?
Yes
Hello @yufenglee, thanks for your reply. Yes, I am running on CPU. I am using Google Colab's CPU, and I also tried the CPU of my Mac M1, but it did not work on either of them. The model size on disk has been reduced from around 11 MB to around 3 MB, but there is no increase in speed. Please have a look at this on an urgent basis. Your help is appreciated.
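This is a minimal sketch of how I am comparing the two models' speed on the CPU execution provider; the file names, input name, thread count, and the random uint8 batch are placeholders matching the shapes described above:

```python
import time

import numpy as np
import onnxruntime as ort


def bench(model_path, input_name="images", runs=50):
    # Pin the session to one thread so both models are compared under the
    # same conditions (the thread count here is just an assumption).
    so = ort.SessionOptions()
    so.intra_op_num_threads = 1
    sess = ort.InferenceSession(model_path, so, providers=["CPUExecutionProvider"])

    # Dummy batch matching the model's uint8 NHWC input [N, 128, 64, 3].
    x = np.random.randint(0, 256, size=(16, 128, 64, 3), dtype=np.uint8)

    sess.run(None, {input_name: x})  # warm-up run
    start = time.perf_counter()
    for _ in range(runs):
        sess.run(None, {input_name: x})
    return (time.perf_counter() - start) / runs


if __name__ == "__main__":
    print("float32  :", bench("mars-small128.onnx"))
    print("quantized:", bench("mars-small128.quant.onnx"))
```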