Why is the speedup from model quantization not obvious? #48
Comments
Does your model use FP16 or FP32? There are many ways to quantize a model, and they require experimentation. Some methods quantize the full model (fastest inference, largest accuracy loss), while others decide how far to quantize based on the mAP loss measured before and after quantization. Please refer to: https://docs.openvinotoolkit.org/latest/pot_docs_BestPractices.html
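For reference, a minimal sketch of the two approaches mentioned above, written against the Post-Training Optimization Tool (POT) configuration format described in the linked best-practices page. The module path (`openvino.tools.pot` in recent releases, `compression` in older ones), and the concrete values (300 calibration images, 1% mAP budget) are assumptions for illustration, not settings from this issue.

```python
# Hypothetical sketch: two POT algorithm configurations, as discussed above.

# 1) Full-model quantization: fastest to produce, largest potential accuracy loss.
default_int8 = [{
    "name": "DefaultQuantization",
    "params": {
        "target_device": "CPU",
        "preset": "performance",   # performance-oriented (symmetric) quantization
        "stat_subset_size": 300,   # number of calibration images for statistics
    },
}]

# 2) Accuracy-aware quantization: keeps the layers that hurt accuracy in FP32,
#    so the mAP drop stays within the given budget (requires annotated data and a metric).
accuracy_aware_int8 = [{
    "name": "AccuracyAwareQuantization",
    "params": {
        "target_device": "CPU",
        "preset": "performance",
        "stat_subset_size": 300,
        "maximal_drop": 0.01,      # allow at most a 0.01 absolute mAP drop
    },
}]

# Either list is passed to POT's create_pipeline(algorithms, engine) together with an
# engine wrapping your data loader (and, for the accuracy-aware case, an mAP metric).
```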
On a CPU, going from FP32 to INT8, my detection time per image drops from 47.9 ms to 39.9 ms, and the time per video frame drops from 108 ms to 77.6 ms, roughly a 1/4 reduction.
Before deploying with OpenVINO, 300 images took about 26 s.
After deployment, the time dropped significantly to 12 s, although the test code was not exactly the same.
But after quantizing to INT8, the time only came down to 9.2 s, so the effect is not very obvious. What may be the cause? Is there any way to improve it? Thanks.
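One thing worth checking: if the 9.2 s figure includes image loading, preprocessing, and postprocessing, those fixed costs can mask the INT8 gain on the inference call itself. A minimal timing sketch, assuming the `openvino.runtime` Python API (OpenVINO 2022+), a hypothetical `model.xml` path, and an assumed 1x3x640x640 input shape; it warms up first and times only the inference call.

```python
import time
import numpy as np
from openvino.runtime import Core  # older releases expose a similar API under openvino.inference_engine

core = Core()
model = core.read_model("model.xml")        # hypothetical path to the FP32 or INT8 IR
compiled = core.compile_model(model, "CPU")
request = compiled.create_infer_request()

dummy = np.random.rand(1, 3, 640, 640).astype(np.float32)  # assumed input shape

# Warm-up: the first few inferences include one-time setup and cache effects.
for _ in range(10):
    request.infer({0: dummy})

# Time only the inference call, excluding preprocessing and postprocessing.
n = 300
start = time.perf_counter()
for _ in range(n):
    request.infer({0: dummy})
elapsed = time.perf_counter() - start
print(f"avg latency: {elapsed / n * 1000:.2f} ms")
```

OpenVINO's bundled benchmark_app (e.g. `benchmark_app -m model.xml -d CPU`) runs a comparable measurement with warm-up handled for you and reports latency and throughput, which usually makes the FP32 vs INT8 gap easier to see.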