Int8 quantization for microcontroller #736
Hi,
Is there a way to get an int8 quantized model that can eventually run on a microcontroller? I am converting a binary DenseNet Keras model (https://docs.larq.dev/zoo/api/literature/#binarydensenet28) as follows, but it does not result in a quantized model. Am I missing something here?
Thanks in advance.

Comments
Any pointers on how to solve this?
How did you conclude that it does not result in a quantized model? I got an int8/BNN model by running the code snippet below:

```python
from pathlib import Path

import larq_zoo
import larq_compute_engine as lce
import tensorflow as tf

keras_model = larq_zoo.literature.BinaryDenseNet28(
    input_shape=None,
    input_tensor=None,
    weights="imagenet",
    include_top=True,
    num_classes=1000,
)

tflite_model = lce.convert_keras_model(
    keras_model,
    inference_input_type=tf.int8,
    inference_output_type=tf.int8,
    experimental_default_int8_range=(-3, 3),
)

Path("model.tflite").write_bytes(tflite_model)
```

I inspected the resulting model.tflite and it is fully quantized.
I tried the following:

```python
from pathlib import Path

import larq as lq
import larq_zoo as lqz
import larq_compute_engine as lqce
import tensorflow as tf

model_a = lqz.literature.BinaryDenseNet28(
    input_shape=(32, 32, 3),
    weights=None,
    include_top=True,
    num_classes=10,
)

with lq.context.quantized_scope(True):
    weights = model_a.get_weights()
    model_a.set_weights(weights)

tflite_model = lqce.convert_keras_model(
    model_a,
    inference_input_type=tf.int8,
    inference_output_type=tf.int8,
    experimental_default_int8_range=(-3, 3),
)

Path("model.tflite").write_bytes(tflite_model)
```

Now if I submit it to the microcontroller, I get an error.
I do not get the above-mentioned error when I use your snippet. Maybe I am missing something here.
Indeed, if I run your snippet I get a model that has a float layer somewhere in the middle. You can see this in Netron. @Tombana, do you perhaps have any idea why this might happen?
I'm not sure; this sounds like a bug in the converter, possibly in the TensorFlow converter itself. From what I see, the only difference between the two code snippets is:

```python
with lq.context.quantized_scope(True):
    weights = model_a.get_weights()
    model_a.set_weights(weights)
```

That shouldn't affect the outcome of the converter, though. Which version of larq-compute-engine are you using?
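(Aside, not part of the original reply: as far as I understand larq, quantized_scope only changes whether get_weights() returns the latent full-precision weights or their binarized values, roughly as in this sketch.)

```python
import larq as lq
import larq_zoo as lqz

model = lqz.literature.BinaryDenseNet28(
    input_shape=(32, 32, 3), weights=None, include_top=True, num_classes=10
)

latent = model.get_weights()  # latent full-precision weights (used for training)
with lq.context.quantized_scope(True):
    binary = model.get_weights()  # binarized values of the quantized kernels

# Writing the binarized values back overwrites the latent weights in place, but the
# values the converter sees at inference time are the same either way, which is
# consistent with the remark that this shouldn't affect the conversion result.
model.set_weights(binary)
```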
I tested converting after removing this block, but it does not work.
I think the problem occurs if we specify `weights=None`.
Interesting. Thank you for looking into it.
My colleague @lgeiger pointed me to a similar LCE issue: #421, which in turn points to an unresolved TensorFlow issue: tensorflow/tensorflow#40055. I believe your issue might be the same. Let's look at the model produced by the following code in Netron:

```python
x = tf.keras.layers.Conv2D(64, kernel_size=3)(x)
x = tf.keras.layers.MaxPool2D(3)(x)
x = tf.keras.layers.BatchNormalization()(x)
```

I see two tensors of the same size and dimensions: one for the bias of the Conv2D layer and one for a parameter of the BatchNormalization layer, both initialized to all zeros. Because they are identical, the converter apparently shares them as a single constant tensor, and that shared tensor is what trips up the quantization.

Until this is solved in TensorFlow (which might take a very long time, if ever), the work-around is to make sure this doesn't happen. There are several options. One option is to train the model for one step, since that will already change both tensors to non-zero values, and the chance that they are equal is then minimal. Or do a full training session, or load pre-trained weights. The alternative is to initialize your model such that this doesn't happen, e.g. as follows:

```python
x = tf.keras.layers.Conv2D(
    64, kernel_size=3, bias_initializer=tf.keras.initializers.Constant(0.1)
)(x)
x = tf.keras.layers.MaxPool2D(3)(x)
x = tf.keras.layers.BatchNormalization()(x)
```

So I propose to close this issue if @aqibsaeed agrees, and keep the other LCE issue open to track the bug in TensorFlow.
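(Aside, not part of the original reply: a minimal end-to-end sketch of the bias_initializer work-around; the toy architecture, the output file name, and the (-3, 3) range are illustrative assumptions, not code from this thread.)

```python
from pathlib import Path

import larq_compute_engine as lce
import tensorflow as tf

# Small illustrative model; the non-zero bias initializer ensures the Conv2D bias
# and the zero-initialized BatchNormalization parameters are not identical tensors.
inputs = tf.keras.Input(shape=(32, 32, 3))
x = tf.keras.layers.Conv2D(
    64, kernel_size=3, bias_initializer=tf.keras.initializers.Constant(0.1)
)(inputs)
x = tf.keras.layers.MaxPool2D(3)(x)
x = tf.keras.layers.BatchNormalization()(x)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
outputs = tf.keras.layers.Dense(10)(x)
model = tf.keras.Model(inputs, outputs)

tflite_model = lce.convert_keras_model(
    model,
    inference_input_type=tf.int8,
    inference_output_type=tf.int8,
    experimental_default_int8_range=(-3, 3),
)
Path("workaround_model.tflite").write_bytes(tflite_model)
```

Training for at least one step or loading pre-trained weights before converting achieves the same effect: the bias and batch-norm tensors stop being identical all-zero constants.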
Got it! I just double-checked: if I load model weights, conversion works fine. Thanks again for looking into this. Closing this now.