Fix REDUCE_PROD #32

Open
tomsmeding opened this issue Jun 12, 2023 · 2 comments

Just creating this issue to document ongoing investigation.

DIM1
        product: FAIL (1.52s)
            ✗ <interactive> failed at test/nofib/Data/Array/Accelerate/Test/NoFib/Prelude/Fold.hs:84:3
              after 1 test.
              shrink path: 1:

              ━━━ Exception (ErrorCall) ━━━
              edgetpu_compiler --show_operations --out_dir=/tmp/dpvanbalen/ /tmp/dpvanbalen/model3866256-1.tflite (exit 1)
              ERROR: Didn't find op for builtin opcode 'REDUCE_PROD' version '2'. An older version of this builtin might be supported. Are you using an old TFLite binary with a newer model?

              ERROR: Registration failed.

              Invalid model: /tmp/dpvanbalen/model3866256-1.tflite
              Model could not be parsed

Since TF 2.6 (via TF PR #49735), TFLite generates REDUCE_PROD version 2 for models with int8/int16 inputs to the op. The closed-source edgetpu_compiler (which we suspect is based on TF < 2.6) apparently doesn't support REDUCE_PROD version 2. With the following diff on TF 2.10.1, the product test does seem to work:

diff --git a/tensorflow/lite/kernels/register.cc b/tensorflow/lite/kernels/register.cc
index 83da3d319cd..8146888b4eb 100644
--- a/tensorflow/lite/kernels/register.cc
+++ b/tensorflow/lite/kernels/register.cc
@@ -224,7 +224,7 @@ BuiltinOpResolver::BuiltinOpResolver() {
              /* max_version = */ 2);
   AddBuiltin(BuiltinOperator_REDUCE_PROD, Register_REDUCE_PROD(),
              /* min_version = */ 1,
-             /* max_version = */ 2);
+             /* max_version = */ 1);
   AddBuiltin(BuiltinOperator_REDUCE_MAX, Register_REDUCE_MAX(),
              /* min_version = */ 1,
              /* max_version = */ 3);
diff --git a/tensorflow/lite/kernels/register_ref.cc b/tensorflow/lite/kernels/register_ref.cc
index 8e6329b7caf..22089702d73 100644
--- a/tensorflow/lite/kernels/register_ref.cc
+++ b/tensorflow/lite/kernels/register_ref.cc
@@ -401,7 +401,7 @@ BuiltinRefOpResolver::BuiltinRefOpResolver() {
              /* max_version = */ 2);
   AddBuiltin(BuiltinOperator_REDUCE_PROD, Register_REDUCE_PROD_REF(),
              /* min_version = */ 1,
-             /* max_version = */ 2);
+             /* max_version = */ 1);
   AddBuiltin(BuiltinOperator_REDUCE_MAX, Register_REDUCE_MAX_REF(),
              /* min_version = */ 1,
              /* max_version = */ 3);
diff --git a/tensorflow/lite/tools/optimize/operator_property.cc b/tensorflow/lite/tools/optimize/operator_property.cc
index e9b960536da..09a19a167e1 100644
--- a/tensorflow/lite/tools/optimize/operator_property.cc
+++ b/tensorflow/lite/tools/optimize/operator_property.cc
@@ -1096,7 +1096,7 @@ OperatorProperty GetOperatorProperty(OpVariant op_variant) {
     case BuiltinOperator_REDUCE_PROD:
       property.inputs = {{0, {}}};
       property.outputs = {{0, {}}};
-      property.version = 2;
+      property.version = 1;
       break;
     case BuiltinOperator_REDUCE_MAX:
     case BuiltinOperator_REDUCE_MIN:
diff --git a/tensorflow/lite/tools/versioning/op_version.cc b/tensorflow/lite/tools/versioning/op_version.cc
index b11a13354f8..87f40f150e2 100644
--- a/tensorflow/lite/tools/versioning/op_version.cc
+++ b/tensorflow/lite/tools/versioning/op_version.cc
@@ -805,10 +805,10 @@ int GetBuiltinOperatorVersion(const OpSignature& op_sig) {
       return 1;
 
     case BuiltinOperator_REDUCE_PROD:
-      if (op_sig.inputs.at(0).type == kTfLiteInt8 ||
-          op_sig.inputs.at(0).type == kTfLiteInt16) {
-        return 2;
-      }
+      // if (op_sig.inputs.at(0).type == kTfLiteInt8 ||
+      //     op_sig.inputs.at(0).type == kTfLiteInt16) {
+      //   return 2;
+      // }
       return 1;
 
     // The version one of broadcast to op won't be not supported since the
diff --git a/tensorflow/lite/tools/versioning/runtime_version.cc b/tensorflow/lite/tools/versioning/runtime_version.cc
index d5745c8d7c2..ce9c14b16b5 100644
--- a/tensorflow/lite/tools/versioning/runtime_version.cc
+++ b/tensorflow/lite/tools/versioning/runtime_version.cc
@@ -203,7 +203,7 @@ std::string FindMinimumRuntimeVersionForOp(tflite::BuiltinOperator op_code,
            {{BuiltinOperator_REDUCE_MIN, 2}, "1.14.0"},
            {{BuiltinOperator_REDUCE_MIN, 3}, "2.5.0"},
            {{BuiltinOperator_REDUCE_PROD, 1}, "1.11.0"},
-           {{BuiltinOperator_REDUCE_PROD, 2}, "2.6.0"},
+           // {{BuiltinOperator_REDUCE_PROD, 2}, "2.6.0"},
            {{BuiltinOperator_REDUCE_ANY, 1}, "1.11.0"},
            {{BuiltinOperator_RELU6, 1}, "1.5.0"},
            {{BuiltinOperator_RELU6, 2}, "1.14.0"},

but just applying that patch without understanding the implications seems like a very bad idea.
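For reference, the version-selection logic that the patch disables can be sketched as follows (a simplified Python model of the gating in op_version.cc, not the actual TF C++ code):

```python
# Simplified model of GetBuiltinOperatorVersion for REDUCE_PROD (TF >= 2.6):
# int8/int16 inputs select op version 2, everything else stays at version 1.
# The patch above forces version 1 unconditionally.
def reduce_prod_version(input_type: str) -> int:
    if input_type in ("int8", "int16"):  # the branch the patch comments out
        return 2
    return 1

print(reduce_prod_version("int8"))     # version 2: rejected by edgetpu_compiler
print(reduce_prod_version("float32"))  # version 1: accepted
```

Since Edge TPU models are fully int8-quantized, the int8 branch presumably always fires, which would explain why every product reduction hits the unsupported version 2.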

tomsmeding commented Jun 13, 2023
Comparing https://www.tensorflow.org/versions/r2.3/api_docs/cc and https://www.tensorflow.org/versions/r2.4/api_docs/cc, we see that TF 2.4 introduced tensorflow::ops::QuantizeAndDequantizeV4.
The output of strings edgetpu_compiler includes QuantizeAndDequantizeV4, so TF >= 2.4.

Comparing https://www.tensorflow.org/versions/r2.5/api_docs/cc and https://www.tensorflow.org/versions/r2.6/api_docs/cc, we see that TF 2.6 introduced tensorflow::ops::BatchMatMulV3.
The output of strings edgetpu_compiler includes tf.BatchMatMulV3, so TF >= 2.6.
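The fingerprinting step above can be sketched as follows (simulating the compiler binary with a throwaway file; in practice the target is the real edgetpu_compiler binary):

```shell
# Simulate a binary with one version-indicative symbol embedded in it.
printf 'junk\nQuantizeAndDequantizeV4\nmore junk\n' > /tmp/fake_compiler

# strings extracts printable runs from the binary; grep for a symbol that
# first appeared in TF 2.4. A hit bounds the embedded TF version from below.
if strings /tmp/fake_compiler | grep -q 'QuantizeAndDequantizeV4'; then
  echo 'symbol found: TF >= 2.4'
else
  echo 'symbol absent: TF < 2.4'
fi
```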


On the other hand:
In v2.6.0 in tensorflow/core/kernels/linalg/tridiagonal_solve_op.cc we have:

  explicit TridiagonalSolveOp(OpKernelConstruction* context) : Base(context) {
    OP_REQUIRES_OK(context, context->GetAttr("partial_pivoting", &pivoting_));
    perturb_singular_ = false;
    if (context->HasAttr("perturb_singular")) {
      OP_REQUIRES_OK(context,
                     context->GetAttr("perturb_singular", &perturb_singular_));
    }
    OP_REQUIRES(context, pivoting_ || !perturb_singular_,
                errors::InvalidArgument("Setting perturb_singular requires "
                                        "also setting partial_pivoting."));
  }

whereas in v2.5.0 we have:

  explicit TridiagonalSolveOp(OpKernelConstruction* context) : Base(context) {
    OP_REQUIRES_OK(context, context->GetAttr("partial_pivoting", &pivoting_));
  }

and the output of strings edgetpu_compiler includes partial_pivoting but not perturb_singular. Hence TF < 2.6.

Furthermore, the output of strings edgetpu_compiler includes

    'tf.EnqueueTPUEmbeddingSparseTensorBatch' op attribute 'num_features' failed to satisfy constraint: 64-bit integer array attribute

and

    git show v2.5.0:tensorflow/core/ops/tpu_embedding_ops.cc | grep -A 20 'EnqueueTPUEmbeddingSparseTensorBatch'

includes num_features, whereas

    git show v2.4.0:tensorflow/core/ops/tpu_embedding_ops.cc | grep -A 20 'EnqueueTPUEmbeddingSparseTensorBatch'

does not.

Hence TF > 2.4.
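The git-archaeology step above can be reproduced in miniature with a throwaway repository (a sketch of the technique only; the real commands target tags in the TF tree, as shown above):

```shell
set -e
dir=$(mktemp -d)
cd "$dir"
git init -q
git config user.email you@example.com
git config user.name you

# Commit a file and tag the commit, so we can read the file back at that
# tag without checking anything out.
echo '.Attr("num_features: list(int)")' > tpu_embedding_ops.cc
git add tpu_embedding_ops.cc
git commit -qm 'add op'
git tag v2.5.0

# `git show <rev>:<path>` prints the file as it existed at that revision.
git show v2.5.0:tpu_embedding_ops.cc | grep -c 'num_features'   # prints 1
```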

In conclusion: two contradictory signals. But given that REDUCE_PROD switched to version 2 in TF 2.6 and edgetpu_compiler doesn't understand version 2, it seems likely that TF < 2.6 indeed holds.

tomsmeding commented
I contacted Coral about this and they responded as follows:

Hello Tom,

Thanks for reaching out!

Edgetpu_compiler internally uses Tensorflow 2.5.0.

Unfortunately, there are no planned updates to the compiler in the near future.

Thanks,
Coral Support Team

So it seems that we will continue using the patch to TF 2.10.1. @mikesperber

tomsmeding added a commit to tomsmeding/tensorflow that referenced this issue Jun 30, 2023