The ONNX Runtime is missing support for certain data types in some operations. This is currently handled in NDOnnx by casting the operand to a supported type before and casting back after the operation.
The problem is that this generates an inefficient ONNX graph without the user being aware.
Example
Here's an example using uint64, for which ONNX Runtime doesn't implement Add.
In the "Implicit cast" case, I let NDOnnx do the conversion for each operation.
In the "Explicit cast" case, I cast to int64 once at the beginning of the whole computation and cast back once at the end.
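The original example code did not survive in this copy of the issue. As a rough stand-in, here is a toy model of the two cases that counts Cast nodes without using ndonnx or ONNX at all. The `Graph` class and both helper functions are hypothetical, and the sketch assumes the per-operation workaround casts both operands and the result of every Add.

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    """Toy stand-in for an ONNX graph: an ordered list of op names."""
    nodes: list = field(default_factory=list)

    def emit(self, op: str) -> None:
        self.nodes.append(op)

    def count(self, op: str) -> int:
        return self.nodes.count(op)


def implicit_add_chain(graph: Graph, n_adds: int) -> None:
    """Assumed per-operation workaround: every uint64 Add gets its own casts."""
    for _ in range(n_adds):
        graph.emit("Cast")  # left operand: uint64 -> int64
        graph.emit("Cast")  # right operand: uint64 -> int64
        graph.emit("Add")
        graph.emit("Cast")  # result: int64 -> uint64


def explicit_add_chain(graph: Graph, n_adds: int) -> None:
    """Cast the two inputs once, do all Adds in int64, cast back once."""
    graph.emit("Cast")  # first input: uint64 -> int64
    graph.emit("Cast")  # second input: uint64 -> int64
    for _ in range(n_adds):
        graph.emit("Add")
    graph.emit("Cast")  # final result: int64 -> uint64


implicit, explicit = Graph(), Graph()
implicit_add_chain(implicit, n_adds=10)
explicit_add_chain(explicit, n_adds=10)
print(implicit.count("Cast"), explicit.count("Cast"))  # 30 3
```

Under these assumptions the number of Cast nodes grows linearly with the number of operations in the implicit case and stays constant in the explicit case.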
Suggested changes
Besides the current silent cast, there are at least three other ways to handle these casts:
Cast with Warn - raise a warning each time we do this casting
Error - don't do any casting and just fail if there is no ONNX Runtime implementation for the given dtype
No Cast - proceed without casting, producing a graph that is not runnable by ONNX Runtime
I suggest "Cast with Warn" as the default mode. This lets the user know that we are producing an inefficient graph, and they can choose to use a different dtype instead, or just ignore the warning. Getting these warnings can also serve as a signal to upstream an implementation to ONNX Runtime.
I also suggest having the option to switch to the other modes via a flag.
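A minimal sketch of how such a mode flag could behave, using only the Python standard library. `CastMode`, `UNSUPPORTED`, and `plan_op` are hypothetical names invented for illustration; none of them is an existing ndonnx API, and the set of unsupported pairs contains only the uint64 Add case mentioned above.

```python
import enum
import warnings


class CastMode(enum.Enum):
    CAST_WITH_WARN = "cast_with_warn"  # proposed default
    ERROR = "error"
    NO_CAST = "no_cast"


# Hypothetical registry of (op, dtype) pairs without an ONNX Runtime kernel.
UNSUPPORTED = {("Add", "uint64")}


def plan_op(op, dtype, mode=CastMode.CAST_WITH_WARN, fallback="int64"):
    """Return the node names that would be emitted for `op` on `dtype`."""
    if (op, dtype) not in UNSUPPORTED:
        return [op]
    if mode is CastMode.ERROR:
        raise TypeError(f"ONNX Runtime has no {op} implementation for {dtype}")
    if mode is CastMode.NO_CAST:
        # Emitted as-is; the graph may not be runnable by ONNX Runtime.
        return [op]
    warnings.warn(
        f"{op} on {dtype} is computed via casts to {fallback}; "
        "the generated graph may be inefficient",
        UserWarning,
        stacklevel=2,
    )
    return [f"Cast({dtype}->{fallback})", op, f"Cast({fallback}->{dtype})"]
```

With the default mode, `plan_op("Add", "uint64")` emits a `UserWarning` and returns three nodes, while `CastMode.ERROR` raises and `CastMode.NO_CAST` returns the bare `Add`.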
I agree with the general direction of this, which is to give the user greater control over the generated graph. Warnings would certainly let users know when casting is occurring, but they can do nothing about it other than step out of ndonnx and into Spox. This issue seems tightly connected to #42, and I think a better solution might be to allow users to specify the minimum version of ORT that the generated graph must be executable on, and then apply only the casts needed for that version. This way a) we give the user some control over the compatibility characteristics they want out of the generated graph and b) we don't end up needing to make executive decisions about when it is reasonable to drop certain dtype workarounds (things are versioned all the way through).
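To make the version-based idea concrete, here is a small sketch of the decision it implies. The function `needs_cast` and the `supported_since` table are hypothetical, and the version numbers in the table are placeholders for illustration only, not actual ONNX Runtime release data.

```python
def needs_cast(op, dtype, target_ort, supported_since):
    """Return True if `op` on `dtype` needs a cast workaround for `target_ort`.

    `supported_since` maps (op, dtype) to the first ORT version, as a tuple,
    assumed to ship a native kernel. A missing entry is treated as
    "not supported by any version", so the cast is kept.
    """
    first = supported_since.get((op, dtype))
    return first is None or target_ort < first


# Placeholder table: these versions are made up for the example.
supported_since = {
    ("Add", "int64"): (1, 0),
    ("Add", "uint64"): (99, 0),
}

print(needs_cast("Add", "uint64", (1, 17), supported_since))  # True
print(needs_cast("Add", "int64", (1, 17), supported_since))   # False
```

Because the table is keyed by version, a workaround disappears automatically once the user's minimum ORT version reaches the entry, with no separate decision about when to drop it.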