
Output mismatch of torch.sin due to an extra torch.Tensor.transpose node with and without optimization #18228

Closed
Azyka opened this issue Nov 2, 2023 · 2 comments
Labels
stale: issues that have not been addressed in a while; categorized by a bot

Comments


Azyka commented Nov 2, 2023

Describe the issue

ONNX opset version: 14

Adding an extra torch.Tensor.transpose node as a second model output changes the value of the original torch.sin output, both with and without graph optimization, causing a mismatch between the outputs of the two models.

To reproduce

Input data:
input_data.zip

Sample code:

import onnxruntime as ort
import onnx
import numpy as np
import pickle
from numpy import testing
import torch

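# Model0: square the input, cast to float64, and take the sine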
class Model0(torch.nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, *args):
        _args = args
        getitem = _args[0]
        mul = torch.mul(getitem, getitem)
        to = mul.to(dtype = torch.float64)
        sin = torch.sin(to)
        return (sin)

model_0 = Model0()
output_names_0 = ['v5_0']
input_dict_0 = pickle.load(open('./0.pickle', 'rb'))
inputs_0 = tuple(torch.from_numpy(v).to('cpu') for _, v in input_dict_0.items())
torch.onnx.export(model_0, inputs_0, '0.onnx', verbose=False, input_names=['v0_0'], output_names=output_names_0, opset_version=14, do_constant_folding=False)

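# Model1: the same computation, but with an extra transposed output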
class Model1(torch.nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, *args):
        _args = args
        getitem = _args[0]
        mul = torch.mul(getitem, getitem)
        transpose = mul.transpose(1, 0)
        to = mul.to(dtype = torch.float64)
        sin = torch.sin(to)
        return (transpose, sin)

model_1 = Model1()
output_names_1 = ['v6_0', 'v8_0']
input_dict_1 = pickle.load(open('./1.pickle', 'rb'))
inputs_1 = tuple(torch.from_numpy(v).to('cpu') for _, v in input_dict_1.items())
torch.onnx.export(model_1, inputs_1, '1.onnx', verbose=False, input_names=['v0_0'], output_names=output_names_1, opset_version=14, do_constant_folding=False)

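# Run both exported models with all graph optimizations enabled and compare the sin outputs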
sess_options_0 = ort.SessionOptions()
sess_options_0.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_ALL
sess_0 = ort.InferenceSession('0.onnx', providers=['CPUExecutionProvider'], sess_options=sess_options_0)
sess_res_0 = sess_0.run(output_names_0, input_dict_0)
output_0 = dict(zip(output_names_0, sess_res_0))

sess_options_1 = ort.SessionOptions()
sess_options_1.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_ALL
sess_1 = ort.InferenceSession('1.onnx', providers=['CPUExecutionProvider'], sess_options=sess_options_1)
sess_res_1 = sess_1.run(output_names_1, input_dict_1)
output_1 = dict(zip(output_names_1, sess_res_1))
output_name_dict = {'v5_0': 'v8_0'}

print('=========================')
try:
    for tensor_name_0, tensor_name_1 in output_name_dict.items():
        testing.assert_allclose(output_0[tensor_name_0], output_1[tensor_name_1])
    print("onnxruntime_enable_opt does not trigger assertion")
except AssertionError as e:
    print("onnxruntime_enable_opt triggers assertion")
    print(e)
print('=========================')

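# Repeat the comparison with all graph optimizations disabled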
sess_options_0 = ort.SessionOptions()
sess_options_0.graph_optimization_level = ort.GraphOptimizationLevel.ORT_DISABLE_ALL
sess_0 = ort.InferenceSession('0.onnx', providers=['CPUExecutionProvider'], sess_options=sess_options_0)
sess_res_0 = sess_0.run(output_names_0, input_dict_0)
output_0 = dict(zip(output_names_0, sess_res_0))

sess_options_1 = ort.SessionOptions()
sess_options_1.graph_optimization_level = ort.GraphOptimizationLevel.ORT_DISABLE_ALL
sess_1 = ort.InferenceSession('1.onnx', providers=['CPUExecutionProvider'], sess_options=sess_options_1)
sess_res_1 = sess_1.run(output_names_1, input_dict_1)
output_1 = dict(zip(output_names_1, sess_res_1))

print('=========================')
try:
    for tensor_name_0, tensor_name_1 in output_name_dict.items():
        testing.assert_allclose(output_0[tensor_name_0], output_1[tensor_name_1])
    print("onnxruntime_disable_opt does not trigger assertion")
except AssertionError as e:
    print("onnxruntime_disable_opt triggers assertion")
    print(e)
print('=========================')
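
As an additional cross-check (a minimal sketch, assuming it is appended to the script above so that model_1, inputs_1, and output_1 are still in scope), the ORT result for 1.onnx can be compared against PyTorch eager execution on the same inputs:

with torch.no_grad():
    # model_1 returns (transpose, sin); the second output corresponds to 'v8_0'
    _, eager_sin = model_1(*inputs_1)
try:
    testing.assert_allclose(output_1['v8_0'], eager_sin.numpy())
    print("ORT output for 1.onnx matches PyTorch eager execution")
except AssertionError as e:
    print("ORT output for 1.onnx differs from PyTorch eager execution")
    print(e)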

Output:

=========================
onnxruntime_enable_opt triggers assertion

Not equal to tolerance rtol=1e-07, atol=0

Mismatched elements: 7377 / 7552 (97.7%)
Max absolute difference: 0.01560554
Max relative difference: 1.20246471
 x: array([[[ 0.992723, -0.995001, -0.76084 , ...,  0.997214,  0.506929,
          0.994631],
        [-0.94864 , -0.883285, -0.684442, ...,  0.535824,  0.388547,...
 y: array([[[ 0.994278, -0.99566 , -0.756076, ...,  0.997214,  0.494885,
          0.993765],
        [-0.950799, -0.886123, -0.685875, ...,  0.541583,  0.392368,...
=========================
=========================
onnxruntime_disable_opt triggers assertion

Not equal to tolerance rtol=1e-07, atol=0

Mismatched elements: 7377 / 7552 (97.7%)
Max absolute difference: 0.01560554
Max relative difference: 1.20246471
 x: array([[[ 0.992723, -0.995001, -0.76084 , ...,  0.997214,  0.506929,
          0.994631],
        [-0.94864 , -0.883285, -0.684442, ...,  0.535824,  0.388547,...
 y: array([[[ 0.994278, -0.99566 , -0.756076, ...,  0.997214,  0.494885,
          0.993765],
        [-0.950799, -0.886123, -0.685875, ...,  0.541583,  0.392368,...
=========================
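
To confirm that the only structural difference between the two exported graphs is the extra Transpose node feeding the second output, the node lists of both models can be printed (a minimal sketch using the onnx Python package, assuming 0.onnx and 1.onnx were exported as above):

import onnx

for path in ('0.onnx', '1.onnx'):
    model = onnx.load(path)
    # Print the operator sequence of each graph for a quick structural diff
    print(path, [node.op_type for node in model.graph.node])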

Urgency

This is incorrect functional behavior. It may cause severe bugs in systems built on top of ORT.

Platform

Linux

OS Version

Ubuntu 22.04.3 LTS (x86_64)

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

1.15.1

ONNX Runtime API

Python

Architecture

X64

Execution Provider

Default CPU

Execution Provider Library Version

No response


github-actions bot commented Dec 2, 2023

This issue has been automatically marked as stale due to inactivity and will be closed in 7 days if no further activity occurs. If further support is needed, please provide an update and/or more details.

github-actions bot added the stale label (issues that have not been addressed in a while; categorized by a bot) on Dec 2, 2023

github-actions bot commented Jan 2, 2024

This issue has been automatically closed due to inactivity. Please reactivate if further support is needed.

github-actions bot closed this as not planned (won't fix, can't repro, duplicate, stale) on Jan 2, 2024