ONNX Runtime 1.18.1 CUDA 12.4 cuDNN 9.2 breaks inference with repeated inputs when enable_mem_reuse is enabled #21349
Labels
- `api`: issues related to all other APIs: C, C++, Python, etc.
- `ep:CUDA`: issues related to the CUDA execution provider
- `model:transformer`: issues related to a transformer model: BERT, GPT2, Hugging Face, Longformer, T5, etc.
- `platform:windows`: issues related to the Windows platform
- `stale`: issues that have not been addressed in a while; categorized by a bot
Describe the issue
When running inference on a model with the same input data two or more times in the same session (so that enable_mem_reuse takes effect), the following error is raised:
onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Reshape node. Name:'/Reshape_1' Status Message: D:\a\_work\1\s\onnxruntime\core/providers/cpu/tensor/reshape_helper.h:30 onnxruntime::ReshapeHelper::ReshapeHelper i < input_shape.NumDimensions() was false. The dimension with value zero exceeds the dimension size of the input tensor.
The error does not happen if enable_mem_reuse is disabled.
To reproduce
Download the following example ONNX model (the issue also occurs with other models) from the Piper repo: https://huggingface.co/rhasspy/piper-voices/resolve/main/ka/ka_GE/natia/medium/ka_GE-natia-medium.onnx
Then execute the following code to raise the error:
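The original snippet was not captured here. A minimal sketch along these lines should reproduce the error; the input names (`input`, `input_lengths`, `scales`) follow the usual Piper/VITS convention and are assumptions, not confirmed from the report:

```python
import os

import numpy as np

MODEL_PATH = "ka_GE-natia-medium.onnx"  # downloaded from the URL above

# Dummy phoneme-id sequence; the specific ids should not matter.
phoneme_ids = np.array([[1, 20, 30, 40, 2]], dtype=np.int64)
feeds = {
    "input": phoneme_ids,
    "input_lengths": np.array([phoneme_ids.shape[1]], dtype=np.int64),
    "scales": np.array([0.667, 1.0, 0.8], dtype=np.float32),
}

if os.path.exists(MODEL_PATH):
    import onnxruntime as ort

    so = ort.SessionOptions()
    so.enable_mem_reuse = True  # the default; setting it False avoids the error

    sess = ort.InferenceSession(MODEL_PATH, so,
                                providers=["CPUExecutionProvider"])

    # The first run succeeds; a second run with the *same* feeds in the same
    # session raises the Reshape RUNTIME_EXCEPTION quoted above.
    for _ in range(2):
        sess.run(None, feeds)
```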
Urgency
No response
Platform
Windows
OS Version
10
ONNX Runtime Installation
Released Package
ONNX Runtime Version or Commit ID
1.18.1
ONNX Runtime API
Python
Architecture
X64
Execution Provider
Default CPU, CUDA
Execution Provider Library Version
CUDA 12.4, cuDNN 9.2