
FusedMHARunnerFP16v2 will cause onnxruntime coredump when multiple host threads run session.run() #22262

Closed
zwyao opened this issue Sep 29, 2024 · 3 comments


zwyao commented Sep 29, 2024

Describe the issue

In my BERT model, when I use head-size == 32, the attention CUDA kernel causes an ORT coredump; the error message says "CUDA illegal memory access was encountered".
I found the reason: FusedMHARunnerFP16v2 does not support concurrent execution.

To reproduce

attention_bug_fix.txt

This is my fix code.

Urgency

No response

Platform

Linux

OS Version

1.18.0

ONNX Runtime Installation

Built from Source

ONNX Runtime Version or Commit ID

1.18.0 master

ONNX Runtime API

C++

Architecture

X64

Execution Provider

CUDA

Execution Provider Library Version

No response

zwyao (Author) commented Sep 29, 2024

I find the bug is not fixed in the latest version, 1.19.1.

tianleiwu (Contributor) commented Sep 30, 2024

@zwyao,
Thread safety of the self-attention FusedMHARunnerFP16v2 was fixed in #21420. There was another fix for cross-attention.
The bug was resolved in the 1.19.0 release. Please try 1.19.2.

zwyao (Author) commented Sep 30, 2024

> @zwyao, Thread safety of the self-attention FusedMHARunnerFP16v2 was fixed in #21420. There was another fix for cross-attention. The bug was resolved in the 1.19.0 release. Please try 1.19.2.

emmm, thanks
