Problem Description
triton-issue-604.tar.gz
PYTORCH_NO_HIP_MEMORY_CACHING=1 HSA_SVM_GUARD_PAGES=1 HSA_DISABLE_FRAGMENT_ALLOCATOR=1 AMD_SERIALIZE_KERNEL=3 python triton-issue-604/test_backward.py
This triggers a segfault on Triton commit 00e09cf3008b86978f25f838659698e4a0bf6f45.
If the tl.device_print calls between lines 154 and 157 of fwd_kernel.py are uncommented, running the code produces weird output.
For the performance kernel, off_h_k should be identical to the second index of PID, but the printed value is increased by one. No code in the Triton kernel performs such a change. Neither branch of MQA/GQA should increase off_h_k from 7 to 8.
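To make the expectation concrete, here is a minimal host-side sketch of the head mapping this report assumes. It is not taken from fwd_kernel.py: the helper name expected_off_h_k, the parameters num_head_q and num_head_k, and the integer-division grouping rule are all assumptions based on how MQA/GQA is commonly implemented. Under those assumptions the K/V head index can equal the query-head index or be smaller, but never larger.

```python
def expected_off_h_k(off_h_q: int, num_head_q: int, num_head_k: int) -> int:
    """Hypothetical reference mapping from a query-head index to its K/V head.

    Assumes the usual MQA/GQA grouping: consecutive query heads share one
    K/V head. This is an illustration, not the code in fwd_kernel.py.
    """
    if num_head_k <= 0 or num_head_q % num_head_k != 0:
        raise ValueError("num_head_q must be a positive multiple of num_head_k")
    if not 0 <= off_h_q < num_head_q:
        raise ValueError("off_h_q is out of range")
    group_size = num_head_q // num_head_k  # 1 when there is no head sharing
    return off_h_q // group_size


def check_mapping(num_head_q: int, num_head_k: int) -> bool:
    """Return True if every query head maps to an in-range K/V head
    whose index does not exceed the query-head index."""
    for off_h_q in range(num_head_q):
        off_h_k = expected_off_h_k(off_h_q, num_head_q, num_head_k)
        if off_h_k > off_h_q or off_h_k >= num_head_k:
            return False
    return True
```

A value of off_h_k that is one larger than the PID-derived head index therefore falls outside anything this mapping can produce, which is why the printed value looks wrong.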
It's either a compiler problem, or tl.device_print is broken.
Operating System
Ubuntu 20.04.6 LTS
CPU
AMD EPYC 7542
GPU
AMD Instinct MI210
ROCm Version
ROCm 6.1.0
ROCm Component
No response
Steps to Reproduce
See Description.
(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support
No response
Additional Information
No response