DMMHA: add unit tests; fix CPU, CUDA kernel (microsoft#22567)
### Description

Fixes:

1. CPU kernel: apply scale before bias and mask, like other MHA ops.
2. CPU kernel: correct the offset when appending past to present.
3. CUDA kernel: apply mask if provided; fix output_qk offset.

Add DMMHA unit tests.
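To illustrate the ordering in the first fix, here is a minimal pure-Python sketch for a single query vector. It is not the kernel code from this commit: the function name `attention_scores`, its parameters, and the additive `mask_filter_value` of -10000.0 are all assumptions made for illustration.

```python
def attention_scores(q, keys, scale, bias=None, mask=None,
                     mask_filter_value=-10000.0):
    """Hypothetical helper: pre-softmax scores for one query vector.

    Order of operations: scale the raw dot product first, then add the
    bias, then apply the mask. The additive mask value is an assumption.
    """
    scores = []
    for j, k in enumerate(keys):
        s = sum(qi * ki for qi, ki in zip(q, k))  # raw dot product q . k_j
        s *= scale                                # 1) scale first
        if bias is not None:
            s += bias[j]                          # 2) bias is added unscaled
        if mask is not None and not mask[j]:
            s += mask_filter_value                # 3) mask is applied last
        scores.append(s)
    return scores


# Example inputs (made up for illustration)
q = [1.0, 2.0]
keys = [[1.0, 0.0], [0.0, 1.0]]
scores = attention_scores(q, keys, scale=0.5, bias=[1.0, 0.0])
masked = attention_scores(q, keys, scale=0.5, bias=[1.0, 0.0],
                          mask=[True, False])
```

With these inputs the first score is 0.5 * 1.0 + 1.0 = 1.5. Adding the bias before scaling would instead give (1.0 + 1.0) * 0.5 = 1.0, which is why the order matters.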