Packed QKV and Rotary Embedding Support for sm<80 GQA #20012
Merged
Azure Pipelines / orttraining-ortmodule-distributed (DistributedInferenceTest Onnxruntime_Linux_GPU_Inference_Distributed_Test)
succeeded
Mar 23, 2024 in 32m 40s