Adding CUDA kernel (optimized for sm80) for block-wise 4-bit quantized float16 GEMM. #18619
Azure Pipelines / orttraining-amd-gpu-ci-pipeline (Linux_Build_ubuntu)
Succeeded on Feb 29, 2024 in 20m 58s
Linux_Build_ubuntu succeeded