Adding cuda kernel (optimized for sm80) for block-wise 4b quantized float 16 GEMM. #24358
Triggered via pull request: February 28, 2024 23:40
Status: Success
Total duration: 1h 12m 47s
Artifacts: none
windows.yml (on: pull_request)
Windows-CUDA-12: 53m 14s
Onnxruntime-TVM: 1h 12m