Adding cuda kernel (optimized for sm80) for block-wise 4b quantized float 16 GEMM. #21827

Analyze (python)

succeeded Feb 28, 2024 in 4m 47s