Connecting fp16xq4 gemm kernels (optimized for A100) to MatMulNBits<fp16> operator #21083
Azure Pipelines / Big Models (Llama2_7B_ONNX Llama2_7B_ONNX)
succeeded
Jul 10, 2024 in 1h 19m 1s