Connecting fp16xq4 gemm kernels (optimized for A100) to MatMulNBits<fp16> operator #21083
Azure Pipelines / orttraining-ortmodule-distributed (DistributedInferenceTest Onnxruntime_Linux_GPU_Inference_Distributed_Test)
Succeeded on Jul 10, 2024 in 53m 48s