Connecting fp16xq4 gemm kernels (optimized for A100) to MatMulNBits<fp16> operator #21083

Open

chenfucn wants to merge 5 commits into microsoft:main from chenfucn:cfu_transform_prepack

sm version guard

Azure Pipelines / orttraining-ortmodule-distributed (DistributedInferenceTest Onnxruntime_Linux_GPU_Inference_Distributed_Test) succeeded Jul 10, 2024 in 53m 48s

0 errors / 1 warnings
