Adding cuda kernel (optimized for sm80) for block-wise 4b quantized float 16 GEMM. #18619
Azure Pipelines / Android CI Pipeline (BUILD_NNAPI_STAGE Build_NNAPI_EP)
succeeded
Feb 29, 2024 in 21m 39s