Adding cuda kernel (optimized for sm80) for block-wise 4b quantized float 16 GEMM. · microsoft/onnxruntime@73679d3

misspell: onnxruntime/test/cuda_host/blkq4_fp16_quant_sm80.h#L102

[misspell] reported by reviewdog 🐶 "seperate" is a misspelling of "separate" Raw Output: ./onnxruntime/test/cuda_host/blkq4_fp16_quant_sm80.h:102:17: "seperate" is a misspelling of "separate"

misspell: onnxruntime/test/cuda_host/blkq4_fp16_quant_sm80.h#L171

[misspell] reported by reviewdog 🐶 "seperate" is a misspelling of "separate" Raw Output: ./onnxruntime/test/cuda_host/blkq4_fp16_quant_sm80.h:171:15: "seperate" is a misspelling of "separate"

misspell: onnxruntime/test/cuda_host/blkq4_fp16_quant_sm80.h#L238

[misspell] reported by reviewdog 🐶 "seperate" is a misspelling of "separate" Raw Output: ./onnxruntime/test/cuda_host/blkq4_fp16_quant_sm80.h:238:15: "seperate" is a misspelling of "separate"

misspell: onnxruntime/core/mickey/cutlass_ext/q4gemm/warp/quantb_meta_mma_tensor_op_tile_iterator.h#L150

[misspell] reported by reviewdog 🐶 "fragement" is a misspelling of "fragment" Raw Output: ./onnxruntime/core/mickey/cutlass_ext/q4gemm/warp/quantb_meta_mma_tensor_op_tile_iterator.h:150:30: "fragement" is a misspelling of "fragment"

misspell: onnxruntime/core/mickey/cutlass_ext/q4gemm/warp/quantb_meta_mma_tensor_op_tile_iterator.h#L178

[misspell] reported by reviewdog 🐶 "dimention" is a misspelling of "dimension" Raw Output: ./onnxruntime/core/mickey/cutlass_ext/q4gemm/warp/quantb_meta_mma_tensor_op_tile_iterator.h:178:18: "dimention" is a misspelling of "dimension"

misspell: onnxruntime/core/mickey/cutlass_ext/q4gemm/warp/quantb_meta_mma_tensor_op_tile_iterator.h#L184

[misspell] reported by reviewdog 🐶 "fragement" is a misspelling of "fragment" Raw Output: ./onnxruntime/core/mickey/cutlass_ext/q4gemm/warp/quantb_meta_mma_tensor_op_tile_iterator.h:184:11: "fragement" is a misspelling of "fragment"


Adding cuda kernel (optimized for sm80) for block-wise 4b quantized float 16 GEMM. #25753

Optional Lint

