CUDA Kernels on V100

A few CUDA kernels for the NVIDIA V100, mainly used to demonstrate optimization methods.

To keep dependencies minimal, use the Makefile to build all executables.

File structure

// Reduce operation
reduce/

// Scan operation
scan/

// Square matrix transpose
transpose/

// General matrix multiply C = A * B
sgemm/
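
For orientation, the operation `C = A * B` listed above can be sketched as an unoptimized baseline kernel, the kind of starting point that optimization steps are usually measured against. This is an illustrative sketch only, not the code in `sgemm/`: the names `sgemm_naive` and `sgemm_naive_max_error`, the row-major layout, the 16x16 block size, and the test sizes are all assumptions made for this example, and CUDA error checking is omitted for brevity.

```cuda
#include <algorithm>
#include <cmath>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Naive SGEMM (assumed row-major): A is M x K, B is K x N, C is M x N.
// Each thread computes one element of C.
__global__ void sgemm_naive(const float* A, const float* B, float* C,
                            int M, int N, int K) {
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < M && col < N) {
        float acc = 0.0f;
        for (int k = 0; k < K; ++k) {
            acc += A[row * K + k] * B[k * N + col];
        }
        C[row * N + col] = acc;
    }
}

// Hypothetical helper: runs the kernel on small synthetic inputs and returns
// the maximum absolute difference from a plain CPU triple loop.
float sgemm_naive_max_error(int M, int N, int K) {
    std::vector<float> hA(M * K), hB(K * N), hC(M * N, 0.0f);
    for (int i = 0; i < M * K; ++i) hA[i] = (i % 7) * 0.5f;
    for (int i = 0; i < K * N; ++i) hB[i] = (i % 5) * 0.25f;

    float *dA = nullptr, *dB = nullptr, *dC = nullptr;
    cudaMalloc(&dA, hA.size() * sizeof(float));
    cudaMalloc(&dB, hB.size() * sizeof(float));
    cudaMalloc(&dC, hC.size() * sizeof(float));
    cudaMemcpy(dA, hA.data(), hA.size() * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(dB, hB.data(), hB.size() * sizeof(float), cudaMemcpyHostToDevice);

    // One thread per output element; round the grid up to cover all of C.
    dim3 block(16, 16);
    dim3 grid((N + block.x - 1) / block.x, (M + block.y - 1) / block.y);
    sgemm_naive<<<grid, block>>>(dA, dB, dC, M, N, K);

    // The blocking copy back also waits for the kernel to finish.
    cudaMemcpy(hC.data(), dC, hC.size() * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(dA);
    cudaFree(dB);
    cudaFree(dC);

    // CPU reference for comparison.
    float max_err = 0.0f;
    for (int i = 0; i < M; ++i) {
        for (int j = 0; j < N; ++j) {
            float ref = 0.0f;
            for (int k = 0; k < K; ++k) ref += hA[i * K + k] * hB[k * N + j];
            max_err = std::max(max_err, std::fabs(ref - hC[i * N + j]));
        }
    }
    return max_err;
}

int main() {
    std::printf("max abs error: %g\n", sgemm_naive_max_error(64, 48, 32));
    return 0;
}
```

If saved as a standalone file (say, a hypothetical `naive_sgemm.cu`), this should compile with something like `nvcc -arch=sm_70 naive_sgemm.cu`, since the V100 has compute capability 7.0. The repository's own Makefile may use different flags.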