Example of binding a TF32 CUTLASS GEMM kernel to PyTorch

bertmaher/tf32_gemm

Usage

python setup.py develop
python test.py
python benchmark.py

Optionally, prefix either command with denoise-gpu.sh (for example, denoise-gpu.sh python test.py, or the same with benchmark.py) for less noisy, but slower, results.
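The contents of test.py are not shown here. As a rough, CPU-only illustration of why a TF32 GEMM result is usually compared against an FP32 reference with a tolerance rather than for exact equality, the sketch below emulates TF32 inputs by zeroing the low 13 mantissa bits of each float32 value (TF32 keeps 10 explicit mantissa bits). The names to_tf32 and gemm are hypothetical and are not part of this repository; truncation is one simple model of the conversion, and a real kernel may round instead. Products are accumulated in Python floats here, not in FP32 as on the GPU.

```python
import struct


def to_tf32(x: float) -> float:
    """Emulate TF32 storage (assumption: truncation, not rounding).

    The value is first converted to float32, then the low 13 of its
    23 mantissa bits are cleared, leaving the 10 bits that TF32 keeps.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits &= 0xFFFFE000  # keep sign, 8 exponent bits, top 10 mantissa bits
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def gemm(a, b, convert=lambda v: v):
    """Naive matrix multiply on nested lists.

    `convert` is applied to every input element before multiplying;
    accumulation uses Python floats (double precision).
    """
    rows, inner, cols = len(a), len(b), len(b[0])
    return [
        [
            sum(convert(a[i][p]) * convert(b[p][j]) for p in range(inner))
            for j in range(cols)
        ]
        for i in range(rows)
    ]


def max_rel_error(ref, out):
    """Largest element-wise relative error of `out` against `ref`."""
    return max(
        abs(r - o) / abs(r)
        for ref_row, out_row in zip(ref, out)
        for r, o in zip(ref_row, out_row)
    )


if __name__ == "__main__":
    a = [[1.0003, 2.0001], [3.0007, 4.0009]]
    b = [[5.0011, 6.0013], [7.0017, 8.0019]]
    reference = gemm(a, b)              # inputs used as given
    emulated = gemm(a, b, to_tf32)      # inputs truncated to TF32 precision
    print("max relative error:", max_rel_error(reference, emulated))
```

With positive inputs, each truncated factor is off by less than about 2^-10 in relative terms, so each product, and therefore each output element, is off by less than about 2^-9. A tolerance of that order is the kind of bound a comparison against an FP32 reference would need.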
