Skip to content

DeepWok/mase-cuda

Repository files navigation

MASE CUDA

This is the cuda extension for DeepWok/MASE.

Note

This project is still under development.

Note

Beginner Guide notes down my learning process and setup.

Conda Env Setup

  • C++17 (GCC < 14)
  • CUDA 12.4/12.5/12.6
  • CMake >= 3.20
  • Python >= 3.11
    • Tox for Python package building and testing.
    • Torch >= 2.3.0 (pip install torch in conda env) which includes LibTorch for wrapping CUDA kernels.
  • Justfile

C++/CUDA

Build Tests

  • Build Tests

    just build-cu-test
  • Build Profiling for NSight Compute

    just build-cu-profile

Build Specific Target

  • Build test_mxint8_dequantize1d for debug and launch cuda-gdb for debugging

    just --set CU_BUILD_TARGETS test_mxint8_dequantize1d build-cu-test-debug
    
    cuda-gdb --args ./build/test/cu/mxint/dequantize/test_mxint8_dequantize1d 25600 256

Python

tox sets up the python environment automatically.

  • Build mase_cuda package and run python tests

    tox # this is slow since cpu & gpu performance profiling is enabled
    • Run quick test

      just test-py-fast
    • The package is built in dist/ directory.

  • Create env for dev

    tox -e dev
  • Build mase_cuda package

    just build-py

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published