This repository has been archived by the owner on Aug 30, 2024. It is now read-only.
Enable runtime gpu_arch auto-select based on devices where kernels are executing for gemm_int4 tests; enable device-specific compilation using USE_XETLA (xe_lpg, xe_hpg, xe_hpc). #302
Type of Change
Change #1: For the gemm_int4 tests, select gpu_arch automatically at runtime based on the device on which the kernels execute.
Change #2: Enable device-specific compilation using USE_XETLA (xe_lpg, xe_hpg, xe_hpc) to fix the current problem of tests not matching the device type.
Description
Add a template template class that wraps <gpu_arch, mma_engine>, so that gpu_arch is selected automatically at runtime based on the device on which the kernels execute.
USE_XETLA options for compilation on different devices (xe_lpg, xe_hpg, xe_hpc).
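As an illustration of the first item, here is a minimal sketch of how a dispatcher with a template template parameter can map a runtime gpu_arch value onto compile-time <gpu_arch, mma_engine> instantiations. It is not the code from this PR. The enums are simplified stand-ins, and `gemm_int4_test`, `default_engine`, and `dispatch_by_arch` are hypothetical names. The choice of fpu for xe_lpg and xmx for the other architectures is an assumption made for the example. In real code the architecture would be queried from the device; here it is passed in as a plain argument.

```cpp
#include <stdexcept>
#include <string>

// Simplified stand-ins for the library's enums (assumption, not the real definitions).
enum class gpu_arch { xe_lpg, xe_hpg, xe_hpc };
enum class mma_engine { fpu, xmx };

// Hypothetical test body, parameterized at compile time on <gpu_arch, mma_engine>.
// It only reports which instantiation ran.
template <gpu_arch Arch, mma_engine Engine>
struct gemm_int4_test {
    static std::string run() {
        std::string arch = Arch == gpu_arch::xe_lpg ? "xe_lpg"
                : Arch == gpu_arch::xe_hpg          ? "xe_hpg"
                                                    : "xe_hpc";
        std::string engine = Engine == mma_engine::fpu ? "fpu" : "xmx";
        return arch + "/" + engine;
    }
};

// Assumption for this sketch: xe_lpg uses the fpu engine, the others use xmx.
template <gpu_arch Arch>
constexpr mma_engine default_engine() {
    return Arch == gpu_arch::xe_lpg ? mma_engine::fpu : mma_engine::xmx;
}

// The dispatcher accepts any class template with the shape <gpu_arch, mma_engine>
// and picks the instantiation that matches the architecture known only at runtime.
template <template <gpu_arch, mma_engine> class Test>
std::string dispatch_by_arch(gpu_arch arch) {
    switch (arch) {
        case gpu_arch::xe_lpg:
            return Test<gpu_arch::xe_lpg,
                    default_engine<gpu_arch::xe_lpg>()>::run();
        case gpu_arch::xe_hpg:
            return Test<gpu_arch::xe_hpg,
                    default_engine<gpu_arch::xe_hpg>()>::run();
        case gpu_arch::xe_hpc:
            return Test<gpu_arch::xe_hpc,
                    default_engine<gpu_arch::xe_hpc>()>::run();
    }
    throw std::invalid_argument("unknown gpu_arch");
}
```

A caller would write `dispatch_by_arch<gemm_int4_test>(arch)`. Every architecture listed in the switch is instantiated at compile time, which is one reason a build option that limits the compiled targets can be useful alongside runtime selection.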
Expected Behavior & Potential Risk
No foreseeable risk to CMake compilation or code execution.
How has this PR been tested?
Tested on MTL and DG2.
Dependency Change?
No libraries changed.