========================= Release Notes v0.3.0 =========================
API:
- H2 added a thin GPU portability layer that supports CUDA and ROCm
(HIP) programming environments.
- Added a spdlog-based logging capability.
- Added initial support for (sequential) tensors.
Performance optimizations:
Internal features:
- MIOpen is supported as a backend for "DNN compute kernels" in
DistConv.
- DistConv is supported on ROCm systems.
- Added support for Gitlab CI.
- Added support for DaCe convolution kernels in DistConv.
Build system:
- Several updates to support ROCm 5.*.
- Requires CMake >= 3.21.
- Requires ROCm 5 when targeting ROCm-based platforms; the latest
available minor release is STRONGLY recommended. The packages
currently required from this stack are:
-- HIP (host)
-- hipCUB
-- ROCm-SMI
-- MIOpen
-- ROC-tracer
Bug fixes:
- Fixed a bug in kernel selection for batchnorm gradients in the case of
overlap/strides.
Retired features: