Releases: bmerry/clogs
1.5.2
release-1.5.1
- Workaround for NVIDIA driver bug that prevents scan from autotuning on Pascal hardware
- Fix crash for systems with multiple NVIDIA GPUs
release-1.5.0
- Make tuning work on-the-fly instead of requiring up-front tuning
- Make the ABI robust against changes to OpenCL C++ bindings
- Add method overloads that take OpenCL C API handles
- Add free callback to setEventCallback functions
- Add support for arbitrary function objects to setEventCallback functions
- Allow algorithm objects to be default constructed, swapped, and moved
- Fix CLOGS_VERSION_MINOR
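The two setEventCallback items above (a free callback, and support for arbitrary function objects) are closely related: a C-style callback slot only carries a function pointer and a `void *`, so a function object has to be heap-allocated behind that pointer, and something has to free it later. The sketch below illustrates that general technique only. `CallbackSlot` and `setCallback` are hypothetical names invented for this example and are not the CLOGS API; the `int status` argument is likewise an assumption standing in for whatever the real callback receives.

```cpp
#include <functional>
#include <utility>

// Hypothetical C-style registration slot (NOT a CLOGS type): a callback,
// an opaque user-data pointer, and a function that releases the user data.
struct CallbackSlot
{
    void (*callback)(int status, void *userData) = nullptr;
    void (*freeData)(void *userData) = nullptr;
    void *userData = nullptr;

    // Invoke the registered callback, if any.
    void fire(int status)
    {
        if (callback)
            callback(status, userData);
    }

    // Release the user data through the free callback and reset the slot.
    void clear()
    {
        if (freeData)
            freeData(userData);
        callback = nullptr;
        freeData = nullptr;
        userData = nullptr;
    }
};

// Hypothetical helper: store any callable taking an int in the slot.
// The callable is moved into a heap-allocated std::function, and two
// captureless lambdas (convertible to plain function pointers) act as the
// trampoline and the free callback.
template<typename F>
void setCallback(CallbackSlot &slot, F &&f)
{
    using Fn = std::function<void(int)>;
    slot.clear();  // free any previously registered function object
    slot.userData = new Fn(std::forward<F>(f));
    slot.callback = [](int status, void *data) {
        (*static_cast<Fn *>(data))(status);
    };
    slot.freeData = [](void *data) {
        delete static_cast<Fn *>(data);
    };
}
```

Without a free callback, the owner of the slot would have no way to know how to destroy the stored function object, which is presumably why the two features arrived in the same release.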
release-1.4.0
- Reduction has been added
- Introduced the ScanProblem and RadixsortProblem classes
- The cache is now stored in a single SQLite database instead of many separate files
- The cache is now located in an XDG-compliant location on UNIX (~/.cache/clogs by default)
- The tuning caching mechanism has been significantly rewritten for use with SQLite
- All kernels generated during tuning are now cached (this can use a lot of space)
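As an illustration of what "XDG-compliant location" means for the cache path mentioned above, here is a minimal sketch of the lookup rule from the XDG Base Directory Specification: use `$XDG_CACHE_HOME` when it is set to an absolute path, otherwise fall back to `$HOME/.cache`. The function name `xdgCacheDir` is hypothetical, and this is not taken from the CLOGS source; the library's actual logic may differ.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper (not part of CLOGS): resolve a per-user cache
// directory for appName following the XDG Base Directory rules.
std::string xdgCacheDir(const std::string &appName)
{
    // The spec says a relative or empty XDG_CACHE_HOME must be ignored.
    const char *xdg = std::getenv("XDG_CACHE_HOME");
    if (xdg != nullptr && xdg[0] == '/')
        return std::string(xdg) + "/" + appName;

    // Default: $HOME/.cache. Falling back to "." when HOME is unset is an
    // assumption made for this sketch, not something the spec prescribes.
    const char *home = std::getenv("HOME");
    std::string base = (home != nullptr && home[0] != '\0') ? home : ".";
    return base + "/.cache/" + appName;
}
```

With `XDG_CACHE_HOME` unset and `HOME=/home/user`, `xdgCacheDir("clogs")` yields `/home/user/.cache/clogs`, matching the default named in the release note.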
release-1.3.0
release-1.2.3
release-1.2.2
- Fix a race condition in radix sort (mostly affects CPU devices)
- Work around an AMD driver bug that caused segfaults in tuning
- Avoid passing defines with both -D and #define
release-1.2.1
- Performance improvements, particularly for AMD GPUs
- Added a --keep-going option to clogs-tune, as a temporary workaround for #23
release-1.2.0
- Kernel parameters are now autotuned (refer to user manual)
- Added benchmark support for scan
- Fixed sorting in benchmark tool to support 3-element value types
- Improved robustness to non-default locale
- Added --split-debug and --variant=symbols configuration options
release-1.1.0
- Add setEventCallback methods to Scan and Radixsort
- Worked around a bug in the Intel OpenCL compiler