Releases: lukstafi/ocaml-cudajit

0.6.1: bug fixes and reporting stream-related information

03 Dec 14:41

From the changelog:

Added

  • Debug reporting of the total number of events and the number of unreleased events in a stream.
  • Debug reporting of the total number of streams not yet garbage-collected, across all devices.
  • Verification that compilation options contain only permitted characters: alphanumerics and a few punctuation characters.
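
The character check on compilation options can be pictured roughly as follows. This is a minimal sketch, not the library's code: the names `is_allowed_char`, `all_allowed` and `validate_options` are hypothetical, and the punctuation allow-list is an assumption, since the changelog does not say which characters are accepted.

```ocaml
(* Hypothetical allow-list: alphanumerics plus a few punctuation characters.
   The exact set accepted by the library is an assumption. *)
let is_allowed_char = function
  | 'a' .. 'z' | 'A' .. 'Z' | '0' .. '9' -> true
  | '-' | '_' | '=' | '.' | ',' | '/' | ':' | '+' -> true
  | _ -> false

(* True when every character of [s] is on the allow-list. *)
let all_allowed (s : string) : bool =
  let rec go i = i >= String.length s || (is_allowed_char s.[i] && go (i + 1)) in
  go 0

(* Returns [Ok ()] when all options pass, or [Error opt] with the first
   option that contains a disallowed character. *)
let validate_options (options : string list) : (unit, string) result =
  match List.find_opt (fun opt -> not (all_allowed opt)) options with
  | None -> Ok ()
  | Some bad -> Error bad
```

Rejecting the whole option and reporting it back, rather than silently stripping characters, keeps the failure visible to the caller.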

Fixed

  • A typo in a documentation comment.
  • The flags cu_event_wait_external and cu_event_wait_default were swapped in the event functions record ?external_ and wait ?external_.
  • Delimited_event.synchronize no longer destroys events that were already released (destroyed).
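
The last fix amounts to making event release idempotent. The sketch below models that idea in plain OCaml under stated assumptions: the `event` record, `destroy_event`, `release` and `synchronize` are hypothetical stand-ins, not the definitions used by Delimited_event, and no CUDA calls are made.

```ocaml
(* Hypothetical model of an event handle with a release flag. *)
type event = { id : int; mutable is_released : bool }

(* Records which events were destroyed; stands in for the driver call. *)
let destroyed_ids : int list ref = ref []

let destroy_event (ev : event) : unit = destroyed_ids := ev.id :: !destroyed_ids

(* Destroy the event at most once. *)
let release (ev : event) : unit =
  if not ev.is_released then begin
    destroy_event ev;
    ev.is_released <- true
  end

(* A real implementation would first wait for the event to complete;
   here we only model the guarded release that follows. *)
let synchronize (ev : event) : unit = release ev
```

With the guard in place, calling `synchronize` on an event that has already been released is a no-op instead of a second destroy.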

0.6.0: detecting use-after-free for memory pointers

01 Nov 09:56

This release focuses on detecting use-after-free for memory pointers and on making debugging easier.
It also adds the functions Device.get_free_and_total_mem, Stream.mem_alloc, and Stream.mem_free.
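
One common way to detect use-after-free is to wrap the raw pointer in a record that remembers whether it was freed. The following is a minimal sketch of that technique, not the library's implementation: the type `deviceptr`, the exception `Use_after_free`, and the functions `alloc`, `free` and `raw_address` are all hypothetical names, and the address is a plain integer rather than device memory.

```ocaml
exception Use_after_free of string

(* Hypothetical wrapper: a raw address plus a flag set when it is freed. *)
type deviceptr = { address : int; mutable freed : bool }

let alloc (address : int) : deviceptr = { address; freed = false }

(* Raise if the pointer was already freed; [op] names the attempted operation. *)
let check (op : string) (ptr : deviceptr) : unit =
  if ptr.freed then raise (Use_after_free op)

(* Freeing twice is reported as a use-after-free as well. *)
let free (ptr : deviceptr) : unit =
  check "free" ptr;
  ptr.freed <- true

(* Every access to the raw address goes through the check. *)
let raw_address (ptr : deviceptr) : int =
  check "raw_address" ptr;
  ptr.address
```

Routing all accesses through one checked accessor means a stale pointer fails with a named OCaml exception rather than an opaque driver error.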

0.5.0: modular interface, events, finalizers

30 Sep 19:15

In this release:

  • we split the API into modules: Nvrtc, Device, Context, Deviceptr, Module, Stream;
  • we support CUDA events via Event and a wrapper module Delimited_event that manages destroying events;
  • we manage the primary context, created contexts, streams and events via finalizers.
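
Finalizer-based management can be sketched with the standard `Gc.finalise` function. The code below is an illustration under assumptions, not the library's code: the `stream` record, the `live_streams` counter, and the functions `create` and `destroy` are hypothetical, and no CUDA resources are involved.

```ocaml
(* Hypothetical resource handle; [destroyed] makes cleanup idempotent. *)
type stream = { mutable destroyed : bool }

(* Number of streams created and not yet destroyed. *)
let live_streams = ref 0

(* Safe to call explicitly and again later from the finalizer. *)
let destroy (s : stream) : unit =
  if not s.destroyed then begin
    s.destroyed <- true;
    decr live_streams
  end

(* Register [destroy] so it runs when the handle becomes unreachable. *)
let create () : stream =
  let s = { destroyed = false } in
  incr live_streams;
  Gc.finalise destroy s;
  s
```

Because the garbage collector decides when a finalizer runs, the cleanup function has to tolerate being called after an explicit destroy, which is why it checks the flag first.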

0.4.1: support for half precision

12 Sep 10:10

In this release:

  • We pass the $CUDA_PATH/include path to the nvrtc compiler; otherwise e.g. #include <cuda_fp16.h> will not work. The user could already be doing this, but since we monitor the installation via conf-cuda, it's better to prepend the option automatically.
  • We work around ctypes not supporting the Float16 type.
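
Prepending the include path could look roughly like this. It is a sketch under assumptions: the function name `with_cuda_include` is hypothetical, the path is read from the `CUDA_PATH` environment variable by default, and the option is spelled with NVRTC's `--include-path=` form; the library may build the option differently.

```ocaml
(* Prepend the CUDA include directory to the user's NVRTC options.
   [cuda_path] defaults to the CUDA_PATH environment variable; when it is
   absent, the options are returned unchanged. *)
let with_cuda_include ?(cuda_path = Sys.getenv_opt "CUDA_PATH")
    (options : string list) : string list =
  match cuda_path with
  | None -> options
  | Some path -> ("--include-path=" ^ Filename.concat path "include") :: options
```

For example, `with_cuda_include ~cuda_path:(Some "/usr/local/cuda") ["-O3"]` puts the include option first, so options supplied by the user still follow it.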

0.4.0

21 Jul 20:51

For details, see CHANGES.md. Highlights:

  • Context flags.
  • Asynchronous operations and streams.
  • Bug fix: kernel launch params lifetimes.
  • Interface file with documentation.
  • Self-contained types in the interface.
  • Updated to a newer CUDA version.

0.1.1.0

09 May 13:16

From the changelog:

[0.1.1] 2024-05-09

Added

  • Continuous integration on GitHub, using the Jimver/cuda-toolkit GitHub Action; it covers only PTX compilation.

Fixed

  • Test target should erase compiler versions.

[0.1.0] 2023-10-28

Added

Fixed

  • To be defensive, pass -I and -L arguments with the default paths to the compiler and linker on Linux-like systems.