Prepare for release 0.5.1
lukstafi committed Jan 1, 2025
1 parent 9ba7621 commit d8f264d
Showing 6 changed files with 12 additions and 11 deletions.
2 changes: 1 addition & 1 deletion CHANGES.md
@@ -1,4 +1,4 @@
-## [0.5.1] -- 2024-12-31
+## [0.5.1] -- 2025-01-01

## Added

13 changes: 7 additions & 6 deletions README.md
@@ -64,7 +64,6 @@ IMPORTANT: due to potential bugs, debug logging from CUDA in complex settings cu

This is very tentative.

-* 0.5.1: Automatic synchronization for transfers between host and devices where unambiguous.
* 0.5.2: Apple Metal backend.
* 0.6: Replicate the scaffolding from [llm.c](https://github.com/karpathy/llm.c) for training GPT-2.
* More of primitive numeric operations.
@@ -96,11 +95,13 @@ This is very tentative.

For more details, see [CHANGES](CHANGES.md).

-* **0.5: Stream-to-stream synchronization at the buffer level.**
-  * Support for CUDA events, and `Condition`-based events for CPU backends.
-  * Overhaul of the backend interfaces, both user-facing but especially internal: full code sharing.
-  * Automatic stream-to-stream synchronization on a per-tensor-node basis.
-* **0.4.1 Half precision, mixed precision, CUDA virtual devices** (virtual devices renamed to streams in 0.4.2)
+* **0.5: Synchronization and automation at the buffer level.**
+  * **0.5.1: Automatic synchronization for transfers between host and devices.**
+  * **0.5.0: Stream-to-stream synchronization at the buffer level.**
+    * Support for CUDA events, and `Condition`-based events for CPU backends.
+    * Overhaul of the backend interfaces, both user-facing but especially internal: full code sharing.
+    * Automatic stream-to-stream synchronization on a per-tensor-node basis.
+* **0.4.1 Half precision, mixed precision, CUDA virtual devices** (virtual devices renamed to streams in 0.5.0)
* Half precision. Maybe improvements for mixed-precision computations.
* Resolve remaining issues with the new scheduler.
* Initial version of [lib/nn_blocks.ml](lib/nn_blocks.ml).
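The `Condition`-based CPU events mentioned in the 0.5.0 notes can be sketched as follows. This is a hypothetical minimal illustration built on OCaml's stdlib `Mutex` and `Condition`, not OCANNL's actual event type; the names `create`, `record`, and `sync` are assumptions for the sketch.

```ocaml
(* Minimal one-shot event: a thread calls [sync] to block until
   another thread calls [record]. Hypothetical sketch only. *)
type event = {
  mutex : Mutex.t;
  cond : Condition.t;
  mutable fired : bool;
}

let create () =
  { mutex = Mutex.create (); cond = Condition.create (); fired = false }

(* Mark the event as complete and wake up all waiters. *)
let record ev =
  Mutex.lock ev.mutex;
  ev.fired <- true;
  Condition.broadcast ev.cond;
  Mutex.unlock ev.mutex

(* Block until the event has been recorded. The [while] loop guards
   against spurious wakeups, as required by condition variables. *)
let sync ev =
  Mutex.lock ev.mutex;
  while not ev.fired do
    Condition.wait ev.cond ev.mutex
  done;
  Mutex.unlock ev.mutex
```

A CUDA backend would instead wrap `cuEventRecord`/`cuEventSynchronize`; the point of the shared interface is that both shapes fit behind the same signature.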
2 changes: 1 addition & 1 deletion arrayjit.opam
@@ -1,6 +1,6 @@
# This file is generated by dune, edit dune-project instead
opam-version: "2.0"
-version: "0.5.0"
+version: "0.5.1"
synopsis:
"An array language compiler with multiple backends (CPU, CUDA), staged compilation"
description:
2 changes: 1 addition & 1 deletion dune-project
@@ -4,7 +4,7 @@

(name ocannl)

-(version 0.5.0)
+(version 0.5.1)

(generate_opam_files true)

2 changes: 1 addition & 1 deletion neural_nets_lib.opam
@@ -1,6 +1,6 @@
# This file is generated by dune, edit dune-project instead
opam-version: "2.0"
-version: "0.5.0"
+version: "0.5.1"
synopsis:
"A from-scratch Deep Learning framework with an optimizing compiler, shape inference, concise syntax"
description:
2 changes: 1 addition & 1 deletion ocannl_npy.opam
@@ -1,6 +1,6 @@
# This file is generated by dune, edit dune-project instead
opam-version: "2.0"
-version: "0.5.0"
+version: "0.5.1"
synopsis: "Numpy file format support for ocaml"
maintainer: ["Lukasz Stafiniak <[email protected]>"]
authors: ["Laurent Mazare"]
