Releases · JuliaGPU/CUDA.jl

Merged pull requests:

Allow copy(::RNG) (#1719) (@mcabbott)
Update manifest (#1722) (@github-actions[bot])
Simplify CuError rendering before library initialization. (#1723) (@maleadt)
Simplify CuError rendering before library initialization (master branch version) (#1724) (@maleadt)
Make device RNG test more robust. (#1725) (@maleadt)
Rely on LLVM.jl's typed_ccall for more intrinsics. (#1728) (@maleadt)
Backports for 3.13 (#1729) (@maleadt)
Simplify CUBLAS and CUSPARSE wrappers, reducing code generated. (#1730) (@maleadt)
Add Julia 1.9 CI. (#1731) (@maleadt)
Use released dependencies. (#1732) (@maleadt)
Remove NVTX. (#1733) (@maleadt)
Introduce cuFFT plan cache; switch to auto-managed memory. (#1734) (@maleadt)
Stop pirating GPUArrays' RNG methods. (#1735) (@maleadt)

Contributors

maleadt and mcabbott

19 Jan 17:00

github-actions

v3.13.0

1a52af1

CUDA v3.13.0

Diff since v3.12.1

Closed issues:

Error during CUDA test (#1718)
Kernel error from bad broadcast (should be regular error?) (#1720)
Freeze into StackOverflow when JULIA_DEBUG=CUDA set (#1721)
Use of linear operators in CUDA.jl (#1727)

Merged pull requests:

Allow copy(::RNG) (#1719) (@mcabbott)
Update manifest (#1722) (@github-actions[bot])
Simplify CuError rendering before library initialization. (#1723) (@maleadt)
Simplify CuError rendering before library initialization (master branch version) (#1724) (@maleadt)
Make device RNG test more robust. (#1725) (@maleadt)
Rely on LLVM.jl's typed_ccall for more intrinsics. (#1728) (@maleadt)
Backports for 3.13 (#1729) (@maleadt)
Simplify CUBLAS and CUSPARSE wrappers, reducing code generated. (#1730) (@maleadt)
Add Julia 1.9 CI. (#1731) (@maleadt)
Use released dependencies. (#1732) (@maleadt)
Remove NVTX. (#1733) (@maleadt)

Contributors

maleadt and mcabbott

06 Jan 07:34

github-actions

v3.12.1

b7bdc79

CUDA v3.12.1

Diff since v3.12.0

Closed issues:

Accumulate doesn't work on >=4 dim Arrays with dims <= ndims(A) - 3 (#1039)
CUSPARSE does not support dense-sparse matrix multiplication (#1403)
Scalar indexing when comparing a CuArray to the identity matrix (#1557)
CUBLAS_STATUS_NOT_INITIALIZED (#1567)
LinearAlgebra./ and LinearAlgebra.\ breaks CuArray (#1568)
Window size in grid-stride loop (#1573)
Matrix multiplication works for primitive and non-primitive custom number types on the CPU, but it fails for primitive custom number types on the GPU. (#1574)
CuIterator doesn't specify IteratorSize but has no length() (#1583)
Garbage collection doesn't work as shown in the documentation (#1586)
Adding sparse adjoint results in kernel error (#1591)
sparse - sparse matrix multiplication partially missing (#1599)
FastMath sincos(), cis(), exp(im..) aren't as fast as C++ (#1606)
wrong type in wrapper of a cusolver function (#1621)
Adding CUDNN support for 3D convolutions/cross-correlations (#1631)
copyto! does not work between a CuArray and a view(Array) (#1634)
Minor issue with sparse function (#1641)
Scalar indexing when displaying Diagonal{Int64, CuSparseVector{Int64, Int32}} (#1645)
Many errors running test suite on GTX 960 4GB (#1650)
Driver discovery broken on platforms without compat driver (#1653)
Aliasing/Polluted Result from rfftplan for Float32 2^n 3D array (#1656)
Re-instate memory limit (#1670)
Split libnvToolsExt from CUDA_Runtime_jll? (#1672)
accumulate(op, a) causes scalar indexing (#1680)
CUSPARSE CI failures (#1692)
axpy! for nested base types (reshapedarray/adjoint/view) (#1696)
copyto! between a PermutedDimsArray view and a CuArray doesn't work (#1697)
WMMA test failure (#1700)
UndefVarError when a binary is not found (#1701)
Is CUSPARSELT supported? (#1702)
Best practices to reduce startup time (#1707)
1.9 compatibility (#1710)
WARNING: unused variadic paramters. (#1712)

Merged pull requests:

Remove/rework CuDeviceArray constructors (#1308) (@maleadt)
Add always_inline kernel parameter (#1554) (@lcw)
Update manifest (#1564) (@github-actions[bot])
Update manifest (#1569) (@github-actions[bot])
Update manifest (#1571) (@github-actions[bot])
Fix native RNG window calculation. (#1575) (@maleadt)
Use Base.active_project. (#1576) (@maleadt)
Fixes for and tests using JET. (#1577) (@maleadt)
Update manifest (#1578) (@github-actions[bot])
Docs, remove global variables in intro benchmark (#1580) (@SteffenPL)
Update manifest (#1581) (@github-actions[bot])
Update manifest (#1582) (@github-actions[bot])
Bugfixes when using \ operator with non square matrices (#1584) (@GVigne)
remove unbound type parameters (#1585) (@nsajko)
added --openacc-profiling off to the nvprof (#1587) (@mbeltagy)
Update manifest (#1588) (@github-actions[bot])
Wrap at-cuda's code in a let block. (#1589) (@maleadt)
Revert: Use JET during test suite. (#1590) (@maleadt)
[CUSPARSE] Update mv! and mm! functions for CuSparseMatrixCOO and CuSparseMatrixCSC (#1592) (@amontoison)
[CUSPARSE] Add sv! and sm! routines (#1593) (@amontoison)
CompatHelper: bump compat for "BFloat16s" to "0.3" (#1594) (@github-actions[bot])
Update wrap.jl (#1595) (@amontoison)
Provide more useful explanation why an eltype is unsupported. (#1596) (@maleadt)
CompatHelper: bump compat for "BFloat16s" to "0.4" (#1597) (@github-actions[bot])
Improve eltype error reporting. (#1598) (@maleadt)
Add () at the end of the library name in all ccall (#1600) (@amontoison)
Define length for CuIterator (#1602) (@mcabbott)
Added more sparse functions like: kron, tril, triu, reshape, adjoint, transpose, sparse-sparse multiplication (#1603) (@albertomercurio)
Fix rotate! and reflect! for the generic fallback in GPUArrays.jl (#1604) (@amontoison)
Update manifest (#1605) (@github-actions[bot])
Update manifest (#1609) (@github-actions[bot])
[CUSPARSE] Interface generic routines (#1611) (@amontoison)
[CUSPARSE] Update sparse-sparse GEMM (#1613) (@amontoison)
[CUSPARSE] Add sddmm! and gemvi! routines (#1615) (@amontoison)
Update manifest (#1616) (@github-actions[bot])
Don't use isbitsunion to support structs of union types. (#1617) (@maleadt)
Update CUDA driver compatibility package to 11.8. (#1618) (@maleadt)
Update CUDA artifacts to 11.7 Update 1. (#1619) (@maleadt)
Update to CUDA 11.8 (#1620) (@maleadt)
Update to CUDNN 8.6. (#1622) (@maleadt)
Move CUDNN and CUTENSOR into separate packages (#1624) (@maleadt)
Bump BFloat16s. (#1625) (@maleadt)
fix #1621 (#1626) (@jemiryguo)
Restore functionality of FastMath.sincos. (#1627) (@maleadt)
Update manifest (#1628) (@github-actions[bot])
Switch from manual artifact handling to automated JLLs (#1629) (@maleadt)
[CUSPARSE] Add CuMatrix * CuSparseMatrix products (#1632) (@amontoison)
Silence some test warnings. (#1635) (@maleadt)
Update CUTENSOR to v1.6 (#1636) (@maleadt)
[CUSPARSE] Add SparseMatrix * SparseVector products (#1637) (@amontoison)
Upgrade CUSTATEVEC to v1.1 (#1638) (@maleadt)
Upgrade CUTENSORNET to v1.1 (#1639) (@maleadt)
[CUSPARSE] Add CuSparseVector ± CuSparseVector (#1640) (@amontoison)
CompatHelper: add new compat entry for "Preferences" at version "1" (#1642) (@github-actions[bot])
Fix #1641 (#1643) (@amontoison)
Update manifest (#1646) (@github-actions[bot])
[CUSPARSE] Add dot(CuSparseVector,CuVector) and vice-versa (#1647) (@amontoison)
[CUSPARSE] Add ldiv! for CuSparseMatrixCOO and geam for CuSparseMatrixCSC (#1648) (@amontoison)
Update autogenerated headers (#1649) (@maleadt)
Remove deprecations (#1651) (@maleadt)
Don't warn about the old JULIA_CUDA_USE_BINARYBUILDER env var when using preferences (#1652) (@maleadt)
Update CUTENSORNET to use new slice group (#1654) (@kshyatt)
[CUSPARSE] Fix conversions between CuSparseMatrixCOO and CuSparseMatrixCSC (#1655) (@amontoison)
Include compiler options in error log. (#1657) (@maleadt)
Discover the system driver when CUDA_Driver_jll isn't available. (#1658) (@maleadt)
Preserve buffer type when adapting to CuArray. (#1659) (@maleadt)
Update manifest (#1661) (@github-actions[bot])
Extend conversion of QRPackedQ object to CuArray (#1662) (@GVigne)
[CUSPARSE] Add CuSparseMatrixCSC * CuSparseMatrixCSC (#1663) (@amontoison)
Update manifest (#1665) (@github-actions[bot])
[CUSPARSE] Add more tests (#1668) (@amontoison)
Update manifest (#1671) (@github-actions[bot])
Update manifest (#1676) (@github-actions[bot])
Fix eigen when using Hermitian or Symmetric matrices (#1677) (@GVigne)
Update manifest (#1679) (@github-actions[bot])
adding defaults for accumulate(op, a) with modified code from Base.accumulate (#1681) (@leios)
Add right division operator for Diagonal matrices (#1683) (@GVigne)
Update manifest (#1686) (@github-actions[bot])
Bump CUQUANTUM libraries (#1688) (@maleadt)
typo (#1689) (@ArnoStrouwen)
Retry CUSOLVER handle creation when encountering an internal error. (#1691) (@maleadt)
Fix #1692 (#1693) (@amontoison)
Update manifest (#1694) (@github-actions[bot])
[CUSPARSE] Support kron with Diagonal arguments (#1695) (@albertomercurio)
Re-introduce memory limits. (#1698) (@maleadt)
Adapt to GPUCompiler changes. (#1699) (@maleadt)
WMMA: Don't wrap fragments of size 1 in a struct. (#1704) (@maleadt)
Update manifest (#1708) (@github-actions[bot])
Use plain llvmcall calling convention for WMMA intrinsics. (#1709) (@maleadt)
Reclaim in cuDNN conv algorithm search (#1711) (@ToucheSir)
CUBLAS: test against generic axp(b)y, not the BLAS-specific one. (#1713) (@maleadt)
Fix LU getproperty invoke. (#1714) (@maleadt)
Backports for 3.12.1 (#1715) (@maleadt)
Specialize cholcopy to avoid scalar indexing. (#1716) (@maleadt)
Fix handling of inline-allocated structures with unions. (#1717) (@maleadt)

Contributors

lcw, maleadt, and 12 other contributors

16 Jul 21:40

github-actions

v3.12.0

3729010

CUDA v3.12.0

Diff since v3.11.0

Closed issues:

Implement Base.repeat (#177)
repeat performs scalar indexing for multi-dimensional arrays (#1051)
The GPU compiler fails on a call to maximum (#1548)
versioninfo triggers artifact downloads (#1549)
Error when broadcasting composed functions (#1550)
overload Base.copy! for AbstractGPUArray{<:Any,1} (#1555)

Merged pull requests:

Fix math quirk. (#1546) (@maleadt)
Wrap cusolverRf.h and cusolverSp_LOWLEVEL_PREVIEW.h (#1547) (@frapac)
Update manifest (#1551) (@github-actions[bot])
tighten unsafe_wrap signature on scalar length (#1552) (@sjkelly)
Update Documenter key. (#1553) (@maleadt)
Update manifest (#1556) (@github-actions[bot])
Import factorisation internal types from LinearAlgebra (#1558) (@theabhirath)
Update manifest (#1560) (@github-actions[bot])
add reshape for CuDeviceArray (#1561) (@omlins)

Contributors

maleadt, sjkelly, and 3 other contributors

15 Jun 10:29

github-actions

v3.11.0

15a0e1d

CUDA v3.11.0

Diff since v3.10.1

Closed issues:

CUSPARSE: Diagonal + CSC/CSR gives dense array (#1469)
CUBLAS: Multiplication of UpperTriangular/LowerTriangular not supported (#1486)
CUTENSOR tests consume lots of memory, breaking other tests (#1501)
CUFFT doesn't work for ComplexF64 C2C in-place (#1519)
Inconsistency of == and isequal for CuArray (#1524)
Setting CUDA seed the first time changes Random's RNG non-deterministically (#1526)
Undefined exported symbols (#1527)
Could not load library libLLVMExtra-14.dll (#1535)
Add an rrule for cholesky to CUDA.jl (#1541)

Merged pull requests:

specialize +/- op for sparse diag (#1514) (@Roger-luo)
Make sure instantiating RNGs doesn't affect the global CPU RNG. (#1530) (@maleadt)
Update manifest (#1531) (@github-actions[bot])
ldiv! for LU Decomposition (#1532) (@SBuercklin)
Lower dmax for contraction tests (#1534) (@kshyatt)
Fix convolution algorithm search (#1536) (@maxfreu)
Update manifest (#1537) (@github-actions[bot])
add specializations for some triangular-triangular multiplications (#1538) (@Red-Portal)
Add a utility to download artifacts without a functional driver. (#1539) (@maleadt)
Update manifest (#1543) (@github-actions[bot])
Explicit tests for type conversion (#1544) (@kshyatt)
Remove unused exports. (#1545) (@maleadt)

Contributors

maleadt, kshyatt, and 4 other contributors

27 May 20:19

github-actions

v3.10.1

49902d8

CUDA v3.10.1

Diff since v3.10.0

Closed issues:

Overflow in randn using CUDA.jl's native RNG (#1464)
Segmentation fault with pre-compiled library importing CUDA (#1465)
Julia freezes when using Polynomials with CuArray (#1497)
Launch overhead regression (#1503)
CUSOLVER: Matrix division requires identical types (#1512)
Incorrect distribution for complex standard normals when using CUDA.default_rng() (#1515)
loggamma (#1528)

Merged pull requests:

CUSPARSE: Support mixed type mv (#1475) (@Roger-luo)
Add method for LinearAlgebra.opnorm2 (#1516) (@danielwe)
Promote to common eltype in matrix division (#1517) (@danielwe)
Fix Box-Muller transformation for complex eltypes (#1518) (@danielwe)
Update manifest (#1521) (@github-actions[bot])
Use at-dispose for LLVM.jl resource cleanup. (#1523) (@maleadt)
loggamma (#1529) (@cossio)

Contributors

maleadt, danielwe, and 2 other contributors

16 May 18:58

github-actions

v3.10.0

044bd98

CUDA v3.10.0

Diff since v3.9.1

Closed issues:

Error while freeing DeviceBuffer-warning when using multiple GPUs (#1454)
CUDNN cache locking prevents finalizers resulting in OOMs (#1461)
EOFError from pool_cleanup when closing REPL (#1495)
TypeError in compiler with custom kernel (#1496)

Merged pull requests:

expose sparse mv/mm algo selection (#1201) (@Roger-luo)
Always inspect the task-local context when verifying before freeing. (#1462) (@maleadt)
support sparse opnorm (#1466) (@Roger-luo)
Move CUSTATEVEC and CUTENSORNET into lib/ (#1478) (@vchuravy)
Adapt to GPUCompiler 0.15 changes (#1488) (@maleadt)
Limit time held by CUDNN locks. (#1491) (@maleadt)
Docstring for cu (#1493) (@mcabbott)
Update manifest (#1499) (@github-actions[bot])
Silence EOFError in pool_cleanup (#1502) (@Octogonapus)
Adapt to GPUCompiler changes (#1504) (@maleadt)
Fixes for CUSPARSE 11.7.1. (#1505) (@maleadt)
Update artifacts (#1507) (@maleadt)
Update manifest (#1509) (@github-actions[bot])
Add a new cache for HostKernel objects. (#1510) (@maleadt)

Contributors

vchuravy, maleadt, and 3 other contributors
