v3.12.1

github-actions released this 06 Jan 07:34
b7bdc79

CUDA v3.12.1

Diff since v3.12.0

Closed issues:

  • Accumulate doesn't work on >=4 dim Arrays with dims <= ndims(A) - 3 (#1039)
  • CUSPARSE does not support dense-sparse matrix multiplication (#1403)
  • Scalar indexing when comparing a CuArray to the identity matrix (#1557)
  • CUBLAS_STATUS_NOT_INITIALIZED (#1567)
  • LinearAlgebra./ and LinearAlgebra.\ breaks CuArray (#1568)
  • Window size in grid-stride loop (#1573)
  • Matrix multiplication works for primitive and non-primitive custom number types on the CPU, but it fails for primitive custom number types on the GPU. (#1574)
  • CuIterator doesn't specify IteratorSize but has no length() (#1583)
  • Garbage collection doesn't work as shown in the documentation (#1586)
  • Adding sparse adjoint results in kernel error (#1591)
  • sparse-sparse matrix multiplication partially missing (#1599)
  • FastMath sincos(), cis(), exp(im..) aren't as fast as C++ (#1606)
  • wrong type in wrapper of a cusolver function (#1621)
  • Adding CUDNN support for 3D convolutions/cross-correlations (#1631)
  • copyto! does not work between a CuArray and a view(Array) (#1634)
  • Minor issue with sparse function (#1641)
  • Scalar indexing when displaying Diagonal{Int64, CuSparseVector{Int64, Int32}} (#1645)
  • Many errors running test suite on GTX 960 4GB (#1650)
  • Driver discovery broken on platforms without compat driver (#1653)
  • Aliasing/Polluted Result from rfftplan for Float32 2^n 3D array (#1656)
  • Re-instate memory limit (#1670)
  • Split libnvToolsExt from CUDA_Runtime_jll? (#1672)
  • accumulate(op, a) causes scalar indexing (#1680)
  • CUSPARSE CI failures (#1692)
  • axpy! for nested base types (reshapedarray/adjoint/view) (#1696)
  • copyto! between a PermutedDimsArray view and a CuArray doesn't work (#1697)
  • WMMA test failure (#1700)
  • UndefVarError when a binary is not found (#1701)
  • Is CUSPARSELT supported? (#1702)
  • Best practices to reduce startup time (#1707)
  • 1.9 compatibility (#1710)
  • WARNING: unused variadic parameters. (#1712)

Merged pull requests:

  • Remove/rework CuDeviceArray constructors (#1308) (@maleadt)
  • Add always_inline kernel parameter (#1554) (@lcw)
  • Update manifest (#1564) (@github-actions[bot])
  • Update manifest (#1569) (@github-actions[bot])
  • Update manifest (#1571) (@github-actions[bot])
  • Fix native RNG window calculation. (#1575) (@maleadt)
  • Use Base.active_project. (#1576) (@maleadt)
  • Fixes for and tests using JET. (#1577) (@maleadt)
  • Update manifest (#1578) (@github-actions[bot])
  • Docs, remove global variables in intro benchmark (#1580) (@SteffenPL)
  • Update manifest (#1581) (@github-actions[bot])
  • Update manifest (#1582) (@github-actions[bot])
  • Bugfixes when using \ operator with non square matrices (#1584) (@GVigne)
  • remove unbound type parameters (#1585) (@nsajko)
  • added --openacc-profiling off to the nvprof (#1587) (@mbeltagy)
  • Update manifest (#1588) (@github-actions[bot])
  • Wrap at-cuda's code in a let block. (#1589) (@maleadt)
  • Revert: Use JET during test suite. (#1590) (@maleadt)
  • [CUSPARSE] Update mv! and mm! functions for CuSparseMatrixCOO and CuSparseMatrixCSC (#1592) (@amontoison)
  • [CUSPARSE] Add sv! and sm! routines (#1593) (@amontoison)
  • CompatHelper: bump compat for "BFloat16s" to "0.3" (#1594) (@github-actions[bot])
  • Update wrap.jl (#1595) (@amontoison)
  • Provide more useful explanation why an eltype is unsupported. (#1596) (@maleadt)
  • CompatHelper: bump compat for "BFloat16s" to "0.4" (#1597) (@github-actions[bot])
  • Improve eltype error reporting. (#1598) (@maleadt)
  • Add () at the end of the library name in all ccall (#1600) (@amontoison)
  • Define length for CuIterator (#1602) (@mcabbott)
  • Added more sparse functions like: kron, tril, triu, reshape, adjoint, transpose, sparse-sparse multiplication (#1603) (@albertomercurio)
  • Fix rotate! and reflect! for the generic fallback in GPUArrays.jl (#1604) (@amontoison)
  • Update manifest (#1605) (@github-actions[bot])
  • Update manifest (#1609) (@github-actions[bot])
  • [CUSPARSE] Interface generic routines (#1611) (@amontoison)
  • [CUSPARSE] Update sparse-sparse GEMM (#1613) (@amontoison)
  • [CUSPARSE] Add sddmm! and gemvi! routines (#1615) (@amontoison)
  • Update manifest (#1616) (@github-actions[bot])
  • Don't use isbitsunion to support structs of union types. (#1617) (@maleadt)
  • Update CUDA driver compatibility package to 11.8. (#1618) (@maleadt)
  • Update CUDA artifacts to 11.7 Update 1. (#1619) (@maleadt)
  • Update to CUDA 11.8 (#1620) (@maleadt)
  • Update to CUDNN 8.6. (#1622) (@maleadt)
  • Move CUDNN and CUTENSOR into separate packages (#1624) (@maleadt)
  • Bump BFloat16s. (#1625) (@maleadt)
  • fix #1621 (#1626) (@jemiryguo)
  • Restore functionality of FastMath.sincos. (#1627) (@maleadt)
  • Update manifest (#1628) (@github-actions[bot])
  • Switch from manual artifact handling to automated JLLs (#1629) (@maleadt)
  • [CUSPARSE] Add CuMatrix * CuSparseMatrix products (#1632) (@amontoison)
  • Silence some test warnings. (#1635) (@maleadt)
  • Update CUTENSOR to v1.6 (#1636) (@maleadt)
  • [CUSPARSE] Add SparseMatrix * SparseVector products (#1637) (@amontoison)
  • Upgrade CUSTATEVEC to v1.1 (#1638) (@maleadt)
  • Upgrade CUTENSORNET to v1.1 (#1639) (@maleadt)
  • [CUSPARSE] Add CuSparseVector ± CuSparseVector (#1640) (@amontoison)
  • CompatHelper: add new compat entry for "Preferences" at version "1" (#1642) (@github-actions[bot])
  • Fix #1641 (#1643) (@amontoison)
  • Update manifest (#1646) (@github-actions[bot])
  • [CUSPARSE] Add dot(CuSparseVector,CuVector) and vice-versa (#1647) (@amontoison)
  • [CUSPARSE] Add ldiv! for CuSparseMatrixCOO and geam for CuSparseMatrixCSC (#1648) (@amontoison)
  • Update autogenerated headers (#1649) (@maleadt)
  • Remove deprecations (#1651) (@maleadt)
  • Don't warn about the old JULIA_CUDA_USE_BINARYBUILDER env var when using preferences (#1652) (@maleadt)
  • Update CUTENSORNET to use new slice group (#1654) (@kshyatt)
  • [CUSPARSE] Fix conversions between CuSparseMatrixCOO and CuSparseMatrixCSC (#1655) (@amontoison)
  • Include compiler options in error log. (#1657) (@maleadt)
  • Discover the system driver when CUDA_Driver_jll isn't available. (#1658) (@maleadt)
  • Preserve buffer type when adapting to CuArray. (#1659) (@maleadt)
  • Update manifest (#1661) (@github-actions[bot])
  • Extend conversion of QRPackedQ object to CuArray (#1662) (@GVigne)
  • [CUSPARSE] Add CuSparseMatrixCSC * CuSparseMatrixCSC (#1663) (@amontoison)
  • Update manifest (#1665) (@github-actions[bot])
  • [CUSPARSE] Add more tests (#1668) (@amontoison)
  • Update manifest (#1671) (@github-actions[bot])
  • Update manifest (#1676) (@github-actions[bot])
  • Fix eigen when using Hermitian or Symmetric matrices (#1677) (@GVigne)
  • Update manifest (#1679) (@github-actions[bot])
  • adding defaults for accumulate(op, a) with modified code from Base.accumulate (#1681) (@leios)
  • Add right division operator for Diagonal matrices (#1683) (@GVigne)
  • Update manifest (#1686) (@github-actions[bot])
  • Bump CUQUANTUM libraries (#1688) (@maleadt)
  • typo (#1689) (@ArnoStrouwen)
  • Retry CUSOLVER handle creation when encountering an internal error. (#1691) (@maleadt)
  • Fix #1692 (#1693) (@amontoison)
  • Update manifest (#1694) (@github-actions[bot])
  • [CUSPARSE] Support kron with Diagonal arguments (#1695) (@albertomercurio)
  • Re-introduce memory limits. (#1698) (@maleadt)
  • Adapt to GPUCompiler changes. (#1699) (@maleadt)
  • WMMA: Don't wrap fragments of size 1 in a struct. (#1704) (@maleadt)
  • Update manifest (#1708) (@github-actions[bot])
  • Use plain llvmcall calling convention for WMMA intrinsics. (#1709) (@maleadt)
  • Reclaim in cuDNN conv algorithm search (#1711) (@ToucheSir)
  • CUBLAS: test against generic axp(b)y, not the BLAS-specific one. (#1713) (@maleadt)
  • Fix LU getproperty invoke. (#1714) (@maleadt)
  • Backports for 3.12.1 (#1715) (@maleadt)
  • Specialize cholcopy to avoid scalar indexing. (#1716) (@maleadt)
  • Fix handling of inline-allocated structures with unions. (#1717) (@maleadt)