CUDA v3.12.1
Closed issues:
- Accumulate doesn't work on >=4 dim Arrays with dims <= ndims(A) - 3 (#1039)
- CUSPARSE does not support dense-sparse matrix multiplication (#1403)
- Scalar indexing when comparing a CuArray to the identity matrix (#1557)
- CUBLAS_STATUS_NOT_INITIALIZED (#1567)
- LinearAlgebra./ and LinearAlgebra.\ breaks CuArray (#1568)
- Window size in grid-stride loop (#1573)
- Matrix multiplication works for primitive and non-primitive custom number types on the CPU, but it fails for primitive custom number types on the GPU. (#1574)
- CuIterator doesn't specify IteratorSize but has no length() (#1583)
- Garbage collection doesn't work as shown in the documentation (#1586)
- Adding sparse adjoint results in kernel error (#1591)
- sparse - sparse matrix multiplication partially missing (#1599)
- FastMath sincos(), cis(), exp(im..) aren't as fast as C++ (#1606)
- wrong type in wrapper of a cusolver function (#1621)
- Adding CUDNN support for 3D convolutions/cross-correlations (#1631)
- copyto! does not work between a CuArray and a view(Array) (#1634)
- Minor issue with sparse function (#1641)
- Scalar indexing when displaying Diagonal{Int64, CuSparseVector{Int64, Int32}} (#1645)
- Many errors running test suite on GTX 960 4GB (#1650)
- Driver discovery broken on platforms without compat driver (#1653)
- Aliasing/Polluted Result from rfftplan for Float32 2^n 3D array (#1656)
- Re-instate memory limit (#1670)
- Split libnvToolsExt from CUDA_Runtime_jll? (#1672)
- accumulate(op, a) causes scalar indexing (#1680)
- CUSPARSE CI failures (#1692)
- axpy! for nested base types (reshapedarray/adjoint/view) (#1696)
- copyto! between a PermutedDimsArray view and a CuArray doesn't work (#1697)
- WMMA test failure (#1700)
- UndefVarError when a binary is not found (#1701)
- Is CUSPARSELT supported? (#1702)
- Best practices to reduce startup time (#1707)
- 1.9 compatibility (#1710)
- WARNING: unused variadic paramters. (#1712)
Merged pull requests:
- Remove/rework CuDeviceArray constructors (#1308) (@maleadt)
- Add always_inline kernel parameter (#1554) (@lcw)
- Update manifest (#1564) (@github-actions[bot])
- Update manifest (#1569) (@github-actions[bot])
- Update manifest (#1571) (@github-actions[bot])
- Fix native RNG window calculation. (#1575) (@maleadt)
- Use Base.active_project. (#1576) (@maleadt)
- Fixes for and tests using JET. (#1577) (@maleadt)
- Update manifest (#1578) (@github-actions[bot])
- Docs, remove global variables in intro benchmark (#1580) (@SteffenPL)
- Update manifest (#1581) (@github-actions[bot])
- Update manifest (#1582) (@github-actions[bot])
- Bugfixes when using \ operator with non square matrices (#1584) (@GVigne)
- remove unbound type parameters (#1585) (@nsajko)
- added --openacc-profiling off to the nvprof (#1587) (@mbeltagy)
- Update manifest (#1588) (@github-actions[bot])
- Wrap at-cuda's code in a let block. (#1589) (@maleadt)
- Revert: Use JET during test suite. (#1590) (@maleadt)
- [CUSPARSE] Update mv! and mm! functions for CuSparseMatrixCOO and CuSparseMatrixCSC (#1592) (@amontoison)
- [CUSPARSE] Add sv! and sm! routines (#1593) (@amontoison)
- CompatHelper: bump compat for "BFloat16s" to "0.3" (#1594) (@github-actions[bot])
- Update wrap.jl (#1595) (@amontoison)
- Provide more useful explanation why an eltype is unsupported. (#1596) (@maleadt)
- CompatHelper: bump compat for "BFloat16s" to "0.4" (#1597) (@github-actions[bot])
- Improve eltype error reporting. (#1598) (@maleadt)
- Add () at the end of the library name in all ccall (#1600) (@amontoison)
- Define length for CuIterator (#1602) (@mcabbott)
- Added more sparse functions like: kron, tril, triu, reshape, adjoint, transpose, sparse-sparse multiplication (#1603) (@albertomercurio)
- Fix rotate! and reflect! for the generic fallback in GPUArrays.jl (#1604) (@amontoison)
- Update manifest (#1605) (@github-actions[bot])
- Update manifest (#1609) (@github-actions[bot])
- [CUSPARSE] Interface generic routines (#1611) (@amontoison)
- [CUSPARSE] Update sparse-sparse GEMM (#1613) (@amontoison)
- [CUSPARSE] Add sddmm! and gemvi! routines (#1615) (@amontoison)
- Update manifest (#1616) (@github-actions[bot])
- Don't use isbitsunion to support structs of union types. (#1617) (@maleadt)
- Update CUDA driver compatibility package to 11.8. (#1618) (@maleadt)
- Update CUDA artifacts to 11.7 Update 1. (#1619) (@maleadt)
- Update to CUDA 11.8 (#1620) (@maleadt)
- Update to CUDNN 8.6. (#1622) (@maleadt)
- Move CUDNN and CUTENSOR into separate packages (#1624) (@maleadt)
- Bump BFloat16s. (#1625) (@maleadt)
- fix #1621 (#1626) (@jemiryguo)
- Restore functionality of FastMath.sincos. (#1627) (@maleadt)
- Update manifest (#1628) (@github-actions[bot])
- Switch from manual artifact handling to automated JLLs (#1629) (@maleadt)
- [CUSPARSE] Add CuMatrix * CuSparseMatrix products (#1632) (@amontoison)
- Silence some test warnings. (#1635) (@maleadt)
- Update CUTENSOR to v1.6 (#1636) (@maleadt)
- [CUSPARSE] Add SparseMatrix * SparseVector products (#1637) (@amontoison)
- Upgrade CUSTATEVEC to v1.1 (#1638) (@maleadt)
- Upgrade CUTENSORNET to v1.1 (#1639) (@maleadt)
- [CUSPARSE] Add CuSparseVector ± CuSparseVector (#1640) (@amontoison)
- CompatHelper: add new compat entry for "Preferences" at version "1" (#1642) (@github-actions[bot])
- Fix #1641 (#1643) (@amontoison)
- Update manifest (#1646) (@github-actions[bot])
- [CUSPARSE] Add dot(CuSparseVector,CuVector) and vice-versa (#1647) (@amontoison)
- [CUSPARSE] Add ldiv! for CuSparseMatrixCOO and geam for CuSparseMatrixCSC (#1648) (@amontoison)
- Update autogenerated headers (#1649) (@maleadt)
- Remove deprecations (#1651) (@maleadt)
- Don't warn about the old JULIA_CUDA_USE_BINARYBUILDER env var when using preferences (#1652) (@maleadt)
- Update CUTENSORNET to use new slice group (#1654) (@kshyatt)
- [CUSPARSE] Fix conversions between CuSparseMatrixCOO and CuSparseMatrixCSC (#1655) (@amontoison)
- Include compiler options in error log. (#1657) (@maleadt)
- Discover the system driver when CUDA_Driver_jll isn't available. (#1658) (@maleadt)
- Preserve buffer type when adapting to CuArray. (#1659) (@maleadt)
- Update manifest (#1661) (@github-actions[bot])
- Extend conversion of QRPackedQ object to CuArray (#1662) (@GVigne)
- [CUSPARSE] Add CuSparseMatrixCSC * CuSparseMatrixCSC (#1663) (@amontoison)
- Update manifest (#1665) (@github-actions[bot])
- [CUSPARSE] Add more tests (#1668) (@amontoison)
- Update manifest (#1671) (@github-actions[bot])
- Update manifest (#1676) (@github-actions[bot])
- Fix eigen when using Hermitian or Symmetric matrices (#1677) (@GVigne)
- Update manifest (#1679) (@github-actions[bot])
- adding defaults for accumulate(op, a) with modified code from Base.accumulate (#1681) (@leios)
- Add right division operator for Diagonal matrices (#1683) (@GVigne)
- Update manifest (#1686) (@github-actions[bot])
- Bump CUQUANTUM libraries (#1688) (@maleadt)
- typo (#1689) (@ArnoStrouwen)
- Retry CUSOLVER handle creation when encountering an internal error. (#1691) (@maleadt)
- Fix #1692 (#1693) (@amontoison)
- Update manifest (#1694) (@github-actions[bot])
- [CUSPARSE] Support kron with Diagonal arguments (#1695) (@albertomercurio)
- Re-introduce memory limits. (#1698) (@maleadt)
- Adapt to GPUCompiler changes. (#1699) (@maleadt)
- WMMA: Don't wrap fragments of size 1 in a struct. (#1704) (@maleadt)
- Update manifest (#1708) (@github-actions[bot])
- Use plain llvmcall calling convention for WMMA intrinsics. (#1709) (@maleadt)
- Reclaim in cuDNN conv algorithm search (#1711) (@ToucheSir)
- CUBLAS: test against generic axp(b)y, not the BLAS-specific one. (#1713) (@maleadt)
- Fix LU getproperty invoke. (#1714) (@maleadt)
- Backports for 3.12.1 (#1715) (@maleadt)
- Specialize cholcopy to avoid scalar indexing. (#1716) (@maleadt)
- Fix handling of inline-allocated structures with unions. (#1717) (@maleadt)