v4.6
Changes
API
Examples
- examples: enforce stable space order for self adjoint op @mloubout (#1747)
- tests: add tti_setup to gradientJ test @ofmla (#1740)
Compiler
- compiler: Add machinery for custom memory allocators and MPI @FabioLuporini (#1764)
- compiler: lift skewing in higher block levels @georgebisbas (#1735)
- compiler: Loop fission @FabioLuporini (#1732)
- compiler: improve HB generated code @georgebisbas (#1731)
- compiler: Introduce linearization pass @FabioLuporini (#1727)
- compiler: Introducing min/max bounds to replace 'bf' elemental functions @georgebisbas (#1673)
MPI
- mpi: Speedup index_glb_to_loc @FabioLuporini (#1748)
- mpi: Mitigate SparseFunction setup costs @mloubout (#1720)
GPU
- compiler: Add optimization option to fuse WithLocks tasks @FabioLuporini (#1736)
🐛 Bug Fixes
- mpi: Patch neighborhood construction @FabioLuporini (#1768)
- compiler: Patch SubDomainSet with NVC @FabioLuporini (#1767)
- compiler: Patch and improve SubDomainSet @FabioLuporini (#1762)
- compiler: Fix zero to zero slices @rhodrin (#1757)
- bench: Patch jacobian operators + MPI (see issue #1744) @FabioLuporini (#1745)
Benchmarking
- bench: Patch jacobian operators + MPI (see issue #1744) @FabioLuporini (#1745)
- bench: Add warmup option to run mode @FabioLuporini (#1742)
Continuous Integration
Installation
- install: Update to HPCSDK 21.7, Update to Jupyter>=3.0 @FabioLuporini (#1760)
- pip prod(deps): update distributed requirement from <2021.9 to <2021.10 @dependabot (#1749)
- pip prod(deps): update distributed requirement from <2021.8 to <2021.9 @dependabot (#1737)
- reqs: enforce pip>=21.1.2 for conda env installation @georgebisbas (#1734)
- reqs: pip new file arg format @georgebisbas (#1733)
Misc
- misc: git ignore *.npy files @georgebisbas (#1729)