v4.4
Changes
🐛 Bug Fixes
- compiler: Patch TempFunction pickling @FabioLuporini (#1677)
- bench: Patch ASV's generation of new plots @FabioLuporini (#1672)
- gpu: Fix leaks due to excessive fetching/prefetching @FabioLuporini (#1658)
- types: Drop unsafe/unnecessary memoization in the arguments processing engine @FabioLuporini (#1647)
- compiler: Fix processing of grid spacing @FabioLuporini (#1628)
- misc: Avoid crashing on missing _memfree_args @FabioLuporini (#1612)
- compiler: Fix ScheduleTree construction in presence of guards and/or syncs @FabioLuporini (#1611)
- misc: Patch ThreadID pickling @FabioLuporini (#1606)
- compiler: Patch issue #1592 (HaloScheme with time subdimensions) @Leitevmd (#1597)
- dsl: Patch symbolic coefficients with staggered grids @EdCaunt (#1595)
- Fix find_library issue on macOS Big Sur @speglich (#1584)
- Fix hierarchical blocking + parallelism @FabioLuporini (#1580)
- BoundSymbol constructor to be cached @mloubout (#1576)
Compiler
- compiler: Tweak nested-par candidate condition @georgebisbas (#1669)
- compiler: Singletonize special symbols (e.g. nthreads) @FabioLuporini (#1650)
- misc: Drop unused backend infrastructure @FabioLuporini (#1632)
- compiler: Improve aliases detection, processing, and optimization @FabioLuporini (#1631)
- mpi: Add diag2 mode @FabioLuporini (#1630)
- compiler: Add skewing pass towards Temporal Blocking @georgebisbas (#1620)
- misc: Use pickled soname rather than generating a new one @FabioLuporini (#1605)
- compiler: Add option to use Functions, in place of Arrays, for compiler-generated temporary @FabioLuporini (#1591)
- compiler: Improve the cost model used by CIRE @FabioLuporini (#1585)
- Refactor Array sharing @FabioLuporini (#1583)
- Refactor Operator hierarchy @FabioLuporini (#1573)
- Introduce devito/arch @FabioLuporini (#1563)
- operator: Use FD-Gpts/s instead of Gpts/s @georgebisbas (#1544)
API
- sympy: Support v1.8 @mloubout (#1549)
- compiler: Check consistency between shape and grid for TimeFunction, too @tjb900 (#1667)
- dsl: Add MatrixSparseTimeFunction to support multi-point sources @FabioLuporini (#1603)
- dsl: Enable overriding of SubDimension thickness @FabioLuporini (#1608)
- Shift argument as a tuple @Leitevmd (#1561)
- BoundSymbol constructor to be cached @mloubout (#1576)
- gpu: Add `devicerm` API for conditional deletions @FabioLuporini (#1571)
- Introduce `deviceid` API to offload Operators on specific GPUs @FabioLuporini (#1569)
Examples
- advisor: merge roofline and json @georgebisbas (#1649)
- examples: TTI 1st order operators @ofmla (#1602)
- examples: Add viscoacoustic Born operator to 2nd sls equation @nogueirapeterson (#1617)
- examples: Born approximation for TTI media @ofmla (#1555)
- examples: add first order adjoint viscoacoustic equations @nogueirapeterson (#1567)
- Tutorials: add shift parameter to Ren visco-acoustic equation @nogueirapeterson (#1562)
Documentation
- Bump static release number @FabioLuporini (#1564)
MPI
- compiler: Check consistency between shape and grid for TimeFunction, too @tjb900 (#1667)
- mpi: Add diag2 mode @FabioLuporini (#1630)
- mpi: Add ability to specify MPI topology used for grid division @FabioLuporini (#1604)
- mpi: Prevent double finalization with devito used as a lib @FabioLuporini (#1609)
- mpi: Pass MPI_COMM_SELF to Distributor upon unpickling @FabioLuporini (#1607)
GPU
- gpu: Fixup prefetch jitting when using extra symbols @FabioLuporini (#1678)
- gpu: Update Dockerfile.nvidia to HPC SDK 21.3 @kenhester (#1659)
- gpu: Fix leaks due to excessive fetching/prefetching @FabioLuporini (#1658)
- gpu: Add gpu-fit value for all functions (fix #1642) @Leitevmd (#1645)
- compiler: Work around clang[10,11,?] omp-offloading bug @FabioLuporini (#1634)
- arch: Review get_gpu_info @Leitevmd (#1626)
- misc: Updating to NVIDIA HPC SDK 21.2 @kenhester (#1594)
- compiler: Target gpu for PGI openacc @Leitevmd (#1587)
- arch: Add nvidia-smi parser to GPU checking @Leitevmd (#1615)
- gpu: Updated Dockerfile.nvidia to HPC SDK 21.1 @kenhester (#1593)
- gpu: Add `devicerm` API for conditional deletions @FabioLuporini (#1571)
- Data streaming support with OpenMP offloading @FabioLuporini (#1556)
Testing
- ci: Modify mpi-example workflow and update docker actions @rhodrin (#1664)
- ci: Switch to default gcc(9.3) for conda build @mloubout (#1651)
- ci: Fix conda build @mloubout (#1648)
- ci: Update Ci-gpu for nvidia openmp @Leitevmd (#1635)
- ci: Work around archives missing error on apt install @FabioLuporini (#1623)
- ci: Transfer gpu workflow to self-hosted runners @rhodrin (#1618)
- ci: Separate adjoint based tests @rhodrin (#1572)