Releases: eth-cscs/COSTA
v2.2.2: Merge pull request #20 from eth-cscs/cmake_fix
- remove the -mtune=native option from the CXXFLAGS
COSTA-v2.2.1
- fix a cmake bug when linking to cray libsci.
v2.2
- Update of cmake build system
- All optional sub-modules are treated as dependencies
COSTA-v2.1
This version brings the following improvements/features:
- scalapack-wrappers: COSTA now implements all scalapack `pxgemr2d`, `pxtran` (transpose), `pxtranu` (complex transpose) and `pxtranc` (conjugate, complex transpose) routines, i.e. `pdgemr2d`, `psgemr2d`, `pcgemr2d`, `pzgemr2d`, `pstran`, `pdtran`, `pctranu`, `pztranu`, `pctranc`, `pztranc`, and their prefixed versions: `costa_pdgemr2d`, `costa_psgemr2d`, `costa_pcgemr2d`, `costa_pzgemr2d`, `costa_pstran`, `costa_pdtran`, `costa_pctranu`, `costa_pztranu`, `costa_pctranc`, `costa_pztranc`
- code refactoring
- performance improvements
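
The transpose variants listed above differ only in whether elements are conjugated. As a rough single-process illustration, assuming the usual PBLAS-style update `C := beta*C + alpha*op(A)`, the arithmetic can be sketched in plain Python. The function `transpose_update` and its arguments are hypothetical names used for illustration only; this is not COSTA's API and nothing here is distributed.

```python
def transpose_update(alpha, a, beta, c, conjugate=False):
    """Return beta*c + alpha*op(a), where op is transpose or conjugate transpose.

    a is a rows x cols matrix (list of lists); c must be cols x rows.
    conjugate=False mimics the pxtran / pxtranu behaviour,
    conjugate=True mimics the pxtranc behaviour (assumed semantics).
    """
    rows, cols = len(a), len(a[0])
    if len(c) != cols or any(len(row) != rows for row in c):
        raise ValueError("c must have the shape of the transpose of a")

    def op(x):
        # Conjugate only when requested; real numbers are unaffected.
        return x.conjugate() if conjugate else x

    return [
        [beta * c[j][i] + alpha * op(a[i][j]) for i in range(rows)]
        for j in range(cols)
    ]


a = [[1 + 2j, 3 + 0j]]            # 1 x 2 input
c = [[0j], [0j]]                  # 2 x 1 output, overwritten since beta = 0
print(transpose_update(1, a, 0, c))                  # plain transpose
print(transpose_update(1, a, 0, c, conjugate=True))  # conjugate transpose
```

In the real routines each matrix is spread over a process grid, so the transpose also implies communication; the sketch covers only the element-wise arithmetic.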
COSTA-v2.0
This release brings the following improvements:
- [row-major blocks]: COSTA now supports both row-major and col-major ordering of blocks. This is more general than scalapack which supports only the col-major ordering.
- [performance-improvements]: the multithreaded backends are improved to avoid core oversubscription, resulting in a performance boost.
- [bugfixes]: thorough testing was performed during the integration of COSTA into COSMA and CP2K.
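
To make the row-major versus col-major distinction concrete, here is a minimal sketch. It assumes that "ordering of blocks" refers to how the elements of a locally stored block are laid out in a contiguous buffer. The helper `block_offset` is a hypothetical name for illustration and is not part of COSTA.

```python
def block_offset(i, j, n_rows, n_cols, ordering="col"):
    """Offset of element (i, j) inside a contiguous n_rows x n_cols block."""
    if not (0 <= i < n_rows and 0 <= j < n_cols):
        raise IndexError("element index outside the block")
    if ordering == "col":
        # Column-major (the ScaLAPACK convention): columns are contiguous.
        return j * n_rows + i
    if ordering == "row":
        # Row-major: rows are contiguous.
        return i * n_cols + j
    raise ValueError("ordering must be 'row' or 'col'")


# The same element of a 3 x 4 block lands at different offsets.
print(block_offset(1, 2, 3, 4, "col"))
print(block_offset(1, 2, 3, 4, "row"))
```

Supporting both orderings means a row-major buffer can be described as it is, rather than being copied into column-major form first.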
COSTA v1.0
This is the very first release of COSTA, bringing the following features:
- scalapack wrappers: for redistribute (`pxgemr2d`) and transpose (`pxtran(u)`).
- different layouts support: added representation for block-cyclic and arbitrary matrix layouts.
- multiple layouts: can transform multiple layouts at once, i.e. in the same communication round.
- comm-optimal: can minimize the communication volume.
- scaling & transpose: in addition to redistributing the matrix, can also scale and transpose the initial and final layouts.
- highly optimized: optimized for distributed and multithreaded settings.
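
For readers unfamiliar with the block-cyclic layout mentioned in the feature list, the following is a minimal one-dimensional sketch of the standard mapping from a global index to its owning process and local index. It assumes the first block lives on process 0. The function `block_cyclic_owner` is a hypothetical helper, not a COSTA function.

```python
def block_cyclic_owner(global_index, block_size, n_procs):
    """Return (owner, local_index) for one dimension of a block-cyclic layout.

    Assumes the first block is owned by process 0 (no source-process offset).
    """
    if global_index < 0 or block_size <= 0 or n_procs <= 0:
        raise ValueError("invalid arguments")
    block = global_index // block_size   # global block number
    owner = block % n_procs              # blocks are dealt out round-robin
    local_block = block // n_procs       # earlier blocks held by this owner
    local_index = local_block * block_size + global_index % block_size
    return owner, local_index


# Distribute 8 indices in blocks of 2 over 2 processes.
for g in range(8):
    print(g, block_cyclic_owner(g, 2, 2))
```

A two-dimensional layout applies the same mapping independently to rows and columns of the process grid. Redistribution between two layouts then amounts to sending each element from its owner under the initial layout to its owner under the final one.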