Merge 4.1 release branch into amd-fftw
BiplabRaut committed Aug 7, 2023
2 parents 891adc6 + 02fdf40 commit d18f5f3
Showing 9 changed files with 5,101 additions and 4,379 deletions.
4 changes: 2 additions & 2 deletions CMakeLists.txt
@@ -14,7 +14,7 @@ SET(AMD_ARCH "znver1" CACHE STRING "select AMD zen version for Clang toolchain")

if (CMAKE_C_COMPILER_ID MATCHES Clang)
if ("${AMD_ARCH}" STREQUAL "")
-message(FATAL_ERROR "Machine arch missing! Select one of znver1, znver2 or znver3")
+message(FATAL_ERROR "Machine arch missing! Select one of znver1, znver2, znver3 or znver4")
elseif (${AMD_ARCH} STREQUAL "znver1")
set (CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -march=znver1")
elseif (${AMD_ARCH} STREQUAL "znver2")
@@ -252,7 +252,7 @@ if (MSVC)
endif(MSVC)

string(TIMESTAMP TODAY "%Y%m%d")
-add_compile_definitions(AOCL_FFTW_VERSION="AOCL-FFTW 4.0 Build ${TODAY}")
+add_compile_definitions(AOCL_FFTW_VERSION="AOCL-FFTW 4.1.0 Build ${TODAY}")

find_library (LIBM_LIBRARY NAMES m)
if (LIBM_LIBRARY)
2 changes: 1 addition & 1 deletion COPYRIGHT
@@ -1,7 +1,7 @@
/*
* Copyright (c) 2003, 2007-14 Matteo Frigo
* Copyright (c) 2003, 2007-14 Massachusetts Institute of Technology
-* Copyright (C) 2019-2022, Advanced Micro Devices, Inc. All Rights Reserved.
+* Copyright (C) 2019-2023, Advanced Micro Devices, Inc. All Rights Reserved.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
26 changes: 13 additions & 13 deletions README_AMD.md
@@ -6,16 +6,16 @@ AMD EPYC CPUs. It is developed on top of FFTW (version fftw-3.3.10).
All known features and functionalities of FFTW are retained and supported
as it is with this AMD optimized FFTW library.

-AOCL-FFTW achieves higher performance than the original FFTW 3.3.10 due to its
-various optimizations involving improved SIMD Kernel functions, improved copy
-functions (cpy2d and cpy2d_pair used in rank-0 transform and buffering plan),
+AOCL-FFTW achieves high performance as a result of its various optimizations
+involving improved SIMD Kernel functions, improved copy functions
+(cpy2d and cpy2d_pair used in rank-0 transform and buffering plan),
improved 256-bit kernels selection by Planner and an optional in-place
transpose for large problem sizes. AOCL-FFTW improves the performance
-of in-place MPI FFTs over FFTW 3.3.10 by employing a faster in-place MPI
-transpose function. AOCL-FFTW provides a new fast planner mode as an
-extension to the original planner that improves planning time of various
-planning modes in general and PATIENT mode in particular. Another new planning
-mode called Top N planner is also available that minimizes single-threaded
+of in-place MPI FFTs by employing a faster in-place MPI transpose function.
+AOCL-FFTW provides a new fast planner mode as an extension to the original
+planner that improves planning time of various planning modes in general
+and PATIENT mode in particular. Another new planning mode called
+Top N planner is also available that minimizes single-threaded
run-to-run variations. AOCL-FFTW has a feature called AMD's application
optimization layer that speeds up HPC and scientific applications. AOCL-FFTW
implements the dynamic dispatcher feature that can build a single portable
@@ -45,7 +45,8 @@ generation architectures.

./configure --enable-sse2 --enable-avx --enable-avx2 --enable-avx512
--enable-mpi --enable-openmp --enable-shared
---enable-amd-opt --enable-amd-mpifft
+--enable-amd-opt --enable-amd-mpifft
+--enable-dynamic-dispatcher
--prefix=<your-install-dir>
make
make install
@@ -85,7 +86,7 @@ problem types, Quad or Long double precisions, and split array format.

Dynamic dispatcher achieves Function Multi-versioning by using compiler's
attributes. Use "--enable-dynamic-dispatcher" configure option to enable this
-feature. It is supported for GCC compiler and Linux based systems for now.
+feature. It is supported for Linux based systems for now.
The set of x86 CPUs on which the single portable library can work depends upon
the highest level of CPU SIMD instruction set with which it is configured.

@@ -101,9 +102,8 @@ CONTACTS
--------

AOCL-FFTW is developed and maintained by AMD.
-You can contact us on the email-id [email protected].
-You can also raise any issue/suggestion on the git-hub repository at
-https://github.com/amd/amd-fftw/issues
+For support of these libraries and the other tools of AMD Zen Software Studio,
+see https://www.amd.com/en/developer/aocc/compiler-technical-support.html

ACKNOWLEDGEMENTS
----------------