Merge branch 'develop' into faster_setTwist

QMCPACK · Aug 31, 2023 · 0a56494
2 parents acbb689 + 7b9d286
Showing 68 changed files with 2,135 additions and 3,453 deletions.
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -4,7 +4,78 @@ Notable changes to QMCPACK are documented in this file.
 
 ## [Unreleased]
 
-The legacy CUDA implementation, the version built with QMC_CUDA=1, has been removed from the codebase.
+* Support for backflow optimization has been removed as part of refactoring and cleaning the codebase. QMC runs using backflow
+  wavefunctions are still supported. This feature is expected to eventually be reimplemented in v4. Users needing
+  backflow optimization can use previously released versions of QMCPACK or work towards its reimplementation in the modern code.
+  [#4688](https://github.com/QMCPACK/qmcpack/pull/4688)
+
+## [3.17.1] - 2023-08-25
+
+This minor release is recommended for all users and includes a couple of build fixes and a NEXUS improvement.
+
+* Improved HDF5 detection. Fixes cases where HDF5 was not identified by CMake, including on FreeBSD (thanks @yurivict for the report). [#4708](https://github.com/QMCPACK/qmcpack/pull/4708)
+* Fix for building with BUILD_UNIT_TESTS=OFF. [#4709](https://github.com/QMCPACK/qmcpack/pull/4709)
+* Add timer for orbital rotations. [#4706](https://github.com/QMCPACK/qmcpack/pull/4706)
+
+### NEXUS
+
+* NEXUS: Support for spinor inputs. [#4707](https://github.com/QMCPACK/qmcpack/pull/4707)
+## [3.17.0] - 2023-08-18
+
+This is a recommended release for all users. Thanks to everyone who contributed directly, reported an issue, or suggested an
+improvement. There are many quality of life improvements, bug fixes throughout the application, and updates to the associated
+testing. As previously announced, the legacy CUDA support (QMC_CUDA=1) is removed in this version. For GPU support, users should
+transition to the offload code which is more capable and fully usable in production on NVIDIA GPUs.
+
+This version is intended for long-term support of v3 of QMCPACK. Development effort is now focused towards v4. Contributions of
+tests, fixes, and features from users and developers are still welcome to v3 for a potential future release. However, these will not
+be ported towards v4 by the core QMCPACK developers without prior arrangement. Please discuss options with QMCPACK developers.
+
+* Simplified checkpointing and enabled it in the batched drivers. Users now only need to specify checkpoint={-1,0,N} to checkpoint
+  between blocks. [#4646](https://github.com/QMCPACK/qmcpack/pull/4646)
+* NERSC Perlmutter build recipe. [#4698](https://github.com/QMCPACK/qmcpack/pull/4698)
+* qmc-fit: Now supports parameter fitting with jackknife for e.g. DFT+U, EXX scans
+  [#4475](https://github.com/QMCPACK/qmcpack/pull/4475) and for equations of state and Morse fits
+  [#4518](https://github.com/QMCPACK/qmcpack/pull/4518)
+* Improved error checking including NaN checks to protect against potentially unreliable compilers and libraries,
+  [#4697](https://github.com/QMCPACK/qmcpack/pull/4697), and checks on GPU matrix inversion
+  [#4693](https://github.com/QMCPACK/qmcpack/pull/4693)
+* Significant advances in orbital optimization capability, focusing on LCAO wavefunctions. Development is ongoing for
+  multideterminant support and for spline wavefunctions. See e.g. the Be atom orbital optimization test
+  [#4626](https://github.com/QMCPACK/qmcpack/pull/4626), [#4619](https://github.com/QMCPACK/qmcpack/pull/4619), reading and writing
+  of orbital rotation parameters [#4580](https://github.com/QMCPACK/qmcpack/pull/4580), support for disabled/frozen parameters
+  [#4581](https://github.com/QMCPACK/qmcpack/pull/4581). 
+* Magnetization Density Estimator for non-collinear wavefunctions [#4531](https://github.com/QMCPACK/qmcpack/pull/4531)
+* Pathak-Wagner regularizer for forces [#4477](https://github.com/QMCPACK/qmcpack/pull/4477)
+* The legacy CUDA implementation, the version built with QMC_CUDA=1, has been removed from the codebase,
+  [#4431](https://github.com/QMCPACK/qmcpack/pull/4431),
+  [#4632](https://github.com/QMCPACK/qmcpack/pull/4632), [#4499](https://github.com/QMCPACK/qmcpack/pull/4499),
+  [#4442](https://github.com/QMCPACK/qmcpack/pull/4442).
+* For increased performance with current AMD GPU support, new QMC_DISABLE_HIP_HOST_REGISTER option is enabled by default for
+  ROCm/HIP builds. [#4674](https://github.com/QMCPACK/qmcpack/pull/4674)
+* Bugfix: J1Spin indexing was wrong [#4612](https://github.com/QMCPACK/qmcpack/pull/4612)
+* Bugfix: 1RDM estimator data written to stat.h5 was incorrect [#4568](https://github.com/QMCPACK/qmcpack/pull/4568)
+* Introduced ENABLE_PPCONVERT option and skip ppconvert compilation when cross compiling. [#4601](https://github.com/QMCPACK/qmcpack/pull/4601)
+* Faster builds compared to v3.16.0 due to code refactoring [#4682](https://github.com/QMCPACK/qmcpack/pull/4682)
+* Many refinements throughout the codebase, cleanup, improved testing.
+
+### NEXUS
+
+* Nexus: Equilibration detection algorithm is now deterministic [#4557](https://github.com/QMCPACK/qmcpack/pull/4557)
+* Nexus: Support for Kagayaki cluster at JAIST [#4598](https://github.com/QMCPACK/qmcpack/pull/4598)
+* Nexus: GPU support fix for NERSC/Perlmutter [#4699](https://github.com/QMCPACK/qmcpack/pull/4699)
+* Nexus: Use simplices in convex_hull to support newer scipy versions [#4671](https://github.com/QMCPACK/qmcpack/pull/4671)
+* Nexus: Add pdos flag for Projwfc [#4655](https://github.com/QMCPACK/qmcpack/pull/4655)
+* Nexus: Adding crowds_serialize_walkers tag to dmc input list [#4651](https://github.com/QMCPACK/qmcpack/pull/4651)
+* Nexus: Qdens handles batched driver input/output [#4645](https://github.com/QMCPACK/qmcpack/pull/4645)
+* Nexus: Fix namelist read for Projwfc input [#4644](https://github.com/QMCPACK/qmcpack/pull/4644)
+
+### Known problems
+
+* When offload builds are compiled with CUDA toolkit versions above 11.2 using LLVM, multideterminant tests and functionality will
+  fail, seemingly due to an issue with the toolkit. This is discussed in https://github.com/llvm/llvm-project/issues/54633 . All
+  other functionality appears to work as expected. As a workaround, the CUDA toolkit 11.2 can be used. The actual NVIDIA drivers can
+  be more recent.
 
 ## [3.16.0] - 2023-01-31
 

diff --git a/CMakeLists.txt b/CMakeLists.txt
@@ -15,7 +15,7 @@ endif()
 ######################################################################
 project(
   qmcpack
-  VERSION 3.16.9
+  VERSION 3.17.9
   LANGUAGES C CXX)
 
 # add the automatically determined parts of the RPATH
@@ -658,9 +658,15 @@ else()
   set(HDF5_USE_STATIC_LIBRARIES off)
 endif()
 
-find_package(HDF5 1.10 COMPONENTS C)
+find_package(HDF5 COMPONENTS C) # Note: minimum version check is done below to bypass find_package
+                                # and HDF5 version compatibility subtleties
 
 if(HDF5_FOUND)
+  if(HDF5_VERSION)
+    if (HDF5_VERSION VERSION_LESS 1.10.0)
+      message(FATAL_ERROR "QMCPACK requires HDF5 version >= 1.10.0")
+    endif()
+  endif(HDF5_VERSION)
   if(HDF5_IS_PARALLEL)
     if(HAVE_MPI)
       message(STATUS "Parallel HDF5 library found")
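
The hunk above drops the version argument from `find_package` and instead compares `HDF5_VERSION` by hand, and only when that variable is set. A minimal Python sketch of that control flow follows, for illustration only. The names `check_hdf5` and `version_less` are hypothetical and not part of QMCPACK or CMake, and the sketch assumes plain dotted-integer version strings.

```python
def parse_version(text):
    """Turn a dotted version string such as '1.10.7' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def version_less(a, b):
    """Compare component by component, treating missing components as zero."""
    pa, pb = parse_version(a), parse_version(b)
    width = max(len(pa), len(pb))
    pa += (0,) * (width - len(pa))
    pb += (0,) * (width - len(pb))
    return pa < pb


def check_hdf5(found, version=None, minimum="1.10.0"):
    """Reject only when a version is reported and it is older than `minimum`.

    Hypothetical helper: `found` and `version` stand in for the CMake
    variables HDF5_FOUND and HDF5_VERSION used in the hunk.
    """
    if not found:
        return False  # library missing; the version check is never reached
    if version:  # an unset or empty version skips the check, as in the hunk
        if version_less(version, minimum):
            raise RuntimeError("QMCPACK requires HDF5 version >= " + minimum)
    return True
```

In this sketch an installation that reports no version is accepted, which matches the `if(HDF5_VERSION)` guard in the hunk.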

diff --git a/README.md b/README.md
@@ -16,9 +16,20 @@ particular emphasis is placed on code quality and reproducibility.
 
 # Obtaining and installing QMCPACK
 
- Obtain the latest release from https://github.com/QMCPACK/qmcpack/releases or clone the development source from
- https://github.com/QMCPACK/qmcpack. A full installation guide and steps to perform an initial QMC calculation are given in the
- [extensive online documentation for QMCPACK](https://qmcpack.readthedocs.io/en/develop/index.html).
+Obtain the latest release from https://github.com/QMCPACK/qmcpack/releases or clone the development source from
+https://github.com/QMCPACK/qmcpack. A full installation guide and steps to perform an initial QMC calculation are given in the
+[extensive online documentation for QMCPACK](https://qmcpack.readthedocs.io/en/develop/index.html).
+
+The [CHANGELOG.md](CHANGELOG.md) describes key changes made in each release as well as any major changes to the development version.
+
+# Documentation and support
+
+For more information, consult QMCPACK pages at http://www.qmcpack.org, the manual at
+https://qmcpack.readthedocs.io/en/develop/index.html, or its sources in the docs directory.
+
+If you have trouble using or building QMCPACK, or have questions about its use, please post to the [Google QMCPACK
+group](https://groups.google.com/forum/#!forum/qmcpack), create a GitHub issue at https://github.com/QMCPACK/qmcpack/issues or
+contact a developer.
 
 # Prerequisites
 
@@ -36,39 +47,35 @@ particular emphasis is placed on code quality and reproducibility.
 We aim to support open source compilers and libraries released within two years of each QMCPACK release. Use of software versions
 over two years old may work but is discouraged and untested. Proprietary compilers (Intel, NVHPC) are generally supported over the
 same period but may require use of an exact version. We also aim to support the standard software environments on machines such as
-Summit at OLCF, Theta at ALCF, and Cori at NERSC. Use of the most recently released compilers and library versions is particularly
-encouraged for highest performance and easiest configuration.
+Frontier and Summit at OLCF, Aurora and Polaris at ALCF, and Perlmutter at NERSC. Use of the most recently released compilers and
+library versions is particularly encouraged for highest performance and easiest configuration.
 
-Nightly testing currently includes the following software versions on x86:
+Nightly testing currently includes at least the following software versions:
 
 * Compilers
-  * GCC 11.2.0, 9.2.0
-  * Clang/LLVM 13.0.0
-  * Intel 19.1.1.217 configured to use C++ library from GCC 9.1.0 
-  * NVIDIA HPC SDK 21.5 configured to use C++ library from GCC 9.1.0
-* Boost 1.77.0, 1.68.0
-* HDF5 1.12.1
+  * GCC 13.2.0, 11.4.0
+  * Clang/LLVM 16.0.6
+* Boost 1.83.0, 1.77.0
+* HDF5 1.14.2
 * FFTW 3.3.10, 3.3.8
-* CMake 3.21.1, 3.15.0
+* CMake 3.27.4, 3.21.3
 * MPI
-  * OpenMPI 4.1.1, 3.1.6
-  * Intel MPI 19.1.1.217
-* CUDA 11.4
+  * OpenMPI 4.1.5
+* CUDA 11.2
 
-Workflow tests are performed with Quantum Espresso v6.8.0 and PySCF v1.7.5. These check trial wavefunction generation and
-conversion through to actual QMC runs.
+GitHub Actions-based tests include additional version combinations from within our two year support window. On a developmental basis
+we also check the latest Clang and GCC development versions, AMD Clang and Intel OneAPI compilers. 
 
-On a developmental basis we also check the latest Clang and GCC development versions, AMD AOMP and Intel OneAPI compilers.
+Workflow tests are currently performed with Quantum Espresso v7.2.0 and PySCF v2.2.0. These check trial wavefunction generation and
+conversion through to actual QMC runs.
 
 # Building with CMake
 
- The build system for QMCPACK is based on CMake.  It will auto-configure based on the detected compilers and libraries. Previously
- QMCPACK made extensive use of toolchains, but the system has since been updated to eliminate the use of toolchain files for most
- cases.  Specific compile options can be specified either through specific environment or CMake variables.  When the libraries are
- installed in standard locations, e.g., /usr, /usr/local, there is no need to set environment or CMake variables for the packages.
+The build system for QMCPACK is based on CMake.  It will auto-configure based on the detected compilers and libraries. When these 
+are installed in standard locations, e.g., /usr, /usr/local, there is no need to set either environment or CMake variables.
 
- See the manual linked at https://qmcpack.readthedocs.io/en/develop/ and https://www.qmcpack.org/documentation or buildable using
- sphinx from the sources in docs/. A PDF version is still available at https://qmcpack.readthedocs.io/_/downloads/en/develop/pdf/
+See the manual linked at https://qmcpack.readthedocs.io/en/develop/ and https://www.qmcpack.org/documentation or buildable using
+sphinx from the sources in docs/. A PDF version is still available at https://qmcpack.readthedocs.io/_/downloads/en/develop/pdf/
 
 ## Quick build
 
@@ -100,10 +107,10 @@ cmake ..
 make -j 8
 ```
 
- The complexities of modern computer hardware and software systems are
- such that you should check that the auto-configuration system has made
- good choices and picked optimized libraries and compiler settings
- before doing significant production. i.e. Check the details below.
+The complexities of modern computer hardware and software systems are
+such that you should check that the auto-configuration system has made
+good choices and picked optimized libraries and compiler settings
+before doing significant production, i.e., check the details below.
 
 ## Set the environment
 
@@ -336,14 +343,6 @@ Individual tests can be run by specifying their name
 ctest -R name-of-test-to-run
 ```
 
-# Documentation and support
-
-For more information, consult QMCPACK pages at http://www.qmcpack.org, the manual at
-https://qmcpack.readthedocs.io/en/develop/index.html, or its sources in the docs directory.
-
-If you have trouble using or building QMCPACK, or have questions about its use, please post to the [Google QMCPACK
-group](https://groups.google.com/forum/#!forum/qmcpack), create a GitHub issue at https://github.com/QMCPACK/qmcpack/issues or
-contact a developer.
 
 # Contributing
 

diff --git a/config/build_alcf_polaris_Clang.sh b/config/build_alcf_polaris_Clang.sh
@@ -57,12 +57,16 @@ if [[ $name == *"_MP"* ]]; then
   CMAKE_FLAGS="$CMAKE_FLAGS -DQMC_MIXED_PRECISION=ON"
 fi
 
+if [[ $name == *"offload"* || $name == *"cuda"* ]]; then
+  CMAKE_FLAGS="$CMAKE_FLAGS -DQMC_GPU_ARCHS=sm_80"
+fi
+
 if [[ $name == *"offload"* ]]; then
-  CMAKE_FLAGS="$CMAKE_FLAGS -DENABLE_OFFLOAD=ON -DUSE_OBJECT_TARGET=ON -DOFFLOAD_ARCH=sm_80"
+  CMAKE_FLAGS="$CMAKE_FLAGS -DENABLE_OFFLOAD=ON"
 fi
 
 if [[ $name == *"cuda"* ]]; then
-  CMAKE_FLAGS="$CMAKE_FLAGS -DENABLE_CUDA=ON -DCMAKE_CUDA_ARCHITECTURES=80"
+  CMAKE_FLAGS="$CMAKE_FLAGS -DENABLE_CUDA=ON"
 fi
 
 folder=build_${Machine}_${Compiler}_${name}

diff --git a/config/build_nersc_perlmutter_Clang.sh b/config/build_nersc_perlmutter_Clang.sh
@@ -0,0 +1,99 @@
+#!/bin/bash
+# This recipe is intended for NERSC Perlmutter https://docs.nersc.gov/systems/perlmutter
+# It builds all the variants of QMCPACK in the current directory
+# last revision: Aug 12th 2023
+#
+# How to invoke this script?
+# build_nersc_perlmutter_Clang.sh # build all the variants assuming the current directory is the source directory.
+# build_nersc_perlmutter_Clang.sh <source_dir> # build all the variants with a given source directory <source_dir>
+# build_nersc_perlmutter_Clang.sh <source_dir> <install_dir> # build all the variants with a given source directory <source_dir> and install to <install_dir>
+
+module load PrgEnv-gnu
+module load cray-libsci
+CRAY_LIBSCI_LIB=$CRAY_LIBSCI_PREFIX_DIR/lib/libsci_gnu_mp.so
+
+module load PrgEnv-llvm/0.1 llvm/16
+module load cray-fftw/3.3.10.3
+module load cray-hdf5-parallel/1.12.2.3
+module load cmake/3.24.3
+
+
+echo "**********************************"
+echo '$ clang -v'
+clang -v
+echo "**********************************"
+
+TYPE=Release
+Machine=perlmutter
+Compiler=Clang16
+
+if [[ $# -eq 0 ]]; then
+  source_folder=`pwd`
+elif [[ $# -eq 1 ]]; then
+  source_folder=$1
+else
+  source_folder=$1
+  install_folder=$2
+fi
+
+if [[ -f $source_folder/CMakeLists.txt ]]; then
+  echo Using QMCPACK source directory $source_folder
+else
+  echo "Source directory $source_folder doesn't contain CMakeLists.txt. Pass QMCPACK source directory as the first argument."
+  exit
+fi
+
+for name in offload_cuda_real_MP offload_cuda_real offload_cuda_cplx_MP offload_cuda_cplx \
+            cpu_real_MP cpu_real cpu_cplx_MP cpu_cplx
+do
+
+CMAKE_FLAGS="-DCMAKE_BUILD_TYPE=$TYPE -DBLAS_LIBRARIES=$CRAY_LIBSCI_LIB"
+
+if [[ $name == *"cplx"* ]]; then
+  CMAKE_FLAGS="$CMAKE_FLAGS -DQMC_COMPLEX=ON"
+fi
+
+if [[ $name == *"_MP"* ]]; then
+  CMAKE_FLAGS="$CMAKE_FLAGS -DQMC_MIXED_PRECISION=ON"
+fi
+
+if [[ $name == *"offload"* || $name == *"cuda"* ]]; then
+  CMAKE_FLAGS="$CMAKE_FLAGS -DQMC_GPU_ARCHS=sm_80"
+fi
+
+if [[ $name == *"offload"* ]]; then
+  CMAKE_FLAGS="$CMAKE_FLAGS -DENABLE_OFFLOAD=ON"
+fi
+
+if [[ $name == *"cuda"* ]]; then
+  CMAKE_FLAGS="$CMAKE_FLAGS -DENABLE_CUDA=ON"
+fi
+
+folder=build_${Machine}_${Compiler}_${name}
+
+if [[ -v install_folder ]]; then
+  CMAKE_FLAGS="$CMAKE_FLAGS -DCMAKE_INSTALL_PREFIX=$install_folder/$folder"
+fi
+
+echo "**********************************"
+echo "$folder"
+echo "$CMAKE_FLAGS"
+echo "**********************************"
+
+mkdir $folder
+cd $folder
+
+if [ ! -f CMakeCache.txt ] ; then
+cmake $CMAKE_FLAGS -DCMAKE_C_COMPILER=mpicc -DCMAKE_CXX_COMPILER=mpicxx $source_folder
+fi
+
+if [[ -v install_folder ]]; then
+  make -j16 install && chmod -R -w $install_folder/$folder
+else
+  make -j16
+fi
+
+cd ..
+
+echo
+done
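
The recipe above derives each build's CMake flags from substrings of the variant name. The same mapping can be sketched in Python to preview the flags without running the script. This is an illustration under assumptions, not part of the recipe: the function name `cmake_flags` and its parameters are hypothetical, while the flag strings are copied from the script.

```python
def cmake_flags(name, build_type="Release", blas_lib=None):
    """Return the CMake flags the recipe would assemble for one variant name.

    Hypothetical helper; `blas_lib` stands in for $CRAY_LIBSCI_LIB and is
    omitted from the result when not given.
    """
    flags = ["-DCMAKE_BUILD_TYPE=" + build_type]
    if blas_lib:
        flags.append("-DBLAS_LIBRARIES=" + blas_lib)
    if "cplx" in name:
        flags.append("-DQMC_COMPLEX=ON")
    if "_MP" in name:
        flags.append("-DQMC_MIXED_PRECISION=ON")
    # The GPU architecture is set once for both offload and CUDA builds.
    if "offload" in name or "cuda" in name:
        flags.append("-DQMC_GPU_ARCHS=sm_80")
    if "offload" in name:
        flags.append("-DENABLE_OFFLOAD=ON")
    if "cuda" in name:
        flags.append("-DENABLE_CUDA=ON")
    return flags
```

Calling `cmake_flags("offload_cuda_cplx_MP")` yields all five optional flags, while `cmake_flags("cpu_real")` yields only the build type.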
diff --git a/config/docker/dependencies/ubuntu22/openmpi/Dockerfile b/config/docker/dependencies/ubuntu22/openmpi/Dockerfile
@@ -11,11 +11,11 @@ RUN wget https://apt.kitware.com/kitware-archive.sh &&\
     sh kitware-archive.sh
 
 RUN export DEBIAN_FRONTEND=noninteractive &&\
-    apt-get install gcc g++ \ 
-    clang \
-    clang-format \
-    clang-tidy \
-    libomp-dev \
+    apt-get install gcc-9 g++-9 \ 
+    clang-14 \
+    clang-format-14 \
+    clang-tidy-14 \
+    libomp-14-dev \
     gcovr \
     python3 \
     cmake \
@@ -49,6 +49,18 @@ RUN export DEBIAN_FRONTEND=noninteractive &&\
 RUN export DEBIAN_FRONTEND=noninteractive &&\
     pip3 install cif2cell
 
+RUN update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-9 100 && \
+    update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-9 100
+
+# add clang-14 as clang
+RUN update-alternatives --install /usr/bin/clang clang /usr/bin/clang-14 100 && \
+    update-alternatives --install /usr/bin/clang++ clang++ /usr/bin/clang++-14 100
+
+# add clang-format and clang-tidy as well as libomp
+RUN update-alternatives --install /usr/bin/clang-format clang-format /usr/bin/clang-format-14 100 && \
+    update-alternatives --install /usr/bin/clang-tidy clang-tidy /usr/bin/clang-tidy-14 100 && \
+    update-alternatives --install /usr/bin/clang-tidy-diff.py clang-tidy-diff.py /usr/bin/clang-tidy-diff-14.py 100
+
 # must add a user different from root 
 # to run MPI executables
 RUN useradd -ms /bin/bash user

diff --git a/nexus/lib/machines.py b/nexus/lib/machines.py
@@ -2314,7 +2314,7 @@ def write_job_header(self,job):
 echo $SLURM_SUBMIT_DIR
 cd $SLURM_SUBMIT_DIR
 '''
-        if job.threads>1:
+        if (job.threads>1) and ('cpu' in job.constraint):
             c+='''
 export OMP_PROC_BIND=true
 export OMP_PLACES=threads