A C++ library that implements the Kernel Aggregated Fast Multipole Method, built on the PVFMM library.
It computes the classic kernel sum problem: for a given set of single layer sources
Note For some problems the kernels
This package computes Laplace kernel
Here is a detailed table, in which the summation
In the table:
- NA means the input is ignored.
- $Q_{ij}$ and $D_{ij}$ are 3x3 tensors written as 9-dimensional vectors in row-major format.
- $\nabla\nabla p$ is symmetric, so it is written as $p_{,xx},p_{,xy},p_{,xz},p_{,yy},p_{,yz},p_{,zz}$.
- For the `RPY`, `StokesRegVel`, and `StokesRegVelOmega` kernels, the parameters $b$ and $\epsilon$ can be different for each source point, and the summations are nonlinear functions of $b$ and $\epsilon$. Also, $b$ and $\epsilon$ must be much smaller than the lowest-level leaf box of the adaptive octree; otherwise the convergence property of KIFMM is invalidated.
- For all kernels, the electrostatic conductivity and fluid viscosity are ignored (set to 1).
- The regularized Stokeslet velocity is $u_i = G_{ij}^\epsilon f_j$, with $G_{ij}^\epsilon = \dfrac{1}{8\pi}\left[\dfrac{r^{2}+2 \epsilon^{2}}{\left(r^{2}+\epsilon^{2}\right)^{3 / 2}} \delta_{i j}+\dfrac{r_i r_j}{\left(r^{2}+\epsilon^{2}\right)^{3 / 2}}\right]$.
- For the Stokes `PVel`, `PVelGrad`, `PVelLaplacian`, and `Traction` kernels, the pressure and velocity fields are:

$$ p=\frac{1}{4 \pi} \frac{r_{j}}{r^{3}} f_{j} + \frac{1}{4 \pi}\left(-3 \frac{r_{j} r_{k}}{r^{5}}+\frac{\delta_{j k}}{r^{3}}\right) D_{j k}, \quad u_{i}=G_{ij}f_j + \frac{1}{8 \pi \mu}\left(-\frac{r_{i}}{r^{3}} trD\right) + \frac{1}{8 \pi \mu}\left[-\frac{3 r_{i} r_{j} r_{k}}{r^{5}}\right] D_{j k} $$
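As a small cross-check for the regularized Stokeslet described in the notes above, the following is a minimal sketch of an $O(N^2)$ direct sum. It assumes unit viscosity and the common form in which the $1/(8\pi)$ prefactor multiplies both terms. The function name `regStokesVelDirect` and the `Vec3` alias are hypothetical and are not part of the STKFMM API.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec3 = std::array<double, 3>;

// Velocity at `trg` induced by regularized point forces, by direct summation.
// Assumption: viscosity = 1 and
//   u_i = 1/(8 pi) * [ (r^2 + 2 eps^2) delta_ij + r_i r_j ] f_j / (r^2 + eps^2)^(3/2).
// Each source carries its own eps, as the notes above allow.
Vec3 regStokesVelDirect(const std::vector<Vec3> &src, const std::vector<Vec3> &force,
                        const std::vector<double> &eps, const Vec3 &trg) {
    assert(src.size() == force.size() && src.size() == eps.size());
    const double pi = std::acos(-1.0);
    Vec3 u{0.0, 0.0, 0.0};
    for (std::size_t s = 0; s < src.size(); ++s) {
        // Separation vector from source to target.
        const Vec3 r{trg[0] - src[s][0], trg[1] - src[s][1], trg[2] - src[s][2]};
        const double r2 = r[0] * r[0] + r[1] * r[1] + r[2] * r[2];
        const double e2 = eps[s] * eps[s];
        const double denom = 8.0 * pi * std::pow(r2 + e2, 1.5);
        const double rdotf = r[0] * force[s][0] + r[1] * force[s][1] + r[2] * force[s][2];
        for (int i = 0; i < 3; ++i) {
            u[i] += ((r2 + 2.0 * e2) * force[s][i] + rdotf * r[i]) / denom;
        }
    }
    return u;
}
```

With a positive `eps` the result stays finite even when the target coincides with a source, which is the point of the regularization. With `eps = 0` the expression reduces to the singular Stokeslet and must not be evaluated at a source location.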
| Kernel | Single Layer Source (dim) | Double Layer Source (dim) | Summation | Target Value (dim) |
|---|---|---|---|---|
| `LapPGrad` | | | | |
| `LapPGradGrad` | | | | |
| `LapQPGradGrad` | | NA | | |
| `Stokes` | | NA | | |
| `RPY` | | NA | | |
| `StokesRegVel` | | NA | | |
| `StokesRegVelOmega` | | NA | See Appendix A of doi 10.1016/j.jcp.2012.12.026 | |
| `PVel` | | | see above | |
| `PVelGrad` | | | see above | |
| `PVelLaplacian` | | | see above | |
| `Traction` | | | see above | |
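The row-major convention for 3x3 tensor inputs and the six-component form of the symmetric $\nabla\nabla p$ can be illustrated with a short sketch. The helper names below (`rowMajorIndex`, `packRowMajor`, `packSymmetric`) are hypothetical and only show the index ordering described in the notes; they are not functions of this library.

```cpp
#include <array>
#include <cassert>

using Mat3 = std::array<std::array<double, 3>, 3>;

// Position of entry (i, j) of a 3x3 tensor inside its 9-component row-major vector.
int rowMajorIndex(int i, int j) {
    assert(i >= 0 && i < 3 && j >= 0 && j < 3);
    return 3 * i + j;
}

// Flatten a 3x3 tensor as [xx, xy, xz, yx, yy, yz, zx, zy, zz].
std::array<double, 9> packRowMajor(const Mat3 &m) {
    std::array<double, 9> v{};
    for (int i = 0; i < 3; ++i) {
        for (int j = 0; j < 3; ++j) {
            v[rowMajorIndex(i, j)] = m[i][j];
        }
    }
    return v;
}

// Keep only the upper triangle of a symmetric tensor, ordered xx, xy, xz, yy, yz, zz.
std::array<double, 6> packSymmetric(const Mat3 &m) {
    return {m[0][0], m[0][1], m[0][2], m[1][1], m[1][2], m[2][2]};
}
```

For example, entry $(1,2)$ of a tensor lands at position 5 of the 9-component vector and at position 4 of the 6-component symmetric form.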
- All kernels are hand-written with optimized SIMD intrinsic instructions.
- Single, double, and triple periodicity through a unified interface.
- Supports a no-slip boundary condition imposed on a flat wall through the image method.
- Single Layer and Double Layer potentials are simultaneously calculated through a single octree.
- M2M, M2L, L2L operations are combined into single layer operations only.
- All PVFMM data structures are wrapped in a single class.
- Multiple kernels can be activated simultaneously.
- Complete MPI and OpenMP support.
This library defines an abstract base class `STKFMM` for the common interface and utility functions. Two concrete derived classes, `Stk3DFMM` and `StkWallFMM`, are defined for two separate cases: 3D spatial FMM and Stokes FMM with a no-slip boundary condition imposed on a flat wall.
For details of usage, look at the function `runFMM()` in `Test/Test.cpp`.
Instructions here.
PAXIS paxis = PAXIS::NONE; // or other bc
int k = asInteger(KERNEL::Stokes) | asInteger(KERNEL::RPY); // bitwise | operator, other combinations also work
Construct an `STKFMM` object with the chosen BC and kernels, picking the derived class according to whether you need the no-slip wall.
std::shared_ptr<STKFMM> fmmPtr;
if (wall) {
fmmPtr = std::make_shared<StkWallFMM>(p, maxPoints, paxis, k);
} else {
fmmPtr = std::make_shared<Stk3DFMM>(p, maxPoints, paxis, k);
}
- `order` (passed as `p` above): number of equivalent points on each cubic octree box edge of KIFMM, usually chosen from $8,10,12$. This affects the trade-off between accuracy and computation time.
- `maxPts` (passed as `maxPoints` above): maximum number of points in an octree leaf box, usually $500\sim2000$. This affects the depth of the adaptive octree, and thus the computation time.
- `PAXIS::NONE`: the axis of the periodic BC. For periodic boundary conditions, replace `NONE` with `PX`, `PXY`, or `PXYZ`.
- `KERNEL::PVel | KERNEL::LapPGrad`: a combination of supported kernels, using the bitwise-or operator `|`.
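The kernel argument is a bit mask, so each kernel is expected to occupy its own bit. The sketch below shows that idea in isolation. `DemoKernel`, `toInteger`, and `isActive` are hypothetical stand-ins written for this example, and the numeric values are assumptions rather than the library's actual `KERNEL` values.

```cpp
#include <type_traits>

// Hypothetical stand-in for the library's kernel enum. The names echo kernels
// listed in this README; the bit positions are illustrative assumptions.
enum class DemoKernel : unsigned {
    LapPGrad = 1u << 0,
    Stokes = 1u << 1,
    RPY = 1u << 2,
    PVel = 1u << 3
};

// Convert a scoped enum to its underlying integer so that it can be OR-ed.
template <typename E>
constexpr typename std::underlying_type<E>::type toInteger(E e) {
    return static_cast<typename std::underlying_type<E>::type>(e);
}

// True if the bit belonging to kernel `k` is set in `mask`.
constexpr bool isActive(unsigned mask, DemoKernel k) {
    return (mask & toInteger(k)) != 0u;
}

// Activate two kernels at once, then query the mask at compile time.
constexpr unsigned demoMask = toInteger(DemoKernel::Stokes) | toInteger(DemoKernel::RPY);
static_assert(isActive(demoMask, DemoKernel::RPY), "RPY bit should be set");
static_assert(!isActive(demoMask, DemoKernel::PVel), "PVel bit should be clear");
```

Scoped enums do not convert to integers implicitly, which is why an explicit conversion helper is used before applying `|`.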
double origin[3] = {x0, y0, z0};
fmmPtr->setBox(origin, box);
- if both SL and DL points exist:
fmmPtr->setPoints(nSL, point.srcLocalSL.data(), nTrg, point.trgLocal.data(), nDL, point.srcLocalDL.data());
- if no DL points:
fmmPtr->setPoints(nSL, point.srcLocalSL.data(), nTrg, point.trgLocal.data());
- For `Stk3DFMM`, all points must lie in the cube defined by [x0,x0+box)$\times$[y0,y0+box)$\times$[z0,z0+box).
- For `StkWallFMM`, all points must lie in the half cube defined by [x0,x0+box)$\times$[y0,y0+box)$\times$[z0,z0+box/2), and the no-slip boundary condition is always imposed at the z0 plane.
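The two domain constraints above can be checked before points are handed to the FMM. The following is a minimal sketch; the function name `insideDomain` is hypothetical, and it only encodes the half-open intervals stated in the list.

```cpp
#include <array>
#include <cassert>

// True if point `p` lies in [x0, x0+box) x [y0, y0+box) x [z0, z0+zExtent),
// where zExtent is `box` for the free-space case and `box / 2` for the wall case.
bool insideDomain(const std::array<double, 3> &origin, double box,
                  const std::array<double, 3> &p, bool wall) {
    assert(box > 0.0);
    const double extent[3] = {box, box, wall ? 0.5 * box : box};
    for (int d = 0; d < 3; ++d) {
        // Lower bound is inclusive, upper bound is exclusive.
        if (p[d] < origin[d] || p[d] >= origin[d] + extent[d]) {
            return false;
        }
    }
    return true;
}
```

A point with `z = z0 + 0.75 * box` passes the check for the cube but fails it for the half cube, and a point exactly on an upper face is rejected in both cases.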
fmmPtr->setupTree(KERNEL::Stokes);
fmmPtr->evaluateFMM(kernel, nSL, value.srcLocalSL.data(), nTrg, trgLocal.data(), nDL, value.srcLocalDL.data());
`nDL` and the values for DL sources are ignored if the chosen kernel does not support DL.
In these tables:
- `SL Neutral` means the sum of each component of the SL sources within the box must be zero.
- $trD$ Neutral means the sum of $trD$ within the box must be zero.
- $D_{jj}$ Neutral means the sum of the trace of the DL sources $D_{jk}$ within the box must be zero.
- `Yes` means no requirements.
For `Stk3DFMM`:

| Kernel | `PNONE` | `PX` | `PXY` | `PXYZ` |
|---|---|---|---|---|
| `LapPGrad` | Yes | SL Neutral | SL Neutral | SL Neutral |
| `LapPGradGrad` | Yes | SL Neutral | SL Neutral | SL Neutral |
| `LapQPGradGrad` | Yes | SL Neutral | SL Neutral | SL Neutral |
| `Stokes` | Yes | SL Neutral | SL Neutral | Yes |
| `RPY` | Yes | SL Neutral | SL Neutral | Yes |
| `StokesRegVel` | Yes | SL Neutral | SL Neutral | Yes |
| `StokesRegVelOmega` | Yes | SL Neutral | SL Neutral | Yes Neutral |
| `PVel` | Yes | SL, | SL, | |
| `PVelGrad` | Yes | SL, | SL, | |
| `PVelLaplacian` | Yes | SL, | SL, | |
| `Traction` | Yes | SL, | SL, | |
For `StkWallFMM`:

| Kernel | `PNONE` | `PX` | `PXY` | `PXYZ` |
|---|---|---|---|---|
| `Stokes` | Yes | Yes | Yes | No |
| `RPY` | Yes | Yes | Yes | No |
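The `SL Neutral` requirement can be tested on the input before running a periodic case. The sketch below assumes the source values are stored point by point, with `dim` consecutive components per point; that layout, the function name `isSLNeutral`, and the tolerance are assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Returns true if every component of the SL source values sums to zero
// (within `tol`) over all points.
// Assumed layout: [p0c0, p0c1, ..., p1c0, p1c1, ...] with `dim` values per point.
bool isSLNeutral(const std::vector<double> &srcValue, std::size_t dim, double tol = 1e-12) {
    assert(dim > 0 && srcValue.size() % dim == 0);
    std::vector<double> sum(dim, 0.0);
    for (std::size_t i = 0; i < srcValue.size(); ++i) {
        sum[i % dim] += srcValue[i];  // accumulate per component
    }
    for (double s : sum) {
        if (std::fabs(s) > tol) {
            return false;
        }
    }
    return true;
}
```

With MPI, the per-component sums would additionally have to be reduced across ranks before the comparison, since the requirement applies to all sources within the box.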
- Install the `develop` branch (b9de1a) of `pvfmm` with cmake. If you install `pvfmm` with GNU automake, you will have to manually help `STKFMM` discover `pvfmm`.

If PVFMM is properly installed, you should be able to compile this project using the `CMakeLists.txt`. The script `do-cmake.sh` is an example of how to invoke the `cmake` command with optional features (python interface and doxygen documentation).
To run the test driver, go to the build folder and type:
./Test/TestFMM.X --help
Test Driver for Stk3DFMM and StkWallFMM
Usage: ./Test/TestFMM.X [OPTIONS]
Options:
-h,--help Print this help message and exit
--config config file name
-S,--nsl INT number of source SL points
-D,--ndl INT number of source DL points
-T,--ntrg INT number of source TRG points
-B,--box FLOAT testing cubic box edge length
-O,--origin [FLOAT,FLOAT,FLOAT]
testing cubic box origin point
-K,--kernel INT test which kernels
-P,--pbc INT periodic boundary condition. 0=none, 1=PX, 2=PXY, 3=PXYZ
-M,--maxOrder INT max KIFMM order, must be even number. Default 16.
--eps FLOAT epsilon or a for Regularized and RPY kernels
--max INT max number of points in an octree leaf box
--seed INT seed for random number generator
--dist [FLOAT,FLOAT] parameters for the random distribution
--type INT type of random distribution, Uniform = 1, LogNormal = 2, Gaussian = 3, Ellipse = 4
--direct,--no-direct{false} run O(N^2) direct summation with S2T kernels
--verify,--no-verify{false} verify results with O(N^2) direct summation
--convergence,--no-convergence{false}
calculate convergence error relative to FMM at maxOrder
--random,--no-random{false} use random points, otherwise regular mesh
--dump,--no-dump{false} write src/trg coord and values to files
--wall,--no-wall{false} test StkWallFMM, otherwise Stk3DFMM
The help output above lists the possible test options. Several test configuration files are included in the folder `Config`, and can be loaded by `TestFMM.X` like this:
./Test/TestFMM.X --config ../Config/Verify.toml
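The `--verify` and `--convergence` options compare FMM results against a reference, either the direct summation or the FMM at `maxOrder`. One common way to express such a comparison is a relative L2 error, sketched below. This is an illustrative metric under that assumption; the error measure actually reported by the test driver is not stated here and may differ. The function name `relativeL2Error` is hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Relative L2 error: sqrt( sum (value - reference)^2 / sum reference^2 ).
// Assumes both vectors have the same length and the reference is not all zeros.
double relativeL2Error(const std::vector<double> &value, const std::vector<double> &reference) {
    assert(value.size() == reference.size());
    double diff2 = 0.0;
    double ref2 = 0.0;
    for (std::size_t i = 0; i < value.size(); ++i) {
        const double d = value[i] - reference[i];
        diff2 += d * d;
        ref2 += reference[i] * reference[i];
    }
    assert(ref2 > 0.0);
    return std::sqrt(diff2 / ref2);
}
```

A relative measure is convenient here because the target values of different kernels can differ by orders of magnitude.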
For large-scale convergence tests of all possible BCs (roughly 100 GB of memory will be used, and a lot of precomputed data will be generated on the first run):
./Test/TestFMM.X --config ../Config/BenchP0.toml
The options in the config toml file can be overridden by extra flags, for example, use other boundary conditions:
./Test/TestFMM.X --config ../Config/BenchP0.toml -P 1
./Test/TestFMM.X --config ../Config/BenchP0.toml -P 2
./Test/TestFMM.X --config ../Config/BenchP0.toml -P 3
`TestFMM.X` writes a `TestLog.json` file, which can be loaded into python for convenient performance/accuracy analysis and plotting.

Note: If your machine's memory is limited (<24GB), use a smaller number of points and test one kernel at a time.
`STKFMM` has a few optional features that can be turned on or off during the cmake configuration stage with the following switches:
-D BUILD_TEST=ON \
-D BUILD_DOC=OFF \
-D BUILD_M2L=OFF \
-D PyInterface=OFF \
By default, only `BUILD_TEST` is turned on.
- If you need the doxygen documentation, set `BUILD_DOC=ON`.
- If you want to generate the periodicity precomputed `M2L` data yourself, set `BUILD_M2L=ON`. In this case you will have to install the linear algebra library `Eigen`. If you do not want to generate the periodicity precomputed data yourself, you can download the `M2C.7z` file from https://zenodo.org/record/6338525#.YijCaXrMJD8 and unzip all data files to the folder `$PVFMM_DIR/pdata`.
- If you want to call this library from python, set `PyInterface=ON`. In this case you need some basic python facilities. Here is a basic example of a `requirements.txt` for a python virtualenv:
argh==0.26.2
h5py==2.9.0
llvmlite==0.32.1
mpi4py==3.0.2
numba==0.49.1
numpy==1.18.4
scipy==1.4.1
six==1.15.0
We thank Dhairya Malhotra and Alex Barnett for useful coding instructions and discussions.