Merge pull request #65 from ulrikeyang/main
Updated FOM and replaced results
ulrikeyang authored Oct 24, 2023
2 parents af80a19 + 179eeb0 commit 4a71178
Showing 11 changed files with 94 additions and 189 deletions.
107 changes: 6 additions & 101 deletions doc/sphinx/02_amg/amg.rst
@@ -9,7 +9,7 @@ Purpose
=======

The AMG2023 benchmark consists of a driver (amg.c), a simple Makefile, and documentation. It is available at https://github.com/LLNL/AMG2023 .
It requires an installation of hypre 2.27.0.
It requires an installation of hypre 2.27.0 or higher.
It uses hypre's parallel algebraic multigrid (AMG) solver BoomerAMG in combination with a Krylov solver to solve
two linear systems arising from diffusion problems on a cuboid discretized by finite differences.
The problems are set up through hypre's linear-algebraic IJ interface. The problem sizes can be controlled from the command line.
@@ -30,10 +30,11 @@ The problem sizes for both problems can be set by the user from the command line
Figure of Merit
---------------

The figures of merit (FOM_setup and FOM_solve) are calculated using the total number of nonzeros for all system matrices and interpolation operators on all levels of AMG, AMG setup wall clock time (FOM_setup), and AMG solve phase time and number of iterations (FOM_solve).
FOM_setup and FOM_solve are qualitatively different. FOM_solve represents a number in the order of the throughput of an average iteration in the solve phase, i.e., approximately flops times a constant. FOM_setup is more complex, since the setup phase also contains many integer computations and if statements to generate data structures, determine neighbor processes, etc. All of this is already available in the solve phase. For this reason, we will report both below.
The figure of merit (FOM) is calculated using the total number of nonzeros for all system matrices and interpolation operators on all levels of AMG (NNZ), AMG setup wall clock time (Setup_time), and AMG solve phase wall clock time (Solve_time).
Since in time-dependent problems the AMG preconditioner might be used for several solves before it needs to be reevaluated, a parameter k has also been included to simulate computation of a time-dependent problem that reuses the preconditioner for an average of k time steps.

The total FOM is evaluated as follows: FOM = (FOM_setup + FOM_solve)/2.
The total FOM is evaluated as follows: FOM = NNZ / (Setup_time + k * Solve_time).
The parameter k is set to 1 in Problem 1 and to 3 in Problem 2.
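
The updated formula can be illustrated with a minimal sketch. The function name ``amg_fom`` and the input values below are hypothetical and chosen only to show the arithmetic; they are not taken from the benchmark driver or from any measured run.

```python
def amg_fom(nnz: float, setup_time: float, solve_time: float, k: int) -> float:
    """Return FOM = NNZ / (Setup_time + k * Solve_time).

    nnz        -- total nonzeros over all AMG levels (matrices and interpolation)
    setup_time -- AMG setup wall clock time in seconds
    solve_time -- AMG solve phase wall clock time in seconds
    k          -- assumed number of solves that reuse one setup
    """
    total_time = setup_time + k * solve_time
    if total_time <= 0:
        raise ValueError("Setup_time + k * Solve_time must be positive")
    return nnz / total_time


# Hypothetical inputs, for illustration only (not measured results).
nnz = 1.0e8
setup_time = 0.5
solve_time = 0.25

fom_problem1 = amg_fom(nnz, setup_time, solve_time, k=1)  # k = 1 as in Problem 1
fom_problem2 = amg_fom(nnz, setup_time, solve_time, k=3)  # k = 3 as in Problem 2
print(f"k=1: {fom_problem1:.3E}")
print(f"k=3: {fom_problem2:.3E}")
```

With identical inputs, a larger k puts more weight on the solve phase, so the setup cost is amortized over more solves.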

Building
========
@@ -337,110 +338,14 @@ V-100
-----

We have also performed runs on 1 NVIDIA V-100 GPU increasing the problem size n x n x n.
For these runs hypre 2.27.0 was configured as follows:
For these runs hypre 2.29.0 was configured as follows:

``configure --with-cuda``

We increased n by 10 starting with n=50 for Problem 1 and with n=80 for Problem 2 until we ran out of memory.
Note that Problem 2 uses much less memory, since the original matrix has at most 7 coefficients per row vs 27 for Problem 1.
In addition, aggressive coarsening is used on the first level, significantly decreasing memory usage at the cost of an increased number of iterations.

.. table:: FOMs, times and number of iterations for Problem 1 with grid size n x n x n on 1 V-100

+---------+-----------+-----------+-----------+------------+------------+------------+
| n | FOM | FOM_setup | FOM_solve | setup time | solve time | iterations |
+---------+-----------+-----------+-----------+------------+------------+------------+
| 50 | 2.701E+09 | 9.708E+07 | 5.304E+09 | 0.068 | 0.024 | 19 |
+---------+-----------+-----------+-----------+------------+------------+------------+
| 60 | 3.654E+09 | 1.348E+08 | 7.172E+09 | 0.086 | 0.031 | 19 |
+---------+-----------+-----------+-----------+------------+------------+------------+
| 70 | 4.745E+09 | 1.504E+08 | 9.340E+09 | 0.123 | 0.038 | 19 |
+---------+-----------+-----------+-----------+------------+------------+------------+
| 80 | 4.582E+09 | 2.000E+08 | 8.964E+09 | 0.139 | 0.059 | 19 |
+---------+-----------+-----------+-----------+------------+------------+------------+
| 90 | 5.987E+09 | 2.190E+08 | 1.176E+10 | 0.181 | 0.064 | 19 |
+---------+-----------+-----------+-----------+------------+------------+------------+
| 100 | 6.574E+09 | 2.702E+08 | 1.288E+10 | 0.202 | 0.080 | 19 |
+---------+-----------+-----------+-----------+------------+------------+------------+
| 110 | 6.856E+09 | 3.026E+08 | 1.341E+10 | 0.240 | 0.103 | 19 |
+---------+-----------+-----------+-----------+------------+------------+------------+
| 120 | 7.181E+09 | 3.359E+08 | 1.403E+10 | 0.281 | 0.128 | 19 |
+---------+-----------+-----------+-----------+------------+------------+------------+
| 130 | 7.377E+09 | 3.709E+08 | 1.438E+10 | 0.324 | 0.159 | 19 |
+---------+-----------+-----------+-----------+------------+------------+------------+
| 140 | 7.425E+09 | 3.907E+08 | 1.446E+10 | 0.385 | 0.198 | 19 |
+---------+-----------+-----------+-----------+------------+------------+------------+
| 150 | 7.630E+09 | 4.108E+08 | 1.485E+10 | 0.451 | 0.237 | 19 |
+---------+-----------+-----------+-----------+------------+------------+------------+
| 160 | 7.738E+09 | 4.255E+08 | 1.505E+10 | 0.528 | 0.284 | 19 |
+---------+-----------+-----------+-----------+------------+------------+------------+
| 170 | 7.812E+09 | 4.372E+08 | 1.519E+10 | 0.617 | 0.338 | 19 |
+---------+-----------+-----------+-----------+------------+------------+------------+
| 180 | 7.878E+09 | 4.429E+08 | 1.531E+10 | 0.724 | 0.398 | 19 |
+---------+-----------+-----------+-----------+------------+------------+------------+
| 190 | 7.895E+09 | 4.526E+08 | 1.534E+10 | 0.834 | 0.468 | 19 |
+---------+-----------+-----------+-----------+------------+------------+------------+
| 200 | 7.957E+09 | 4.593E+08 | 1.546E+10 | 0.959 | 0.542 | 19 |
+---------+-----------+-----------+-----------+------------+------------+------------+


.. table:: FOMs, times and number of iterations for Problem 2 with grid size n x n x n on 1 V-100

+---------+-----------+-----------+-----------+------------+------------+------------+
| n | FOM | FOM_setup | FOM_solve | setup time | solve time | iterations |
+---------+-----------+-----------+-----------+------------+------------+------------+
| 80 | 2.669E+09 | 5.841E+07 | 5.280E+09 | 0.096 | 0.032 | 30 |
+---------+-----------+-----------+-----------+------------+------------+------------+
| 90 | 3.063E+09 | 6.953E+07 | 6.057E+09 | 0.115 | 0.038 | 29 |
+---------+-----------+-----------+-----------+------------+------------+------------+
| 100 | 3.481E+09 | 8.562E+07 | 6.876E+09 | 0.135 | 0.047 | 30 |
+---------+-----------+-----------+-----------+------------+------------+------------+
| 110 | 3.831E+09 | 9.717E+07 | 7.564E+09 | 0.153 | 0.060 | 31 |
+---------+-----------+-----------+-----------+------------+------------+------------+
| 120 | 3.693E+09 | 1.068E+08 | 7.279E+09 | 0.178 | 0.081 | 31 |
+---------+-----------+-----------+-----------+------------+------------+------------+
| 130 | 4.375E+09 | 1.126E+08 | 8.636E+09 | 0.215 | 0.087 | 31 |
+---------+-----------+-----------+-----------+------------+------------+------------+
| 140 | 4.547E+09 | 1.284E+08 | 8.967E+09 | 0.236 | 0.105 | 31 |
+---------+-----------+-----------+-----------+------------+------------+------------+
| 150 | 4.753E+09 | 1.448E+08 | 9.361E+09 | 0.257 | 0.127 | 32 |
+---------+-----------+-----------+-----------+------------+------------+------------+
| 160 | 4.879E+09 | 1.598E+08 | 9.600E+09 | 0.273 | 0.150 | 32 |
+---------+-----------+-----------+-----------+------------+------------+------------+
| 170 | 4.985E+09 | 1.685E+08 | 9.801E+09 | 0.322 | 0.183 | 33 |
+---------+-----------+-----------+-----------+------------+------------+------------+
| 180 | 5.094E+09 | 1.702E+08 | 1.001E+10 | 0.366 | 0.213 | 33 |
+---------+-----------+-----------+-----------+------------+------------+------------+
| 190 | 5.158E+09 | 1.874E+08 | 1.013E+10 | 0.405 | 0.247 | 33 |
+---------+-----------+-----------+-----------+------------+------------+------------+
| 200 | 5.191E+09 | 1.996E+08 | 1.018E+10 | 0.444 | 0.287 | 33 |
+---------+-----------+-----------+-----------+------------+------------+------------+
| 210 | 5.239E+09 | 2.071E+08 | 1.027E+10 | 0.495 | 0.330 | 33 |
+---------+-----------+-----------+-----------+------------+------------+------------+
| 220 | 5.185E+09 | 2.123E+08 | 1.016E+10 | 0.556 | 0.383 | 33 |
+---------+-----------+-----------+-----------+------------+------------+------------+
| 230 | 5.173E+09 | 2.176E+08 | 1.013E+10 | 0.620 | 0.453 | 34 |
+---------+-----------+-----------+-----------+------------+------------+------------+
| 240 | 5.148E+09 | 2.227E+08 | 1.007E+10 | 0.688 | 0.517 | 34 |
+---------+-----------+-----------+-----------+------------+------------+------------+
| 250 | 5.139E+09 | 2.285E+08 | 1.005E+10 | 0.758 | 0.586 | 34 |
+---------+-----------+-----------+-----------+------------+------------+------------+
| 260 | 5.168E+09 | 2.293E+08 | 1.011E+10 | 0.850 | 0.656 | 34 |
+---------+-----------+-----------+-----------+------------+------------+------------+
| 270 | 5.173E+09 | 2.311E+08 | 1.012E+10 | 0.945 | 0.756 | 35 |
+---------+-----------+-----------+-----------+------------+------------+------------+
| 280 | 5.198E+09 | 2.356E+08 | 1.016E+10 | 1.034 | 0.839 | 35 |
+---------+-----------+-----------+-----------+------------+------------+------------+
| 290 | 5.221E+09 | 2.382E+08 | 1.020E+10 | 1.137 | 0.929 | 35 |
+---------+-----------+-----------+-----------+------------+------------+------------+
| 300 | 5.230E+09 | 2.419E+08 | 1.022E+10 | 1.239 | 1.027 | 35 |
+---------+-----------+-----------+-----------+------------+------------+------------+
| 310 | 5.246E+09 | 2.435E+08 | 1.025E+10 | 1.359 | 1.130 | 35 |
+---------+-----------+-----------+-----------+------------+------------+------------+
| 320 | 5.255E+09 | 2.447E+08 | 1.027E+10 | 1.487 | 1.241 | 35 |
+---------+-----------+-----------+-----------+------------+------------+------------+


The FOMs of AMG2023 on V100 for Problem 1 are provided in the following table and figure:

.. csv-table:: AMG2023 FOM on V100 for Problem 1 (27-pt stencil, AMG-GMRES)
32 changes: 16 additions & 16 deletions doc/sphinx/02_amg/gpu1.csv
@@ -1,17 +1,17 @@
n,FOM
50,2.701E+09
60,3.654E+09
70,4.745E+09
80,4.582E+09
90,5.987E+09
100,6.574E+09
110,6.856E+09
120,7.181E+09
130,7.377E+09
140,7.425E+09
150,7.630E+09
160,7.738E+09
170,7.812E+09
180,7.878E+09
190,7.895E+09
200,7.957E+09
50,7.135E+07
60,9.863E+07
70,1.199E+08
80,1.397E+08
90,1.629E+08
100,1.858E+08
110,2.104E+08
120,2.317E+08
130,2.496E+08
140,2.597E+08
150,2.668E+08
160,2.733E+08
170,2.827E+08
180,2.858E+08
190,2.890E+08
200,2.925E+08
50 changes: 25 additions & 25 deletions doc/sphinx/02_amg/gpu2.csv
@@ -1,26 +1,26 @@
n,FOM
80,2.669E+09
90,3.063E+09
100,3.481E+09
110,3.831E+09
120,3.693E+09
130,4.375E+09
140,4.547E+09
150,4.753E+09
160,4.879E+09
170,4.985E+09
180,5.094E+09
190,5.158E+09
200,5.191E+09
210,5.239E+09
220,5.185E+09
230,5.173E+09
240,5.148E+09
250,5.139E+09
260,5.168E+09
270,5.173E+09
280,5.198E+09
290,5.221E+09
300,5.230E+09
310,5.246E+09
320,5.255E+09
80,2.931E+07
90,3.493E+07
100,4.070E+07
110,4.437E+07
120,4.511E+07
130,5.104E+07
140,5.510E+07
150,5.842E+07
160,6.075E+07
170,6.276E+07
180,6.530E+07
190,6.652E+07
200,6.823E+07
210,6.903E+07
220,6.949E+07
230,6.821E+07
240,6.856E+07
250,6.871E+07
260,6.910E+07
270,6.801E+07
280,6.825E+07
290,6.916E+07
300,6.932E+07
310,6.955E+07
320,6.978E+07
2 changes: 1 addition & 1 deletion doc/sphinx/02_amg/mem.gp
@@ -9,7 +9,7 @@ set ylabel "FOM"
set xrange [10:40]
set key left top

set yrange [1.05e+8: 1.75e+8]
set yrange [1.0e+6: 1.0e+7]
set grid
show grid

14 changes: 7 additions & 7 deletions doc/sphinx/02_amg/roci_1_120.csv
@@ -1,8 +1,8 @@
No. cores,Actual,Ideal
1,1.6281E+08,1.6281E+08
2,3.1584E+08,3.2562E+08
4,5.6892E+08,6.5124E+08
8,1.2243E+09,1.3025E+09
16,2.1890E+09,2.6050E+09
32,2.9544E+09,5.2099E+09
64,4.5612E+09,1.0420E+10
1,8.6459E+06,8.6459E+06
2,1.4987E+07,1.7292E+07
4,2.9222E+07,3.4583E+07
8,5.5766E+07,6.9167E+07
16,9.5407E+07,1.3833E+08
32,1.4029E+08,2.7667E+08
64,2.2887E+08,5.5333E+08
14 changes: 7 additions & 7 deletions doc/sphinx/02_amg/roci_1_160.csv
@@ -1,8 +1,8 @@
No. cores,Actual,Ideal
1,1.6000E+08,1.6000E+08
2,3.0493E+08,3.2000E+08
4,5.6314E+08,6.4000E+08
8,1.1901E+09,1.2800E+09
16,2.1349E+09,2.5600E+09
32,2.9680E+09,5.1200E+09
64,4.7134E+09,1.0240E+10
1,8.4644E+06,8.4644E+06
2,1.2983E+07,1.6929E+07
4,2.7064E+07,3.3857E+07
8,5.0436E+07,6.7715E+07
16,1.0227E+08,1.3543E+08
32,1.3856E+08,2.7086E+08
64,2.3692E+08,5.4172E+08
14 changes: 7 additions & 7 deletions doc/sphinx/02_amg/roci_1_200.csv
@@ -1,8 +1,8 @@
No. cores,Actual,Ideal
1,1.5948E+08,1.5948E+08
2,2.9978E+08,3.1896E+08
4,5.4708E+08,6.3792E+08
8,1.0820E+09,1.2758E+09
16,2.0509E+09,2.5517E+09
32,2.8251E+09,5.1034E+09
64,4.6217E+09,1.0207E+10
1,8.4267E+06,8.4267E+06
2,1.2526E+07,1.6853E+07
4,2.4576E+07,3.3707E+07
8,5.0598E+07,6.7413E+07
16,9.3217E+07,1.3483E+08
32,1.2682E+08,2.6965E+08
64,2.3377E+08,5.3931E+08
14 changes: 7 additions & 7 deletions doc/sphinx/02_amg/roci_2_200.csv
@@ -1,8 +1,8 @@
No. cores,Actual,Ideal
1,1.1020E+08,1.1020E+08
2,2.0493E+08,2.2040E+08
4,3.8499E+08,4.4080E+08
8,7.9992E+08,8.8160E+08
16,1.2667E+09,1.7632E+09
32,1.7586E+09,3.5264E+09
64,2.9247E+09,7.0528E+09
1,1.7980E+06,1.7980E+06
2,3.2662E+06,3.5961E+06
4,6.2277E+06,7.1922E+06
8,1.2149E+07,1.4384E+07
16,2.2796E+07,2.8769E+07
32,2.8885E+07,5.7538E+07
64,4.6850E+07,1.1508E+08
14 changes: 7 additions & 7 deletions doc/sphinx/02_amg/roci_2_256.csv
@@ -1,8 +1,8 @@
No. cores,Actual,Ideal
1,1.0864E+08,1.0864E+08
2,1.9807E+08,2.1728E+08
4,3.7525E+08,4.3456E+08
8,7.3751E+08,8.6912E+08
16,1.3348E+09,1.7382E+09
32,1.7869E+09,3.4765E+09
64,2.8334E+09,6.9530E+09
1,1.7267E+06,1.7267E+06
2,3.0559E+06,3.4535E+06
4,5.8681E+06,6.9069E+06
8,1.1919E+07,1.3814E+07
16,2.0471E+07,2.7628E+07
32,2.7253E+07,5.5255E+07
64,4.5270E+07,1.1051E+08
14 changes: 7 additions & 7 deletions doc/sphinx/02_amg/roci_2_320.csv
@@ -1,8 +1,8 @@
No. cores,Actual,Ideal
1,1.0890E+08,1.0890E+08
2,1.8653E+08,2.1780E+08
4,3.6953E+08,4.3560E+08
8,7.2172E+08,8.7120E+08
16,1.3751E+09,1.7424E+09
32,1.7869E+09,3.4848E+09
64,2.8376E+09,6.9696E+09
1,1.6485E+06,1.6485E+06
2,2.8577E+06,3.2970E+06
4,5.3917E+06,6.5940E+06
8,1.1154E+07,1.3188E+07
16,2.1099E+07,2.6376E+07
32,2.6207E+07,5.2752E+07
64,4.2568E+07,1.0550E+08
8 changes: 4 additions & 4 deletions doc/sphinx/02_amg/roci_mem.csv
@@ -1,5 +1,5 @@
GB,Problem 1,Problem 2
10,1.6469E+08,1.1119E+08
20,1.6157E+08,1.1076E+08
30,1.6042E+08,1.1016E+08
40,1.5985E+08,1.1010E+08
10,8.6128E+06,1.8494E+06
20,8.4654E+06,1.7930E+06
30,8.4068E+06,1.7416E+06
40,8.3174E+06,1.7258E+06
