AMG and Branson plots updated for FOM/node
dmageeLANL committed Feb 7, 2024
1 parent 752f79e commit 0e017c3
Showing 8 changed files with 78 additions and 56 deletions.
38 changes: 19 additions & 19 deletions doc/sphinx/01_branson/branson.rst
@@ -294,25 +294,6 @@ Strong scaling performance of Branson Crossroads 200M Particles is provided with

Branson Strong Scaling Performance on Crossroads 200M particles

Multi-node scaling
------------------

The results of the scaling runs performed on rocinante hbm partition nodes are presented below.
Branson was built with Intel oneAPI 2023.1.0 and cray-mpich 8.1.25.
These runs used 32, 64, and 96 nodes with 110 tasks per node.
Each run used 85 million photons per node, a problem size that uses 25% of the total available memory across nodes.

.. figure:: branson_roci_scale.png
   :align: center
   :scale: 50%
   :alt:

.. csv-table:: Multi Node Scaling Branson
   :file: branson_roci_scale.csv
   :align: center
   :widths: 10, 10, 10, 10
   :header-rows: 1

AMD Epyc + Nvidia A100
----------------------
Throughput performance of Branson on AMD Epyc + Nvidia A100 (using a single GPU) is provided within the
@@ -331,6 +312,25 @@ following table and figure.

Branson Throughput Performance on AMD Epyc + Nvidia A100

Multi-node scaling on Crossroads
================================

The results of the scaling runs performed on rocinante hbm partition nodes are presented below.
Branson was built with Intel oneAPI 2023.1.0 and cray-mpich 8.1.25.
These runs used 32, 64, and 96 nodes with 110 tasks per node.
Each run used 85 million photons per node, a problem size that uses 25% of the total available memory across nodes.

.. figure:: branson_roci_scale.png
   :align: center
   :scale: 50%
   :alt:

.. csv-table:: Multi Node Scaling Branson
   :file: branson_roci_scale.csv
   :align: center
   :widths: 10, 10, 10, 10, 10
   :header-rows: 1

References
==========

4 changes: 4 additions & 0 deletions doc/sphinx/01_branson/branson_roci_badnodes_scale.csv
@@ -0,0 +1,4 @@
Iteration,Photons,Nodes,Photons/s,Photons/s/node
1,2720,32,9.19E+07,2.87E+06
1,5440,64,1.92E+07,3.00E+05
1,8160,96,2.80E+07,2.91E+05
12 changes: 4 additions & 8 deletions doc/sphinx/01_branson/branson_roci_scale.csv
@@ -1,8 +1,4 @@
Photons,32,64,96
200,1.25e+08,2.42e+08,2.41e+08
400,1.26e+08,2.39e+08,3.42e+08
800,1.25e+08,2.39e+08,3.53e+08
1000,1.26e+08,2.37e+08,3.6e+08
2000,1.27e+08,2.46e+08,3.58e+08
4000,1.27e+08,2.52e+08,3.67e+08
8000,,2.54e+08,3.77e+08
Iteration,Photons,Nodes,Photons/s,Photons/s/node
1,2720,32,9.20E+07,2.87E+06
1,5440,64,1.89E+08,2.95E+06
1,8160,96,2.73E+08,2.85E+06
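
Reviewer note: the fifth column of the updated CSV appears to be the fourth column divided by the node count, which matches the "FOM/node" axis label in cpu.gp. The following is a minimal Python sketch, assuming that interpretation, that recomputes the per-node value from the three new rows and compares it with the listed one. The helper name `per_node_rows` is hypothetical and is not part of the repository.

```python
import csv
import io

# Rows copied from the updated branson_roci_scale.csv in this commit.
CSV_TEXT = """Iteration,Photons,Nodes,Photons/s,Photons/s
1,2720,32,9.20E+07,2.87E+06
1,5440,64,1.89E+08,2.95E+06
1,8160,96,2.73E+08,2.85E+06
"""


def per_node_rows(text):
    """Return (nodes, computed_per_node, listed_per_node) for each data row.

    Assumption: column 4 is the total Photons/s and column 5 is Photons/s per node.
    """
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the header; its duplicated column names rule out DictReader
    rows = []
    for _iteration, _photons, nodes, total, listed in reader:
        node_count = int(nodes)
        rows.append((node_count, float(total) / node_count, float(listed)))
    return rows


rows = per_node_rows(CSV_TEXT)
baseline = rows[0][1]  # per-node rate at the smallest node count
for nodes, computed, listed in rows:
    print(f"{nodes:3d} nodes: computed {computed:.3e}, listed {listed:.2e}, "
          f"ratio to 32 nodes {computed / baseline:.3f}")
```

The computed and listed values agree to within the rounding of the three-digit table entries, which supports reading the last column as a per-node figure of merit.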
8 changes: 8 additions & 0 deletions doc/sphinx/01_branson/branson_roci_scale_photonrange.csv
@@ -0,0 +1,8 @@
Photons,32,64,96
200,1.25e+08,2.42e+08,2.41e+08
400,1.26e+08,2.39e+08,3.42e+08
800,1.25e+08,2.39e+08,3.53e+08
1000,1.26e+08,2.37e+08,3.6e+08
2000,1.27e+08,2.46e+08,3.58e+08
4000,1.27e+08,2.52e+08,3.67e+08
8000,,2.54e+08,3.77e+08
18 changes: 16 additions & 2 deletions doc/sphinx/01_branson/cpu.gp
@@ -33,9 +33,23 @@ set output "cpu_200M.png"
plot "cpu_200M.csv" using 1:2 with linespoints linestyle 1, "" using 1:3 with line linestyle 2

# Scaling Output
set output "branson_roci_scale.png"
set output "branson_roci_scale_range.png"
set xrange [200:8000]
unset logscale xy
set key title "Number of Nodes"
plot "branson_roci_scale.csv" using 1:2 with linespoints linestyle 1, "" using 1:3 with line linestyle 2, "" using 1:4 with line linestyle 3
plot "branson_roci_scale_photonrange.csv" using 1:2 with linespoints linestyle 1, "" using 1:3 with line linestyle 2, "" using 1:4 with line linestyle 3

# SCALING PLOTS, Y IS FOM PER NODE
set xrange [32:96]
set yrange [2.5e6:3.5e6]
set xlabel "Number of Nodes"
set format y "%.1e"
set ylabel "FOM/node"
unset logscale xy
set output "branson_roci_scale.png"
set title "Branson Multi Node Scaling" font "serif,22"
plot "branson_roci_scale.csv" using 3:5 with linespoints linestyle 1

set output "branson_roci_scale_badnodes.png"
set title "Branson Multi Node Scaling" font "serif,22"
plot "branson_roci_badnodes_scale.csv" using 3:5 with linespoints linestyle 1
42 changes: 20 additions & 22 deletions doc/sphinx/02_amg/amg.rst
@@ -346,28 +346,6 @@ Approximate results of the FOM for varying memory usages on Crossroads are provi

Varying memory usage (estimated) for Problem 1 and 2


Multi-node scaling on Crossroads
================================

The results of the scaling runs performed on rocinante hbm partition are presented below.
AMG and hypre were built with Intel oneAPI 2023.1.0 and cray-mpich 8.1.25.
These runs used 32, 64, and 96 nodes with 108 tasks per node.
Problems 1 and 2 were run with problem sizes per MPI process, ``-n``, of 25,25,125 and 40,40,200 respectively to use 15% of available memory.
The product of the x,y,z process topology must equal the total number of MPI processes.
In this case, x=y=24 for all node counts and z was set to 6, 12, and 18 for 32, 64, and 96 nodes respectively.

.. figure:: cpu_scale_roci.png
   :align: center
   :scale: 50%
   :alt:

.. csv-table:: Multi Node Scaling AMG problem 1 and 2
   :file: amg_scale_roci.csv
   :align: center
   :widths: 10, 10, 10
   :header-rows: 1

V-100
=====

@@ -413,6 +391,26 @@ The FOMs of AMG2023 on V100 for Problem 2 is provided in the following table and

AMG2023 FOM on V100 for Problem 2 (7-pt stencil, AMG-PCG)

Multi-node scaling on Crossroads
================================

The results of the scaling runs performed on rocinante hbm partition are presented below.
AMG and hypre were built with Intel oneAPI 2023.1.0 and cray-mpich 8.1.25.
These runs used 32, 64, and 96 nodes with 108 tasks per node.
Problems 1 and 2 were run with problem sizes per MPI process, ``-n``, of 25,25,125 and 40,40,200 respectively to use 15% of available memory.
The product of the x,y,z process topology must equal the total number of MPI processes.
In this case, x=y=24 for all node counts and z was set to 6, 12, and 18 for 32, 64, and 96 nodes respectively.

.. figure:: cpu_scale_roci.png
   :align: center
   :scale: 50%
   :alt:

.. csv-table:: Multi Node Scaling AMG problem 1 and 2
   :file: amg_scale_roci.csv
   :align: center
   :widths: 10, 10, 10, 10, 10
   :header-rows: 1
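
Reviewer note: the topology constraint stated in this section can be checked with simple arithmetic. The sketch below assumes the process count in question is the total number of MPI tasks (node count times 108 tasks per node). The function name `topology_matches` and the constant names are hypothetical and do not come from AMG or hypre.

```python
# Values taken from the section text: 108 tasks per node, x = y = 24,
# and z = 6, 12, 18 for 32, 64, 96 nodes respectively.
TASKS_PER_NODE = 108
PX = PY = 24
PZ_BY_NODES = {32: 6, 64: 12, 96: 18}


def topology_matches(nodes, px, py, pz, tasks_per_node=TASKS_PER_NODE):
    """Return True when the px*py*pz process grid covers every MPI task exactly once."""
    return px * py * pz == nodes * tasks_per_node


for nodes, pz in PZ_BY_NODES.items():
    total_tasks = nodes * TASKS_PER_NODE
    status = "ok" if topology_matches(nodes, PX, PY, pz) else "MISMATCH"
    print(f"{nodes} nodes: {PX} x {PY} x {pz} = {PX * PY * pz} processes, "
          f"{total_tasks} MPI tasks ({status})")
```

Because x and y are held fixed, z grows linearly with the node count, so the per-process problem size stays constant while the global problem grows along one axis.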

References
==========
8 changes: 4 additions & 4 deletions doc/sphinx/02_amg/amg_scale_roci.csv
@@ -1,4 +1,4 @@
NumNodes,Problem1,Problem2
96,7.3e+09,3.01e+09
64,5.44e+09,2.09e+09
32,3.29e+09,1.19e+09
NumNodes,Problem1,Problem2,Problem1/node,Problem2/node
96,7.30E+09,3.01E+09,7.60E+07,3.14E+07
64,5.44E+09,2.09E+09,8.50E+07,3.27E+07
32,3.29E+09,1.19E+09,1.03E+08,3.72E+07
4 changes: 3 additions & 1 deletion doc/sphinx/02_amg/cpu.gp
@@ -44,11 +44,13 @@ set output "roci_2_320.png"
set title "AMG2023 Strong Scaling for Problem 2, 320 x 320 x 320" font "serif,22"
plot "roci_2_320.csv" using 1:2 with linespoints linestyle 1, "" using 1:3 with line linestyle 2

# SCALING PLOTS, Y IS FOM PER NODE
set xrange [32:96]
set xlabel "Number of Nodes"
set format y "%.1e"
set ylabel "FOM/node"
unset logscale xy
set output "cpu_scale_roci.png"
set title "AMG Multi Node Scaling" font "serif,22"
plot "amg_scale_roci.csv" using 1:2 with linespoints linestyle 1, "" using 1:3 with line linestyle 2
plot "amg_scale_roci.csv" using 1:4 with linespoints linestyle 1, "" using 1:5 with line linestyle 2
