Merge branch 'main' into tmp-miniem
amagela committed Mar 5, 2024
2 parents dc228ee + 478976d commit 71e5f60
Showing 47 changed files with 30,302 additions and 171 deletions.
52 changes: 25 additions & 27 deletions doc/sphinx/00_intro/introduction.rst
Original file line number Diff line number Diff line change
@@ -170,7 +170,7 @@ Microbenchmark Overview
.. _GlobalRunRules:

Run Rules Synopsis
===============
==================

Single node benchmarks will require respondent to provide estimates on

@@ -256,39 +256,37 @@ SSNI Weights and SSNI problem sizes
- **SSNI Weight**
- **SSNI Problem size - % device memory**
* - Branson
- TBD
- 30
* - AMG2023 Problem 1 Setup
- TBD
- 20
* - AMG2023 Problem 2 Setup
- TBD
- 20
* - AMG2023 Problem 1 Solve
- TBD
- 20
* - AMG2023 Problem 2 Solve
- TBD
- 20
- 10
- 25 to 30
* - AMG2023 Problem 1
- 5
- 15 to 20
* - AMG2023 Problem 2
- 5
- 15 to 20
* - MiniEM
- TBD
- 15
- TBD
* - MLMD Training
- TBD
- 5
- N/A
* - MLMD Simulation
- TBD
- 60
- 5
- 55 to 65
* - Parthenon-VIBE
- TBD
- 40
- 30
- 35 to 45
* - Sparta
- TBD
- TBD
* - UMT
- TBD
- TBD

- 10
- 50 to 60
* - UMT Problem 1
- 7.5
- 45 to 55
* - UMT Problem 2
- 7.5
- 45 to 55

Note: the % of device memory is approximate; please note the actual memory footprint used.
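
As a quick consistency check on the table above, the sketch below sums the SSNI weights. It assumes the values are the ones on the added lines of this hunk (the diff markers were lost in this view, so that reading is an assumption), and the name ``ssni_weights`` is purely illustrative.

```python
# SSNI weights as read from the added lines of this hunk.
# Assumption: these are the post-commit values; the +/- markers are not visible here.
ssni_weights = {
    "Branson": 10,
    "AMG2023 Problem 1": 5,
    "AMG2023 Problem 2": 5,
    "MiniEM": 15,
    "MLMD Training": 5,
    "MLMD Simulation": 5,
    "Parthenon-VIBE": 30,
    "Sparta": 10,
    "UMT Problem 1": 7.5,
    "UMT Problem 2": 7.5,
}

# Sum the weights to see whether they cover the full budget.
total_weight = sum(ssni_weights.values())
print(f"total SSNI weight: {total_weight}")
```

Under that reading the weights add up to 100, which suggests they are meant as percentages.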

System Information
==================
37 changes: 35 additions & 2 deletions doc/sphinx/01_branson/branson.rst
@@ -89,6 +89,20 @@ Build requirements:

* If building a CUDA enabled version of Branson use the ``CUDADIR`` environment variable to specify your CUDA directory.

* If building for multi-node runs, Metis should be used for mesh partitioning. See README.md from Branson for more details. Single-node CPU and single-node GPU runs for SSNI should not use Metis.

To build Metis:

.. code-block:: bash

   cd <path/to/metis>
   make config cc=<C compiler> prefix=<install-location> shared=1
   make install

..
To build Branson:

.. code-block:: bash

   export CXX=`which g++`
@@ -245,7 +259,8 @@ lose their energy into the material.


Crossroads
------------
----------

Strong scaling performance of Crossroads 10M Particles is provided within the following table and
figure.

@@ -293,7 +308,6 @@ Strong scaling performance of Branson Crossroads 200M Particles is provided with

Branson Strong Scaling Performance on Crossroads 200M particles


AMD Epyc + Nvidia A100
----------------------
Throughput performance of Branson on AMD Epyc + Nvidia A100 (using a single GPU) is provided within the
@@ -312,6 +326,25 @@ following table and figure.

Branson Throughput Performance on AMD Epyc + Nvidia A100

Multi-node scaling on Crossroads
================================

The results of the scaling runs performed on Rocinante HBM partition nodes are presented below.
Branson was built with Intel oneAPI 2023.1.0 and cray-mpich 8.1.25.
These runs used 32, 64, and 96 nodes with 110 tasks per node.
Each run used 85 million photons per node, a problem size that occupies roughly 25% of the total available memory across nodes.

.. figure:: branson_roci_scale.png
   :align: center
   :scale: 50%
   :alt:

.. csv-table:: Multi Node Scaling Branson
   :file: branson_roci_scale_header.csv
   :align: center
   :widths: 10, 10, 10, 10
   :header-rows: 1
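
The run sizes described above follow from simple per-node arithmetic. The sketch below works them out. The constants come from the text, while the helper ``run_size`` and the dictionary keys are illustrative names, not part of Branson.

```python
PHOTONS_PER_NODE_MILLIONS = 85  # from the text: 85 million photons per node
TASKS_PER_NODE = 110            # from the text: 110 tasks per node
NODE_COUNTS = (32, 64, 96)


def run_size(nodes):
    """Return the total photon count (in millions) and MPI task count for a run."""
    return {
        "nodes": nodes,
        "photons_millions": nodes * PHOTONS_PER_NODE_MILLIONS,
        "mpi_tasks": nodes * TASKS_PER_NODE,
    }


sizes = [run_size(n) for n in NODE_COUNTS]
for size in sizes:
    print(size)
```

The resulting photon totals (2720, 5440, and 8160) match the ``Photons`` column of ``branson_roci_scale_header.csv``, so that column is presumably expressed in millions of photons.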

References
==========

4 changes: 4 additions & 0 deletions doc/sphinx/01_branson/branson_roci_badnodes_scale.csv
@@ -0,0 +1,4 @@
Iteration,Photons,Nodes,Photons/s,Photons/s/Node
1,2720,32,9.19E+07,2.87E+06
1,5440,64,1.92E+07,3.00E+05
1,8160,96,2.80E+07,2.91E+05
4 changes: 4 additions & 0 deletions doc/sphinx/01_branson/branson_roci_scale.csv
@@ -0,0 +1,4 @@
Iteration,Photons,Nodes,Photons/s,Photons/s
1,2720,32,9.20E+07,2.87E+06
1,5440,64,1.89E+08,2.95E+06
1,8160,96,2.73E+08,2.85E+06
4 changes: 4 additions & 0 deletions doc/sphinx/01_branson/branson_roci_scale_header.csv
@@ -0,0 +1,4 @@
Nodes,Photons,Photons/s,Photons/s/Node
32,2720,9.20E+07,2.87E+06
64,5440,1.89E+08,2.95E+06
96,8160,2.73E+08,2.85E+06
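
The ``Photons/s/Node`` column is the total throughput divided by the node count. A minimal sketch of that calculation follows, together with one common way to summarize the scaling (per-node throughput relative to the smallest run). The input values are copied from the rows above. The function names are illustrative, and the efficiency metric is an assumption about how one might read the data, not something the commit defines.

```python
# (nodes, total photons/s) copied from branson_roci_scale_header.csv
measurements = [(32, 9.20e7), (64, 1.89e8), (96, 2.73e8)]


def per_node_fom(nodes, total_fom):
    """Figure of merit per node: total throughput divided by node count."""
    return total_fom / nodes


def scaling_report(rows):
    """Return (nodes, per-node FOM, efficiency vs. the first row) for each row."""
    base = per_node_fom(*rows[0])
    return [(n, per_node_fom(n, f), per_node_fom(n, f) / base) for n, f in rows]


report = scaling_report(measurements)
for nodes, fom, efficiency in report:
    print(f"{nodes:3d} nodes: {fom:.2e} photons/s/node, efficiency {efficiency:.2f}")
```

Because the inputs are already rounded to three significant figures, the recomputed per-node values agree with the table only to within rounding.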
8 changes: 8 additions & 0 deletions doc/sphinx/01_branson/branson_roci_scale_photonrange.csv
@@ -0,0 +1,8 @@
Photons,32,64,96
200,1.25e+08,2.42e+08,2.41e+08
400,1.26e+08,2.39e+08,3.42e+08
800,1.25e+08,2.39e+08,3.53e+08
1000,1.26e+08,2.37e+08,3.6e+08
2000,1.27e+08,2.46e+08,3.58e+08
4000,1.27e+08,2.52e+08,3.67e+08
8000,,2.54e+08,3.77e+08
4 changes: 4 additions & 0 deletions doc/sphinx/01_branson/branson_roci_single_scale.csv
@@ -0,0 +1,4 @@
Photons (in M),8,32,56,88,112
10,3.9e+05,1.12e+06,1.57e+06,2.47e+06,3.2e+06
66,5.4e+05,1.12e+06,1.52e+06,2.4e+06,3.16e+06
200,5.56e+05,1.12e+06,1.51e+06,2.4e+06,3.19e+06
6 changes: 6 additions & 0 deletions doc/sphinx/01_branson/branson_roci_single_scale_new.csv
@@ -0,0 +1,6 @@
Nodes,10,66,200
8,3.9e+05,5.4e+05,5.56e+05
32,1.12e+06,1.12e+06,1.12e+06
56,1.57e+06,1.52e+06,1.51e+06
88,2.47e+06,2.4e+06,2.4e+06
112,3.2e+06,3.16e+06,3.19e+06
16 changes: 16 additions & 0 deletions doc/sphinx/01_branson/branson_single_scale.csv
@@ -0,0 +1,16 @@
Iteration,Photons,Nodes,Photons/s
1,10,8,389526.944845
1,10,32,1118661.454450
1,10,56,1573534.905253
1,10,88,2474867.718320
1,10,112,3199066.384466
1,66,8,540217.631093
1,66,32,1123452.119160
1,66,56,1518091.720066
1,66,88,2399810.545866
1,66,112,3160933.921874
1,200,8,555655.848349
1,200,32,1121140.510231
1,200,56,1514059.293424
1,200,88,2398118.800503
1,200,112,3191200.914483
6 changes: 6 additions & 0 deletions doc/sphinx/01_branson/branson_single_scale_ideal.csv
@@ -0,0 +1,6 @@
Nodes,10,66,200
8,3.9e+05,5.4e+05,5.56e+05
32,1.56e+06,2.16e+06,2.22e+06
56,2.73e+06,3.78e+06,3.89e+06
88,4.28e+06,5.94e+06,6.11e+06
112,5.45e+06,7.56e+06,7.78e+06
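
The ideal values above appear to be a linear extrapolation from the first row: the measured throughput at 8 is multiplied by ``n / 8`` for each entry ``n`` in the first column. The sketch below reproduces the table under that assumption, using the unrounded baselines from ``branson_single_scale.csv``. The helper name ``ideal_fom`` is illustrative.

```python
NODE_COUNTS = (8, 32, 56, 88, 112)
PHOTON_COUNTS_M = (10, 66, 200)

# Measured throughput at the 8-entry baseline, copied from branson_single_scale.csv.
baseline_fom = {10: 389526.944845, 66: 540217.631093, 200: 555655.848349}


def ideal_fom(base_fom, nodes, base_nodes=8):
    """Linear (ideal) scaling: throughput grows in proportion to the count."""
    return base_fom * nodes / base_nodes


# Format with three significant figures, matching the style used in the CSV.
ideal_table = {
    nodes: [f"{ideal_fom(baseline_fom[p], nodes):.3g}" for p in PHOTON_COUNTS_M]
    for nodes in NODE_COUNTS
}

print("Nodes,10,66,200")
for nodes, cells in ideal_table.items():
    print(",".join([str(nodes)] + cells))
```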
32 changes: 31 additions & 1 deletion doc/sphinx/01_branson/cpu.gp
@@ -20,16 +20,46 @@ set key autotitle columnheader

set style line 1 linetype 6 dashtype 1 linecolor rgb "#FF0000" linewidth 2 pointtype 6 pointsize 3
set style line 2 linetype 1 dashtype 2 linecolor rgb "#FF0000" linewidth 2
set style line 3 linetype 6 dashtype 1 linecolor rgb "#0000FF" linewidth 2 pointtype 6 pointsize 3

plot "cpu_10M.csv" using 1:2 with linespoints linestyle 1, "" using 1:3 with line linestyle 2

set output "cpu_66M.png"
#set title "Branson Strong Scaling Performance on Crossroads, 66M particles" font "serif,22"
plot "cpu_66M.csv" using 1:2 with linespoints linestyle 1, "" using 1:3 with line linestyle 2


set output "cpu_200M.png"
#set title "Branson Strong Scaling Performance on Crossroads, 200M particles" font "serif,22"
plot "cpu_200M.csv" using 1:2 with linespoints linestyle 1, "" using 1:3 with line linestyle 2

set output "cpu_10M_new.png"
plot "cpu_10M_new.csv" using 1:2 with linespoints linestyle 1, "" using 1:3 with line linestyle 2

set output "cpu_66M_new.png"
plot "cpu_66M_new.csv" using 1:2 with linespoints linestyle 1, "" using 1:3 with line linestyle 2

set output "cpu_200M_new.png"
plot "cpu_200M_new.csv" using 1:2 with linespoints linestyle 1, "" using 1:3 with line linestyle 2

# Scaling Output
set output "branson_roci_scale_range.png"
set xrange [200:8000]
set format y "%.1e"
unset logscale xy
set key title "Nodes"
set title "Branson Multi Node Scaling" font "serif,22"
plot "branson_roci_scale_photonrange.csv" using 1:2 with linespoints linestyle 1, "" using 1:3 with line linestyle 2, "" using 1:4 with line linestyle 3

# SCALING PLOTS, Y IS FOM PER NODE
set xrange [32:96]
set yrange [2.5e6:3.5e6]
set xlabel "Nodes"
set ylabel "FOM/node"
# set title "Branson Multi Node Scaling" font "serif,22"
set output "branson_roci_scale.png"
plot "branson_roci_scale.csv" using 3:5 with linespoints linestyle 1

set yrange [2e5:3e6]
set output "branson_roci_scale_badnodes.png"
set title "Branson Multi Node Scaling" font "serif,22"
plot "branson_roci_badnodes_scale.csv" using 3:5 with linespoints linestyle 1
6 changes: 6 additions & 0 deletions doc/sphinx/01_branson/cpu_10M_new.csv
@@ -0,0 +1,6 @@
Nodes,Actual,Ideal,Memory GB,Memory %
8,3.9e+05,3.9e+05,3,2.9
32,1.12e+06,1.56e+06,5,4.08
56,1.57e+06,2.73e+06,6,5.16
88,2.47e+06,4.28e+06,8,6.47
112,3.2e+06,5.45e+06,9,7.69
6 changes: 6 additions & 0 deletions doc/sphinx/01_branson/cpu_200M_new.csv
@@ -0,0 +1,6 @@
Nodes,Actual,Ideal,Memory GB,Memory %
8,5.56e+05,5.56e+05,46,37.07
32,1.12e+06,2.22e+06,48,39.03
56,1.51e+06,3.89e+06,50,40.23
88,2.4e+06,6.11e+06,51,41.54
112,3.19e+06,7.78e+06,53,42.75
6 changes: 6 additions & 0 deletions doc/sphinx/01_branson/cpu_66M_new.csv
@@ -0,0 +1,6 @@
Nodes,Actual,Ideal,Memory GB,Memory %
8,5.4e+05,5.4e+05,16,13.05
32,1.12e+06,2.16e+06,18,14.5
56,1.52e+06,3.78e+06,19,15.42
88,2.4e+06,5.94e+06,20,16.72
112,3.16e+06,7.56e+06,22,17.72
26 changes: 22 additions & 4 deletions doc/sphinx/02_amg/amg.rst
@@ -225,7 +225,7 @@ The second figure provides memory use on 1 node of CTS-1 (Quartz) using 4 MPI ta


Strong Scaling on Crossroads
----------------------------
============================

We present strong scaling results for varying problem sizes on Crossroads with HBM below. The code was configured and compiled using hypre v2.29.0 with MPI only and optimization -O2.

@@ -346,10 +346,8 @@ Approximate results of the FOM for varying memory usages on Crossroads are provi

Varying memory usage (estimated) for Problem 1 and 2



V-100
-----
=====

We have also performed runs on 1 NVIDIA V-100 GPU increasing the problem size n x n x n.
For these runs hypre 2.29.0 was configured as follows:
@@ -393,6 +391,26 @@ The FOMs of AMG2023 on V100 for Problem 2 is provided in the following table and

AMG2023 FOM on V100 for Problem 2 (7-pt stencil, AMG-PCG)

Multi-node scaling on Crossroads
================================

The results of the scaling runs performed on the Rocinante HBM partition are presented below.
AMG and hypre were built with Intel oneAPI 2023.1.0 and cray-mpich 8.1.25.
These runs used 32, 64, and 96 nodes with 108 tasks per node.
Problems 1 and 2 were run with per-MPI-process problem sizes (``-n``) of 38,38,38 and 60,60,60 respectively, which use roughly 15% of available memory while maintaining a cubic grid.
The product of the x, y, and z process topology dimensions must equal the number of MPI processes.
In this case, x=y=24 for all node counts, and z was set to 6, 12, and 18 for 32, 64, and 96 nodes respectively.

.. figure:: cpu_scale_roci_cubes.png
   :align: center
   :scale: 50%
   :alt:

.. csv-table:: Multi Node Scaling AMG problem 1 and 2
   :file: amg_scale_roci_cubes_pernode.csv
   :align: center
   :widths: 10, 10, 10, 10, 10
   :header-rows: 1
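
The topology constraint described above can be checked with a few lines of arithmetic. The sketch below uses only the numbers stated in the text (108 tasks per node, x=y=24, and z of 6, 12, and 18). The function names are illustrative and are not part of AMG2023 or hypre.

```python
TASKS_PER_NODE = 108
PX, PY = 24, 24                         # x and y process counts, fixed for all runs
PZ_BY_NODES = {32: 6, 64: 12, 96: 18}   # z process count per node count


def total_ranks(nodes, tasks_per_node=TASKS_PER_NODE):
    """Number of MPI processes launched for a given node count."""
    return nodes * tasks_per_node


def topology_matches(nodes, px, py, pz):
    """True when the process grid px*py*pz uses exactly the launched ranks."""
    return px * py * pz == total_ranks(nodes)


checks = {n: topology_matches(n, PX, PY, pz) for n, pz in PZ_BY_NODES.items()}
for nodes, ok in checks.items():
    print(f"{nodes} nodes: {total_ranks(nodes)} ranks, topology ok = {ok}")
```

Keeping x and y fixed and growing only z means each added block of 32 nodes extends the process grid along a single dimension.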

References
==========
4 changes: 4 additions & 0 deletions doc/sphinx/02_amg/amg_scale_roci.csv
@@ -0,0 +1,4 @@
NumNodes,Problem1,Problem2,Problem1/Node,Problem2/Node
96,7.30E+09,3.01E+09,7.60E+07,3.14E+07
64,5.44E+09,2.09E+09,8.50E+07,3.27E+07
32,3.29E+09,1.19E+09,1.03E+08,3.72E+07
4 changes: 4 additions & 0 deletions doc/sphinx/02_amg/amg_scale_roci_cubes.csv
@@ -0,0 +1,4 @@
Nodes,Problem1,Problem2
32,6.636799e+09,2.000133e+09
64,2.034274e+09,3.288118e+08
96,2.840158e+09,4.669072e+08
4 changes: 4 additions & 0 deletions doc/sphinx/02_amg/amg_scale_roci_cubes_pernode.csv
@@ -0,0 +1,4 @@
Nodes,Problem1,Problem2,Problem1/Node,Problem2/Node
32,6.64e+09,2e+09,2.07e+08,6.25e+07
64,2.03e+09,3.29e+08,3.18e+07,5.14e+06
96,2.84e+09,4.67e+08,2.96e+07,4.86e+06
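
The per-node columns above look like the totals from ``amg_scale_roci_cubes.csv`` divided by the node count and rounded to three significant figures. The sketch below reproduces the rows under that assumption. The helper ``per_node_row`` is an illustrative name, not a script from the repository.

```python
# Total FOM per node count, copied from amg_scale_roci_cubes.csv:
# nodes -> (Problem1, Problem2)
totals = {
    32: (6.636799e9, 2.000133e9),
    64: (2.034274e9, 3.288118e8),
    96: (2.840158e9, 4.669072e8),
}


def per_node_row(nodes, problem1, problem2):
    """Build one CSV row: totals followed by totals divided by node count."""
    values = (problem1, problem2, problem1 / nodes, problem2 / nodes)
    return [str(nodes)] + [f"{v:.3g}" for v in values]


per_node_rows = [per_node_row(n, *t) for n, t in totals.items()]

print("Nodes,Problem1,Problem2,Problem1/Node,Problem2/Node")
for row in per_node_rows:
    print(",".join(row))
```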
4 changes: 4 additions & 0 deletions doc/sphinx/02_amg/amg_scale_roci_header.csv
@@ -0,0 +1,4 @@
Nodes,Problem1,Problem2,Problem1/Node,Problem2/Node
32,3.29E+09,1.19E+09,1.03E+08,3.72E+07
64,5.44E+09,2.09E+09,8.50E+07,3.27E+07
96,7.30E+09,3.01E+09,7.60E+07,3.14E+07
12 changes: 12 additions & 0 deletions doc/sphinx/02_amg/cpu.gp
@@ -20,6 +20,7 @@ set key autotitle columnheader

set style line 1 linetype 6 dashtype 1 linecolor rgb "#FF0000" linewidth 2 pointtype 6 pointsize 3
set style line 2 linetype 1 dashtype 2 linecolor rgb "#FF0000" linewidth 2
set style line 3 linetype 6 dashtype 1 linecolor rgb "#0000FF" linewidth 2 pointtype 6 pointsize 3

plot "roci_1_120.csv" using 1:2 with linespoints linestyle 1, "" using 1:3 with line linestyle 2

@@ -43,3 +44,14 @@ set output "roci_2_320.png"
set title "AMG2023 Strong Scaling for Problem 2, 320 x 320 x 320" font "serif,22"
plot "roci_2_320.csv" using 1:2 with linespoints linestyle 1, "" using 1:3 with line linestyle 2

# SCALING PLOTS, Y IS FOM PER NODE
unset logscale xy
set xrange [32:96]
set yrange [1e5:3e8]
set xlabel "Nodes"
set format y "%.1e"
set ylabel "FOM/node"
set output "cpu_scale_roci_cubes.png"
set title "AMG Multi Node Scaling" font "serif,22"
plot "amg_scale_roci_cubes_pernode.csv" using 1:4 with linespoints linestyle 1, "" using 1:5 with line linestyle 2

21 changes: 21 additions & 0 deletions doc/sphinx/03_vibe/cpu.gp
@@ -36,3 +36,24 @@ plot "cpu_40.csv" using 1:2 with linespoints linestyle 1, "" using 1:3 with line
set output "ats3_60.png"
plot "cpu_60.csv" using 1:2 with linespoints linestyle 1, "" using 1:3 with line linestyle 2

# Scaling Output
set output "parthenon_roci_scale_range.png"
set xrange [380:650]
unset logscale xy
set format y "%.1e"
set xlabel "NX (NX=nx=ny=nz)"
set key title "Nodes"
set title "Parthenon Multi Node Scaling" font "serif,22"
plot "parthenon_roci_scale_nxrange.csv" using 1:2 with linespoints linestyle 1, "" using 1:3 with line linestyle 2, "" using 1:4 with line linestyle 3

# SCALING PLOTS, Y IS FOM PER NODE

set xrange [32:96]
set yrange [7e6:1.5e7]
set xlabel "Nodes"
set ylabel "FOM/node"
unset title
unset key
# set title "Branson Multi Node Scaling" font "serif,22"
set output "parthenon_roci_scale_pernode.png"
plot "parthenon_roci_scale_pernode.csv" using 1:5 with linespoints linestyle 1