Merge branch 'main' of github.com:lanl/benchmarks
Galen M. Shipman authored and Galen M. Shipman committed Sep 28, 2023
2 parents f3ea37c + d5e9294 commit b7d6d7f
Showing 492 changed files with 203,687 additions and 589 deletions.
2 changes: 2 additions & 0 deletions .gitignore
@@ -2,6 +2,8 @@ doc/sphinx/_build
doc/sphinx/output.log
doc/sphinx/7_miniem/*.png
doc/sphinx/8_sparta/*.png
doc/sphinx/*/*.png
doc/sphinx/10_Microbenchmarks/*/*.png

*.pyc
.vscode/
@@ -14,7 +14,10 @@ benchmarks prior to RFP.

To use these benchmarks, please refer to the ATS-5 benchmarks repository: `ATS-5 repo <https://github.com/lanl/benchmarks>`_.

Benchmark changes from Crossroads
The benchmarks will, eventually, be generated atop Crossroads as the reference
system (see :ref:`ReferenceCrossroads` for more information).

Benchmark Changes from Crossroads
=================================

The key differences between the Crossroads benchmarks and the ATS-5 benchmarks are summarized below:
@@ -216,11 +219,56 @@ Optional results:

Scaled Single Node Improvement
==============================
One element of evaluation will focus on scaled single node improvement:
One element of evaluation will focus on scaled single node improvement (SSNI).

Given two platforms, one of which serves as the reference (Crossroads), SSNI is defined as a weighted geometric mean of the per-benchmark speedups, scaled by the ratio of node counts, using the following equation.

.. math::

   SSNI = N \left( \prod_{i=1}^{M} (S_i)^{w_i} \right)^{\frac{1}{\sum_{i=1}^{M} w_i}}

Where:

* N = Number of nodes on ATS-5 system / Number of nodes on reference system (Crossroads),

* M = total number of Benchmarks,

* S = application speedup; Figure of Merit on ATS-5 system / Figure of Merit on reference system (Crossroads); S must be greater than 1,

* w = weighting factor.
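
The SSNI definition above can be evaluated with a few lines of Python. The following is a minimal sketch, not an official scoring tool: the function name ``ssni``, its parameter names, and the example speedups and weights are all hypothetical and chosen only for illustration.

```python
import math
from typing import Sequence


def ssni(node_ratio: float, speedups: Sequence[float], weights: Sequence[float]) -> float:
    """Scaled single node improvement: N * (prod S_i**w_i) ** (1 / sum w_i)."""
    if len(speedups) != len(weights) or not speedups:
        raise ValueError("speedups and weights must be non-empty and the same length")
    if any(s <= 1.0 for s in speedups):
        raise ValueError("each speedup S_i must be greater than 1")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    # Work in log space so the weighted product stays numerically stable.
    log_mean = sum(w * math.log(s) for s, w in zip(speedups, weights)) / total_weight
    return node_ratio * math.exp(log_mean)


# Hypothetical inputs: two benchmarks, equal weights, same node count (N = 1).
example = ssni(node_ratio=1.0, speedups=[2.0, 8.0], weights=[1.0, 1.0])
print(f"SSNI = {example:.3f}")
```

Because the exponent normalizes by the sum of the weights, scaling every weight by the same constant leaves the result unchanged; only the relative weights matter.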


System Information
==================

The baseline platform for the ATS-5 procurement is the ATS-3 system (described below).
GPU performance is provided on the ATS-2 system, and in some cases on other GPU-based systems,
for information only; these results are not to be used as baselines.
In most cases, the performance numbers provided herein were collected on smaller-scale
testbed systems that have the same architecture as the ATS-3 and ATS-2 systems.

* Advanced Technology System 3 (ATS-3), also known as Crossroads (see :ref:`GlobalSystemATS3`)
* Advanced Technology System 2 (ATS-2), also known as Sierra (see :ref:`GlobalSystemATS2`)


.. _GlobalSystemATS3:

ATS-3/Crossroads
----------------

This system has over 6,140 compute nodes, each made up of two Intel(R) Xeon(R) Max 9480 CPUs,
connected by the HPE Slingshot 11 interconnect.

.. _GlobalSystemATS2:

ATS-2/Sierra
------------

This system has 4,284 compute nodes, each made up of two Power9
CPUs and four NVIDIA V100 GPUs. Please refer to [Sierra-LLNL]_ for more
detailed information.

* Weighted average of Single node performance improvement * number of nodes
* Multiple node configurations can be proposed and may vary on single node performance improvement and number of nodes
* Baseline will be SSNI of Crossroads SPR-HBM


Approvals
@@ -9,7 +9,7 @@ This is the documentation for the ATS-5 Benchmark Branson - 3D hohlraum single n
Purpose
=======

From their [site]_:
From their [Branson]_:

Branson is not an acronym.

@@ -103,76 +103,87 @@ Running
..
For strong scaling on a CPU, the memory footprint of Branson must be between 28% and 34% of the computational device's main memory.
The memory footprint can be controlled by editing "photons" in the input file.
On a dual-socket Intel Broadwell (E5-2695 v4, 2.10 GHz) system with 128 GByte of total system memory, using 120000000 photons gives a resident set size of ~41.1 GByte, or approximately 32.7%.
For strong scaling on a CPU, Branson should be run with three different problem sizes such that, during step 2 of the simulation, the memory
footprint at the smallest process count per node is approximately 4 to 5%, 8 to 10%, and 20 to 22% of the computational device's main memory.
The memory footprint is the sum of the resident set sizes (or equivalent) of all Branson processes on the node.
It can be obtained on a CPU system using the following commands (while the application is in step 2):

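.. code-block:: bash

   # List Branson processes, largest resident set size (RSS, in KiB) first
   ps -C BRANSON -o euser,c,pid,ppid,cmd,%cpu,%mem,rss --sort=-rss
   # Sum RSS over all Branson processes and convert KiB to GiB
   ps -C BRANSON -o rss | awk '{sum+=$1;} END{print sum/1024/1024;}'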
..

For throughput curves on a GPU the memory footprint of Branson must vary between 5% and 90% in increments of at most 5% of the computational device's main memory.


For throughput curves on a GPU the memory footprint of Branson must vary between ~5% and ~60% in increments of at most 5% of the computational device's main memory.
The memory footprint can be controlled by editing "photons" in the input file.
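
To check a run against the memory footprint targets described above, the summed resident set size can be converted to a percentage of main memory. The sketch below is illustrative only: the function names, the per-process RSS values, and the 128 GiB node memory are hypothetical assumptions and should be replaced with values measured on the system under test.

```python
from typing import Iterable


def footprint_gib(rss_kib: Iterable[int]) -> float:
    """Sum per-process resident set sizes (KiB, as printed by `ps -o rss`) in GiB."""
    return sum(rss_kib) / 1024 / 1024


def footprint_percent(footprint: float, node_memory: float) -> float:
    """Footprint as a percentage of node memory; both arguments share one unit."""
    if node_memory <= 0:
        raise ValueError("node_memory must be positive")
    return 100.0 * footprint / node_memory


# Hypothetical node: 8 ranks at 629,146 KiB each, 128 GiB of main memory.
total = footprint_gib([629_146] * 8)
print(f"{total:.2f} GiB, {footprint_percent(total, 128.0):.2f}% of node memory")
```

The percentage can then be compared with the target range for the problem size in question, and "photons" adjusted in the input file until the footprint falls inside it.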


Results from Branson are provided on the following systems:

* Commodity Technology System 1 (CTS-1) with Intel Broadwell processors,
* IBM Power9 with Nvidia V100 GPU,
* Crossroads (see :ref:`GlobalSystemATS3`)
* Sierra (see :ref:`GlobalSystemATS2`)

CTS-1
Crossroads
------------
Strong scaling performance of Branson CTS-1 66M Particles is provided within the following table and
Strong scaling performance of Branson on Crossroads with 10M particles is provided within the following table and
figure.

.. csv-table:: Branson Strong Scaling Performance on CTS-1 66M particles
:file: cpu_66M.csv
.. csv-table:: Branson Strong Scaling Performance on Crossroads 10M particles
:file: cpu_10M.csv
:align: center
:widths: 10, 10, 10
:widths: 10, 10, 10, 10, 10
:header-rows: 1

.. figure:: cpu_66M.png
.. figure:: cpu_10M.png
:align: center
:scale: 50%
:alt: Branson Strong Scaling Performance on CTS-1 66M particles
:alt: Branson Strong Scaling Performance on Crossroads 10M particles

Branson Strong Scaling Performance on CTS-1 66M particles
Branson Strong Scaling Performance on Crossroads 10M particles

Strong scaling performance of Branson CTS-1 133M Particles is provided within the following table and
Strong scaling performance of Branson on Crossroads with 66M particles is provided within the following table and
figure.

.. csv-table:: Branson Strong Scaling Performance on CTS-1 133M particles
:file: cpu_133M.csv
.. csv-table:: Branson Strong Scaling Performance on Crossroads 66M particles
:file: cpu_66M.csv
:align: center
:widths: 10, 10, 10
:widths: 10, 10, 10, 10, 10
:header-rows: 1

.. figure:: cpu_133M.png
.. figure:: cpu_66M.png
:align: center
:scale: 50%
:alt: Branson Strong Scaling Performance on CTS-1 133M particles
:alt: Branson Strong Scaling Performance on Crossroads 66M particles

Branson Strong Scaling Performance on CTS-1 133M particles
Branson Strong Scaling Performance on Crossroads 66M particles

Strong scaling performance of Branson CTS-1 200M Particles is provided within the following table and
Strong scaling performance of Branson on Crossroads with 200M particles is provided within the following table and
figure.

.. csv-table:: Branson Strong Scaling Performance on CTS-1 200M particles
.. csv-table:: Branson Strong Scaling Performance on Crossroads 200M particles
:file: cpu_200M.csv
:align: center
:widths: 10, 10, 10
:widths: 10, 10, 10, 10, 10
:header-rows: 1

.. figure:: cpu_200M.png
:align: center
:scale: 50%
:alt: Branson Strong Scaling Performance on CTS-1 200M particles
:alt: Branson Strong Scaling Performance on Crossroads 200M particles

Branson Strong Scaling Performance on CTS-1 200M particles
Branson Strong Scaling Performance on Crossroads 200M particles
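
The "Ideal" column in the strong scaling tables appears consistent with linear scaling of the figure of merit from the smallest core count. The sketch below reproduces that calculation under this assumption and derives a parallel efficiency, a metric that is not defined in this documentation and is introduced here only for illustration. The function names are hypothetical; the input values come from the 8-core and 112-core rows of ``cpu_10M.csv``.

```python
def ideal_fom(base_fom: float, base_cores: int, cores: int) -> float:
    """Figure of merit under perfect linear scaling from the baseline run."""
    return base_fom * cores / base_cores


def parallel_efficiency(actual_fom: float, base_fom: float, base_cores: int, cores: int) -> float:
    """Ratio of measured to ideal figure of merit (1.0 means perfect scaling)."""
    return actual_fom / ideal_fom(base_fom, base_cores, cores)


# Values taken from the 8-core and 112-core rows of cpu_10M.csv.
ideal = ideal_fom(8.85e4, 8, 112)
eff = parallel_efficiency(9.08e5, 8.85e4, 8, 112)
print(f"ideal = {ideal:.3g} particles/sec, efficiency = {eff:.1%}")
```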

Power9+V100
Sierra
------------

Throughput performance of Branson on Power9+V100 is provided within the
Throughput performance of Branson on Sierra is provided within the
following table and figure.

.. csv-table:: Branson Throughput Performance on Power9+V100
.. csv-table:: Branson Throughput Performance on Sierra
:file: gpu.csv
:align: center
:widths: 10, 10
@@ -181,8 +192,8 @@ following table and figure.
.. figure:: gpu.png
:align: center
:scale: 50%
:alt: Branson Throughput Performance on Power9+V100
Branson Throughput Performance on Power9+V100
:alt: Branson Throughput Performance on Sierra
Branson Throughput Performance on Sierra


Verification of Results
@@ -191,4 +202,4 @@ Verification of Results
References
==========

.. [site] Alex R. Long, 'Branson', 2023. [Online]. Available: https://github.com/lanl/branson. [Accessed: 22- Feb- 2023]
.. [Branson] Alex R. Long, 'Branson', 2023. [Online]. Available: https://github.com/lanl/branson. [Accessed: 22- Feb- 2023]
35 changes: 35 additions & 0 deletions doc/sphinx/01_branson/cpu.gp
@@ -0,0 +1,35 @@
#!/usr/bin/gnuplot
set terminal pngcairo enhanced size 1024, 768 dashed font 'Helvetica,18'
set output "cpu_10M.png"

#set title "Branson Strong Scaling Performance on Crossroads, 10M particles" font "serif,22"
set xlabel "No. Processing Elements"
set ylabel "Figure of Merit (particles/sec)"

set xrange [8:112]
set key left top

set logscale x 2
set logscale y 2

set grid
show grid

set datafile separator comma
set key autotitle columnheader

set style line 1 linetype 6 dashtype 1 linecolor rgb "#FF0000" linewidth 2 pointtype 6 pointsize 3
set style line 2 linetype 1 dashtype 2 linecolor rgb "#FF0000" linewidth 2

plot "cpu_10M.csv" using 1:2 with linespoints linestyle 1, "" using 1:3 with line linestyle 2

set output "cpu_66M.png"
#set title "Branson Strong Scaling Performance on Crossroads, 66M particles" font "serif,22"
plot "cpu_66M.csv" using 1:2 with linespoints linestyle 1, "" using 1:3 with line linestyle 2


set output "cpu_200M.png"
#set title "Branson Strong Scaling Performance on Crossroads, 200M particles" font "serif,22"
plot "cpu_200M.csv" using 1:2 with linespoints linestyle 1, "" using 1:3 with line linestyle 2


6 changes: 6 additions & 0 deletions doc/sphinx/01_branson/cpu_10M.csv
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
No. Cores, Actual, Ideal, Memory (GB), Memory (%)
8,8.85e+04,8.85e+04, 4.8, 3.75
32,3.48e+05,3.54e+05, --, --
56,5.61e+05,6.19e+05, --, --
88,7.52e+05,9.73e+05, --, --
112,9.08e+05,1.24e+06, 52.27, 40.8
File renamed without changes.
6 changes: 6 additions & 0 deletions doc/sphinx/01_branson/cpu_200M.csv
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
No. Cores, Actual, Ideal, Memory (GB), Memory (%)
8,9.27E+04,9.27e+04, 26.026, 20.3
32,3.80E+05,3.80E+05, --, --
56,5.80E+05,6.44E+05, --, --
88,7.90E+05,1.01E+06, --, --
112,9.59E+05,1.29E+06, 73.46, 57.3
File renamed without changes.
6 changes: 6 additions & 0 deletions doc/sphinx/01_branson/cpu_66M.csv
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
No. Cores, Actual, Ideal, Memory (GB)
8,9.20E+04,9.20E+04, 11.04
32,3.62E+05,3.68E+05, --
56,5.76E+05,6.44E+05, --
88,7.74E+05,1.01E+06, --
112,9.52E+05,1.29E+06, 58.44
File renamed without changes.
File renamed without changes.
@@ -2,7 +2,7 @@
set terminal pngcairo enhanced size 1024, 768 dashed font 'Helvetica,18'
set output "gpu.png"

set title "Branson Throughput Performance on Power9+V100" font "serif,22"
#set title "Branson Throughput Performance on Sierra" font "serif,22"
set xlabel "No. Particles"
set ylabel "Figure of Merit (Particles/sec)"
