updates from latest
aaroncblack committed Mar 11, 2024
2 parents 848a42c + 6d9fa96 commit 92929da
Showing 86 changed files with 32,232 additions and 741 deletions.
5 changes: 5 additions & 0 deletions .gitignore
@@ -5,6 +5,11 @@ doc/sphinx/08_sparta/*.png
doc/sphinx/*/*.png
doc/sphinx/09_Microbenchmarks/*/*.png

doc/sphinx/07_miniem/PanzerMiniEM_BlockPrec.exe
doc/sphinx/07_miniem/*.xml
doc/sphinx/07_miniem/run-[0-9]*
doc/sphinx/07_miniem/runs--*

*.pyc
.vscode/
__pycache__/
3 changes: 3 additions & 0 deletions .gitmodules
@@ -21,6 +21,9 @@
path = microbenchmarks/spatter
url = [email protected]:lanl/spatter.git
branch = main
[submodule "miniem_build/spack"]
path = miniem_build/spack
url = [email protected]:spack/spack
[submodule "kokkos-tools"]
path = kokkos-tools
url = [email protected]:kokkos/kokkos-tools.git
22 changes: 11 additions & 11 deletions doc/sphinx/00_intro/introduction.rst
@@ -256,34 +256,34 @@ SSNI Weights and SSNI problem sizes
- **SSNI Weight**
- **SSNI Problem size - % device memory**
* - Branson
- TBD
- 10
- 25 to 30
* - AMG2023 Problem 1
- TBD
- 5
- 15 to 20
* - AMG2023 Problem 2
- TBD
- 5
- 15 to 20
* - MiniEM
- TBD
- 15
- TBD
* - MLMD Training
- TBD
- 5
- N/A
* - MLMD Simulation
- TBD
- 5
- 55 to 65
* - Parthenon-VIBE
- TBD
- 30
- 35 to 45
* - Sparta
- TBD
- TBD
- 10
- 50 to 60
* - UMT Problem 1
- TBD
- 7.5
- 45 to 55
* - UMT Problem 2
- TBD
- 7.5
- 45 to 55

Note: % of device memory is approximate; please note the actual memory footprint used.
52 changes: 33 additions & 19 deletions doc/sphinx/01_branson/branson.rst
@@ -89,6 +89,20 @@ Build requirements:

* If building a CUDA enabled version of Branson use the ``CUDADIR`` environment variable to specify your CUDA directory.

* If building for multi-node runs, use Metis for mesh partitioning. See README.md from Branson for more details. Single-node CPU and single-node GPU runs for SSNI should not use Metis.

To build Metis:

.. code-block:: bash

   cd <path/to/metis>
   make config cc=<C compiler> prefix=<install-location> shared=1
   make install

..

To build Branson:

.. code-block:: bash

   export CXX=`which g++`
@@ -294,25 +308,6 @@ Strong scaling performance of Branson Crossroads 200M Particles is provided with

Branson Strong Scaling Performance on Crossroads 200M particles

Multi-node scaling
------------------

The results of the scaling runs performed on Rocinante HBM partition nodes are presented below.
Branson was built with Intel oneAPI 2023.1.0 and cray-mpich 8.1.25.
These runs used 32, 64, and 96 nodes with 110 tasks per node.
Each run used 85 million photons per node, a problem size that occupies about 25% of the total available memory across nodes.

.. figure:: branson_roci_scale.png
   :align: center
   :scale: 50%
   :alt:

.. csv-table:: Multi Node Scaling Branson
   :file: branson_roci_scale.csv
   :align: center
   :widths: 10, 10, 10, 10
   :header-rows: 1

AMD Epyc + Nvidia A100
----------------------
Throughput performance of Branson on AMD Epyc + Nvidia A100 (using a single GPU) is provided within the
@@ -331,6 +326,25 @@ following table and figure.

Branson Throughput Performance on AMD Epyc + Nvidia A100

Multi-node scaling on Crossroads
================================

The results of the scaling runs performed on Rocinante HBM partition nodes are presented below.
Branson was built with Intel oneAPI 2023.1.0 and cray-mpich 8.1.25.
These runs used 32, 64, and 96 nodes with 110 tasks per node.
Each run used 85 million photons per node, a problem size that occupies about 25% of the total available memory across nodes.

.. figure:: branson_roci_scale.png
   :align: center
   :scale: 50%
   :alt:

.. csv-table:: Multi Node Scaling Branson
   :file: branson_roci_scale_header.csv
   :align: center
   :widths: 10, 10, 10, 10
   :header-rows: 1
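
The per-node figure of merit in the table above is the aggregate photon rate divided by the node count. The following is a minimal sketch of that arithmetic, not part of the Branson tooling: the function names ``per_node_fom`` and ``relative_efficiency`` are hypothetical, and the rates are copied by hand from ``branson_roci_scale_header.csv``. Because the photon count per node is held fixed at 85 million, the ratio against the 32-node run can be read as an approximate weak-scaling efficiency.

```python
# Illustrative only: helper names are hypothetical, values are copied
# from branson_roci_scale_header.csv (Nodes, Photons/s).
ROWS = [
    (32, 9.20e7),
    (64, 1.89e8),
    (96, 2.73e8),
]


def per_node_fom(nodes, photons_per_s):
    """Aggregate photon rate divided by the number of nodes."""
    return photons_per_s / nodes


def relative_efficiency(rows):
    """Per-node rate of each run relative to the first (smallest) run."""
    base_nodes, base_rate = rows[0]
    base = per_node_fom(base_nodes, base_rate)
    return {nodes: per_node_fom(nodes, rate) / base for nodes, rate in rows}


if __name__ == "__main__":
    efficiency = relative_efficiency(ROWS)
    for nodes, rate in ROWS:
        fom = per_node_fom(nodes, rate)
        print(f"{nodes:3d} nodes: {fom:.2e} photons/s/node, "
              f"{efficiency[nodes]:.3f} relative to {ROWS[0][0]} nodes")
```

A ratio near 1.0 for the larger node counts indicates that the per-node rate is roughly preserved as nodes are added.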

References
==========

4 changes: 4 additions & 0 deletions doc/sphinx/01_branson/branson_roci_badnodes_scale.csv
@@ -0,0 +1,4 @@
Iteration,Photons,Nodes,Photons/s,Photons/s/Node
1,2720,32,9.19E+07,2.87E+06
1,5440,64,1.92E+07,3.00E+05
1,8160,96,2.80E+07,2.91E+05
12 changes: 4 additions & 8 deletions doc/sphinx/01_branson/branson_roci_scale.csv
@@ -1,8 +1,4 @@
Photons,32,64,96
200,1.25e+08,2.42e+08,2.41e+08
400,1.26e+08,2.39e+08,3.42e+08
800,1.25e+08,2.39e+08,3.53e+08
1000,1.26e+08,2.37e+08,3.6e+08
2000,1.27e+08,2.46e+08,3.58e+08
4000,1.27e+08,2.52e+08,3.67e+08
8000,,2.54e+08,3.77e+08
Iteration,Photons,Nodes,Photons/s,Photons/s/Node
1,2720,32,9.20E+07,2.87E+06
1,5440,64,1.89E+08,2.95E+06
1,8160,96,2.73E+08,2.85E+06
4 changes: 4 additions & 0 deletions doc/sphinx/01_branson/branson_roci_scale_header.csv
@@ -0,0 +1,4 @@
Nodes,Photons,Photons/s,Photons/s/Node
32,2720,9.20E+07,2.87E+06
64,5440,1.89E+08,2.95E+06
96,8160,2.73E+08,2.85E+06
8 changes: 8 additions & 0 deletions doc/sphinx/01_branson/branson_roci_scale_photonrange.csv
@@ -0,0 +1,8 @@
Photons,32,64,96
200,1.25e+08,2.42e+08,2.41e+08
400,1.26e+08,2.39e+08,3.42e+08
800,1.25e+08,2.39e+08,3.53e+08
1000,1.26e+08,2.37e+08,3.6e+08
2000,1.27e+08,2.46e+08,3.58e+08
4000,1.27e+08,2.52e+08,3.67e+08
8000,,2.54e+08,3.77e+08
4 changes: 4 additions & 0 deletions doc/sphinx/01_branson/branson_roci_single_scale.csv
@@ -0,0 +1,4 @@
Photons (in M),8,32,56,88,112
10,3.9e+05,1.12e+06,1.57e+06,2.47e+06,3.2e+06
66,5.4e+05,1.12e+06,1.52e+06,2.4e+06,3.16e+06
200,5.56e+05,1.12e+06,1.51e+06,2.4e+06,3.19e+06
6 changes: 6 additions & 0 deletions doc/sphinx/01_branson/branson_roci_single_scale_new.csv
@@ -0,0 +1,6 @@
Nodes,10,66,200
8,3.9e+05,5.4e+05,5.56e+05
32,1.12e+06,1.12e+06,1.12e+06
56,1.57e+06,1.52e+06,1.51e+06
88,2.47e+06,2.4e+06,2.4e+06
112,3.2e+06,3.16e+06,3.19e+06
16 changes: 16 additions & 0 deletions doc/sphinx/01_branson/branson_single_scale.csv
@@ -0,0 +1,16 @@
Iteration,Photons,Nodes,Photons/s
1,10,8,389526.944845
1,10,32,1118661.454450
1,10,56,1573534.905253
1,10,88,2474867.718320
1,10,112,3199066.384466
1,66,8,540217.631093
1,66,32,1123452.119160
1,66,56,1518091.720066
1,66,88,2399810.545866
1,66,112,3160933.921874
1,200,8,555655.848349
1,200,32,1121140.510231
1,200,56,1514059.293424
1,200,88,2398118.800503
1,200,112,3191200.914483
6 changes: 6 additions & 0 deletions doc/sphinx/01_branson/branson_single_scale_ideal.csv
@@ -0,0 +1,6 @@
Nodes,10,66,200
8,3.9e+05,5.4e+05,5.56e+05
32,1.56e+06,2.16e+06,2.22e+06
56,2.73e+06,3.78e+06,3.89e+06
88,4.28e+06,5.94e+06,6.11e+06
112,5.45e+06,7.56e+06,7.78e+06
30 changes: 27 additions & 3 deletions doc/sphinx/01_branson/cpu.gp
@@ -32,10 +32,34 @@ set output "cpu_200M.png"
#set title "Branson Strong Scaling Performance on Crossroads, 200M particles" font "serif,22"
plot "cpu_200M.csv" using 1:2 with linespoints linestyle 1, "" using 1:3 with line linestyle 2

set output "cpu_10M_new.png"
plot "cpu_10M_new.csv" using 1:2 with linespoints linestyle 1, "" using 1:3 with line linestyle 2

set output "cpu_66M_new.png"
plot "cpu_66M_new.csv" using 1:2 with linespoints linestyle 1, "" using 1:3 with line linestyle 2

set output "cpu_200M_new.png"
plot "cpu_200M_new.csv" using 1:2 with linespoints linestyle 1, "" using 1:3 with line linestyle 2

# Scaling Output
set output "branson_roci_scale.png"
set output "branson_roci_scale_range.png"
set xrange [200:8000]
set format y "%.1e"
unset logscale xy
set key title "Number of Nodes"
plot "branson_roci_scale.csv" using 1:2 with linespoints linestyle 1, "" using 1:3 with line linestyle 2, "" using 1:4 with line linestyle 3
set key title "Nodes"
set title "Branson Multi Node Scaling" font "serif,22"
plot "branson_roci_scale_photonrange.csv" using 1:2 with linespoints linestyle 1, "" using 1:3 with line linestyle 2, "" using 1:4 with line linestyle 3

# SCALING PLOTS, Y IS FOM PER NODE
set xrange [32:96]
set yrange [2.5e6:3.5e6]
set xlabel "Nodes"
set ylabel "FOM/node"
# set title "Branson Multi Node Scaling" font "serif,22"
set output "branson_roci_scale.png"
plot "branson_roci_scale.csv" using 3:5 with linespoints linestyle 1

set yrange [2e5:3e6]
set output "branson_roci_scale_badnodes.png"
set title "Branson Multi Node Scaling" font "serif,22"
plot "branson_roci_badnodes_scale.csv" using 3:5 with linespoints linestyle 1
6 changes: 6 additions & 0 deletions doc/sphinx/01_branson/cpu_10M_new.csv
@@ -0,0 +1,6 @@
Nodes,Actual,Ideal,Memory GB,Memory %
8,3.9e+05,3.9e+05,3,2.9
32,1.12e+06,1.56e+06,5,4.08
56,1.57e+06,2.73e+06,6,5.16
88,2.47e+06,4.28e+06,8,6.47
112,3.2e+06,5.45e+06,9,7.69
6 changes: 6 additions & 0 deletions doc/sphinx/01_branson/cpu_200M_new.csv
@@ -0,0 +1,6 @@
Nodes,Actual,Ideal,Memory GB,Memory %
8,5.56e+05,5.56e+05,46,37.07
32,1.12e+06,2.22e+06,48,39.03
56,1.51e+06,3.89e+06,50,40.23
88,2.4e+06,6.11e+06,51,41.54
112,3.19e+06,7.78e+06,53,42.75
6 changes: 6 additions & 0 deletions doc/sphinx/01_branson/cpu_66M_new.csv
@@ -0,0 +1,6 @@
Nodes,Actual,Ideal,Memory GB,Memory %
8,5.4e+05,5.4e+05,16,13.05
32,1.12e+06,2.16e+06,18,14.5
56,1.52e+06,3.78e+06,19,15.42
88,2.4e+06,5.94e+06,20,16.72
112,3.16e+06,7.56e+06,22,17.72
42 changes: 20 additions & 22 deletions doc/sphinx/02_amg/amg.rst
@@ -346,28 +346,6 @@ Approximate results of the FOM for varying memory usages on Crossroads are provi

Varying memory usage (estimated) for Problem 1 and 2


Multi-node scaling on Crossroads
================================

The results of the scaling runs performed on the Rocinante HBM partition are presented below.
AMG and hypre were built with Intel oneAPI 2023.1.0 and cray-mpich 8.1.25.
These runs used 32, 64, and 96 nodes with 108 tasks per node.
Problems 1 and 2 were run with problem sizes per MPI process, ``-n``, of 25,25,125 and 40,40,200 respectively to use 15% of available memory.
The product of the x, y, and z dimensions of the process topology must equal the number of MPI processes.
In this case, x=y=24 for all node counts and z was set to 6, 12, and 18 for 32, 64, and 96 nodes respectively.

.. figure:: cpu_scale_roci.png
   :align: center
   :scale: 50%
   :alt:

.. csv-table:: Multi Node Scaling AMG problem 1 and 2
   :file: amg_scale_roci.csv
   :align: center
   :widths: 10, 10, 10
   :header-rows: 1

V-100
=====

@@ -413,6 +391,26 @@ The FOMs of AMG2023 on V100 for Problem 2 is provided in the following table and

AMG2023 FOM on V100 for Problem 2 (7-pt stencil, AMG-PCG)

Multi-node scaling on Crossroads
================================

The results of the scaling runs performed on the Rocinante HBM partition are presented below.
AMG and hypre were built with Intel oneAPI 2023.1.0 and cray-mpich 8.1.25.
These runs used 32, 64, and 96 nodes with 108 tasks per node.
Problems 1 and 2 were run with problem sizes per MPI process, ``-n``, of 38,38,38 and 60,60,60 respectively to use roughly 15% of available memory while maintaining a cubic grid.
The product of the x, y, and z dimensions of the process topology must equal the number of MPI processes.
In this case, x=y=24 for all node counts and z was set to 6, 12, and 18 for 32, 64, and 96 nodes respectively.
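
The topology constraint stated above can be checked before submitting a job. The following is a minimal sketch under the assumption that every node runs the same number of tasks; the function name ``topology_matches`` is hypothetical and is not part of AMG2023 or hypre.

```python
# Illustrative only: topology_matches is a hypothetical helper that
# checks px * py * pz against the total number of MPI processes.
TASKS_PER_NODE = 108

# (nodes, (px, py, pz)) combinations quoted in the text above.
RUNS = [
    (32, (24, 24, 6)),
    (64, (24, 24, 12)),
    (96, (24, 24, 18)),
]


def topology_matches(px, py, pz, nodes, tasks_per_node):
    """Return True when the process grid uses every MPI process exactly once."""
    return px * py * pz == nodes * tasks_per_node


if __name__ == "__main__":
    for nodes, (px, py, pz) in RUNS:
        ok = topology_matches(px, py, pz, nodes, TASKS_PER_NODE)
        print(f"{nodes:3d} nodes: {px} x {py} x {pz} = {px * py * pz} "
              f"processes, matches = {ok}")
```

For example, 24 x 24 x 6 = 3456, which equals 32 nodes times 108 tasks per node.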

.. figure:: cpu_scale_roci_cubes.png
   :align: center
   :scale: 50%
   :alt:

.. csv-table:: Multi Node Scaling AMG problem 1 and 2
   :file: amg_scale_roci_cubes_pernode.csv
   :align: center
   :widths: 10, 10, 10, 10, 10
   :header-rows: 1

References
==========
8 changes: 4 additions & 4 deletions doc/sphinx/02_amg/amg_scale_roci.csv
@@ -1,4 +1,4 @@
NumNodes,Problem1,Problem2
96,7.3e+09,3.01e+09
64,5.44e+09,2.09e+09
32,3.29e+09,1.19e+09
NumNodes,Problem1,Problem2,Problem1/Node,Problem2/Node
96,7.30E+09,3.01E+09,7.60E+07,3.14E+07
64,5.44E+09,2.09E+09,8.50E+07,3.27E+07
32,3.29E+09,1.19E+09,1.03E+08,3.72E+07
4 changes: 4 additions & 0 deletions doc/sphinx/02_amg/amg_scale_roci_cubes.csv
@@ -0,0 +1,4 @@
Nodes,Problem1,Problem2
32,6.636799e+09,2.000133e+09
64,2.034274e+09,3.288118e+08
96,2.840158e+09,4.669072e+08
4 changes: 4 additions & 0 deletions doc/sphinx/02_amg/amg_scale_roci_cubes_pernode.csv
@@ -0,0 +1,4 @@
Nodes,Problem1,Problem2,Problem1/Node,Problem2/Node
32,6.64e+09,2e+09,2.07e+08,6.25e+07
64,2.03e+09,3.29e+08,3.18e+07,5.14e+06
96,2.84e+09,4.67e+08,2.96e+07,4.86e+06
4 changes: 4 additions & 0 deletions doc/sphinx/02_amg/amg_scale_roci_header.csv
@@ -0,0 +1,4 @@
Nodes,Problem1,Problem2,Problem1/Node,Problem2/Node
32,3.29E+09,1.19E+09,1.03E+08,3.72E+07
64,5.44E+09,2.09E+09,8.50E+07,3.27E+07
96,7.30E+09,3.01E+09,7.60E+07,3.14E+07
15 changes: 9 additions & 6 deletions doc/sphinx/02_amg/cpu.gp
@@ -44,11 +44,14 @@ set output "roci_2_320.png"
set title "AMG2023 Strong Scaling for Problem 2, 320 x 320 x 320" font "serif,22"
plot "roci_2_320.csv" using 1:2 with linespoints linestyle 1, "" using 1:3 with line linestyle 2

# SCALING PLOTS, Y IS FOM PER NODE
unset logscale xy
set xrange [32:96]
set xlabel "Number of Nodes"
set yrange [1e5:3e8]
set xlabel "Nodes"
set format y "%.1e"
unset logscale xy
set output "cpu_scale_roci.png"
set title "AMG Multi Node Scaling" font "serif,22"
plot "amg_scale_roci.csv" using 1:2 with linespoints linestyle 1, "" using 1:3 with line linestyle 2

set ylabel "FOM/node"
unset title
set output "cpu_scale_roci_cubes.png"
# set title "AMG Multi Node Scaling" font "serif,22"
plot "amg_scale_roci_cubes_pernode.csv" using 1:4 with linespoints linestyle 1 title "Problem 1", "" using 1:5 with line linestyle 2 title "Problem 2"