update parthenon perf numbers and SSNI baseline draft, other minor fixes
gshipman committed Mar 17, 2024
1 parent d203e5c commit 6f76dad
Showing 6 changed files with 27 additions and 28 deletions.
Binary file modified doc/sphinx/00_intro/SSNI-baseline-draft.xlsx
Binary file not shown.
6 changes: 3 additions & 3 deletions doc/sphinx/03_vibe/cpu_20.csv
@@ -1,4 +1,4 @@
 No. Cores, Actual, Ideal
-8, 2.00e+06, 2.0e+06
-32, 7.40e+06, 8.0e+06
-56, 1.29e+07, 1.4e+07
+8, 3.40e+06, 3.40e+06
+32, 1.19e+07, 1.36e+07
+56, 1.88e+07, 2.38e+07
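
Note on the CSV columns: in every row of these files, the Ideal value equals the 8-core Actual value scaled linearly by the core count. The sketch below reproduces that relationship for the updated cpu_20.csv rows and reports the resulting parallel efficiency (Actual / Ideal). The linear-scaling rule is inferred from the numbers in this diff, not stated in it, and the function name `ideal_and_efficiency` is hypothetical.

```python
# Updated rows from cpu_20.csv in this commit: (core count, actual throughput).
rows = [(8, 3.40e6), (32, 1.19e7), (56, 1.88e7)]


def ideal_and_efficiency(rows):
    """Return (cores, ideal, efficiency) per row.

    Assumption (inferred from the CSV values): the ideal throughput is the
    first row's actual throughput scaled linearly with the core count.
    """
    base_cores, base_actual = rows[0]
    out = []
    for cores, actual in rows:
        ideal = base_actual * cores / base_cores
        out.append((cores, ideal, actual / ideal))
    return out


results = ideal_and_efficiency(rows)
for cores, ideal, eff in results:
    print(f"{cores:4d} cores: ideal {ideal:.3g}, efficiency {eff:.1%}")
```

The computed ideal values for 32 and 56 cores match the Ideal column committed here (1.36e+07 and 2.38e+07), which supports the inferred rule for this file.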
10 changes: 5 additions & 5 deletions doc/sphinx/03_vibe/cpu_40.csv
@@ -1,6 +1,6 @@
 No. Cores, Actual, Ideal
-8, 1.82e+06, 1.82e+06
-32, 7.04e+06, 7.28e+06
-56, 1.21e+07, 1.274e+07
-88, 1.60e+07, 2.02e+07
-112, 2.00e+07, 2.548e+07
+8, 2.80e+06, 2.80e+06
+32, 1.12e+07, 1.12e+07
+56, 1.79e+07, 1.96e+07
+88, 2.36e+07, 3.08e+07
+112, 2.61e+07, 3.92e+07
10 changes: 5 additions & 5 deletions doc/sphinx/03_vibe/cpu_60.csv
@@ -1,6 +1,6 @@
 No. Cores, Actual, Ideal
-8, 1.51e+06, 1.51e+06
-32, 6.34e+06, 6.04e+06
-56, 1.09e+07, 1.057e+07
-88, 1.55e+07, 1.661e+07
-112, 1.85e+07, 2.114e+07
+8, 2.40e+06, 2.40e+06
+32, 9.56e+06, 9.60e+06
+56, 1.54e+07, 1.68e+07
+88, 2.16e+07, 2.64e+07
+112, 2.44e+07, 3.36e+07
10 changes: 5 additions & 5 deletions doc/sphinx/03_vibe/gpu.csv
@@ -1,7 +1,7 @@
 Mesh Base Size, Actual
-32, 1.75e+07
-64, 1.15e+07
-96, 6.78e+06
-128, 0
-160, 0
+32, 2.88e+07
+64, 2.19e+07
+96, 1.41e+07
+128, 1.36e+07
+160, 1.03e+07
 192, 0
19 changes: 9 additions & 10 deletions doc/sphinx/03_vibe/vibe.rst
@@ -67,8 +67,7 @@ To build Parthenon on CPU, including this benchmark, with minimal external depen
 .. code-block:: bash
    parthenon$ mkdir build && cd build
    build$ export CXXFLAGS="-fno-math-errno -march=native"
-   build$ cmake -DPARTHENON_DISABLE_HDF5=ON -DPARTHENON_ENABLE_PYTHON_MODULE_CHECK=OFF -DREGRESSION_GOLD_STANDARD_SYNC=OFF -DCMAKE_BUILD_TYPE=Release ../
+   build$ cmake -DPARTHENON_DISABLE_HDF5=ON -DPARTHENON_ENABLE_PYTHON_MODULE_CHECK=OFF -DREGRESSION_GOLD_STANDARD_SYNC=OFF -DPARTHENON_ENABLE_TESTING=OFF -DCMAKE_BUILD_TYPE=Release ../
    build$ make -j
 ..
@@ -81,11 +80,11 @@ On Crossroads the relevant modules for the results shown here are:
 ..
-To build for execution on a single GPU, it should be sufficient to add the following flags to the CMake configuration line
+To build for execution on a single GPU, it should be sufficient to add flags similar to the CMake configuration line
 
 .. code-block:: bash
-   cmake -DPARTHENON_DISABLE_MPI=ON -DKokkos_ENABLE_CUDA=ON -DKokkos_ARCH_AMPERE80=ON
+   cmake -DKokkos_ENABLE_CUDA=ON -DKokkos_ARCH_AMPERE80=ON
 ..
@@ -123,7 +122,7 @@ The results presented here use 128 and 160 for memory footprints of approximate
 Results from Parthenon are provided on the following systems:
 * Crossroads (see :ref:`GlobalSystemATS3`)
-* An Nvidia A100 GPU hosted on an [Nvidia Arm HPC Developer Kit](https://developer.nvidia.com/arm-hpc-devkit)
+* A Grace Hopper (Grace ARM CPU 72 cores with 120GB, H100 GPU with 96GB)
 The mesh and meshblock size parameters are chosen to balance
 realism/performance with memory footprint. For the following tests we
@@ -182,12 +181,12 @@ Crossroads
 VIBE Throughput Performance on Crossroads using ~60% memory
-Nvidia testbed with A100
+Nvidia Grace Hopper
 ------------------------
-Throughput performance of Parthenon-VIBE on a 40GB A100 is provided within the following table and figure.
+Throughput performance of Parthenon-VIBE on a 96 GB H100 is provided within the following table and figure.
-.. csv-table:: VIBE Throughput Performance on A100
+.. csv-table:: VIBE Throughput Performance on H100
    :file: gpu.csv
    :align: center
    :widths: 10, 10
@@ -196,9 +195,9 @@ Throughput performance of Parthenon-VIBE on a 40GB A100 is provided within the f
 .. figure:: gpu.png
    :align: center
    :scale: 50%
-   :alt: VIBE Throughput Performance on A100
+   :alt: VIBE Throughput Performance on H100
-   VIBE Throughput Performance on A100
+   VIBE Throughput Performance on H100
 Multi-node scaling on Crossroads