Run benchmarks weekly in CI #1245

Merged
merged 25 commits into from
Nov 4, 2024
Commits
25 commits
5110a63
Create run benchmarks script and setup in CI
chapman39 Oct 9, 2024
f1e6916
Use environment variables for host configs and cmake args
chapman39 Oct 9, 2024
a6702b1
find host config path
chapman39 Oct 9, 2024
7c95a15
Mention path to shared caliper files
chapman39 Oct 9, 2024
1a1097b
increase ci benchmark alloc time
chapman39 Oct 9, 2024
0e93214
Check if petsc is enabled before running petsc multigrid
chapman39 Oct 10, 2024
8849962
return result of benchmarks
chapman39 Oct 10, 2024
77706dd
increase deadline time to allow for longer jobs to stay on the queue …
chapman39 Oct 10, 2024
e5af250
Merge remote-tracking branch 'origin/develop' into feature/chapman39/…
chapman39 Oct 25, 2024
5d266ef
Ensure all cray jobs are disabled until all developers have access
chapman39 Oct 25, 2024
d0fd0c3
increase alloc time on benchmarks now that they take longer
chapman39 Oct 25, 2024
cbcc38f
add alloc_deadline variable to all toss4 jobs
chapman39 Oct 25, 2024
9d95042
Merge branch 'develop' into feature/chapman39/run-benchmarks-ci
chapman39 Oct 28, 2024
f337dde
build benchmarks as release
chapman39 Oct 28, 2024
dd05693
Merge branch 'feature/chapman39/run-benchmarks-ci' of github.com:LLNL…
chapman39 Oct 28, 2024
faf3ff9
increase benchmark ctests to be 1.5 hours per job
chapman39 Oct 29, 2024
cfd2d33
Merge branch 'develop' into feature/chapman39/run-benchmarks-ci
chapman39 Oct 29, 2024
2ba2921
comment
chapman39 Oct 29, 2024
1853080
Merge branch 'feature/chapman39/run-benchmarks-ci' of github.com:LLNL…
chapman39 Oct 29, 2024
935cfbb
Merge branch 'develop' into feature/chapman39/run-benchmarks-ci
chapman39 Oct 29, 2024
9c1b3e4
set property to actual test not exec
chapman39 Oct 29, 2024
65a2769
use single variable to switch between ci pipelines
chapman39 Nov 4, 2024
2643068
set build_dir in a more straightforward way
chapman39 Nov 4, 2024
6a0efc1
attempt to improve variable names
chapman39 Nov 4, 2024
53cdc0c
fix on-push ci
chapman39 Nov 4, 2024
30 changes: 27 additions & 3 deletions .gitlab-ci.yml
@@ -17,13 +17,20 @@ variables:
.run_update_uberenv: &run_update_uberenv |
[[ -n "${UPDATE_UBERENV}" ]] && ./scripts/gitlab/update-uberenv.sh "${UPDATE_UBERENV}"

# Run src build each push
.src_workflow:
rules:
-    - if: '$FULL_BUILD != "ON"'
+    - if: $SERAC_CI_WORKFLOW_TYPE != "full" && $SERAC_CI_WORKFLOW_TYPE != "benchmarks"

# Run full build as a nightly scheduled pipeline
.full_workflow:
rules:
-    - if: '$FULL_BUILD == "ON"'
+    - if: $SERAC_CI_WORKFLOW_TYPE == "full"

# Run benchmarks build as a weekly scheduled pipeline
.benchmarks_workflow:
rules:
- if: $SERAC_CI_WORKFLOW_TYPE == "benchmarks"

####
# Templates
@@ -66,10 +73,27 @@ variables:
reports:
junit: ${FULL_BUILD_ROOT}/${SYS_TYPE}/*/_serac_build_and_test_*/build-*/junit.xml


.benchmarks_build_script:
script:
# Builds src, runs benchmarks, and stores Caliper files in shared location
- echo -e "section_start:$(date +%s):benchmarks_build\r\e[0K
Benchmarks Build ${CI_PROJECT_NAME}"
- ${ALLOC_COMMAND} python3 scripts/llnl/run_benchmarks.py
- echo -e "section_end:$(date +%s):benchmarks_build\r\e[0K"
artifacts:
expire_in: 2 weeks
when: always
paths:
- _serac_build_and_test_*/output.log*.txt
- _serac_build_and_test_*/build-*/output.log*.txt
- _serac_build_and_test_*/build-*/*.cali

# This is where jobs are included for each system
include:
- local: .gitlab/build_blueos.yml
- local: .gitlab/build_toss4.yml
-  - local: .gitlab/build_toss4_cray.yml
+  # Disabling cray until all developers have access to the tioga machine
+  # - local: .gitlab/build_toss4_cray.yml
- project: 'lc-templates/id_tokens'
file: 'id_tokens.yml'
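
The three rule sets in this file partition pipelines by the value of `SERAC_CI_WORKFLOW_TYPE`. The selection can be sketched in Python as below. This is only an illustration of the logic, not part of the PR: the function name `select_workflow` is hypothetical, and the sketch assumes that an unset variable compares as not equal to both strings, so it falls through to the src workflow.

```python
def select_workflow(workflow_type):
    """Return the rule set that would match a given SERAC_CI_WORKFLOW_TYPE value.

    Hypothetical helper mirroring the three `rules:` blocks in .gitlab-ci.yml.
    """
    if workflow_type == "full":
        return "full"
    if workflow_type == "benchmarks":
        return "benchmarks"
    # Anything else (including an unset variable, modeled here as None) is
    # neither "full" nor "benchmarks", so the src rule is the one that matches.
    return "src"


for value in (None, "full", "benchmarks"):
    print(value, "->", select_workflow(value))
```

Because the src rule is written as the negation of the other two, exactly one workflow matches for any value of the variable.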
12 changes: 12 additions & 0 deletions .gitlab/build_blueos.yml
@@ -37,6 +37,10 @@
extends: [.full_build_script, .on_blueos, .full_workflow]
needs: []

.benchmarks_build_on_blueos:
extends: [.benchmarks_build_script, .on_blueos, .benchmarks_workflow]
needs: []

####
# Build jobs
blueos-clang_10_0_1-src:
@@ -58,3 +62,11 @@ blueos-clang_10_0_1-full:
ALLOC_NODES: "1"
ALLOC_TIME: "55"
extends: [.full_build_on_blueos, .with_cuda]

blueos-clang_10_0_1-benchmarks:
variables:
COMPILER: "clang@10.0.1"
HOST_CONFIG: "lassen-blueos_3_ppc64le_ib_p9-${COMPILER}_cuda.cmake"
ALLOC_NODES: "1"
ALLOC_TIME: "120"
extends: [.benchmarks_build_on_blueos, .with_cuda]
32 changes: 29 additions & 3 deletions .gitlab/build_toss4.yml
@@ -2,7 +2,7 @@
# This is the shared configuration of jobs for toss4
.on_toss4:
variables:
-    SCHEDULER_PARAMETERS: "--res=ci --exclusive=user --deadline=now+1hour -N ${ALLOC_NODES} -t ${ALLOC_TIME} -A ${ALLOC_BANK}"
+    SCHEDULER_PARAMETERS: "--res=ci --exclusive=user --deadline=now+${ALLOC_DEADLINE}minutes -N ${ALLOC_NODES} -t ${ALLOC_TIME} -A ${ALLOC_BANK}"
tags:
- batch
- ruby
@@ -28,6 +28,10 @@
# LC version of pip is ancient
- if [[ $(python3 -c 'import pip; print(pip.__version__ < "19.3")') == "True" ]]; then python3 -m pip install --user --upgrade pip; fi

.benchmarks_build_on_toss4:
extends: [.benchmarks_build_script, .on_toss4, .benchmarks_workflow]
needs: []


####
# Build jobs
@@ -41,6 +45,7 @@ toss4-clang_14_0_6-src:
DO_INTEGRATION_TESTS: "yes"
ALLOC_NODES: "2"
ALLOC_TIME: "30"
ALLOC_DEADLINE: "60"
extends: .src_build_on_toss4

toss4-gcc_10_3_1-src:
@@ -50,6 +55,7 @@ toss4-gcc_10_3_1-src:
EXTRA_CMAKE_OPTIONS: "-DENABLE_BENCHMARKS=ON"
ALLOC_NODES: "1"
ALLOC_TIME: "30"
ALLOC_DEADLINE: "60"
extends: .src_build_on_toss4

toss4-gcc_10_3_1-src-no-tribol:
@@ -59,6 +65,7 @@ toss4-gcc_10_3_1-src-no-tribol:
EXTRA_CMAKE_OPTIONS: "-DENABLE_BENCHMARKS=ON -UTRIBOL_DIR"
ALLOC_NODES: "1"
ALLOC_TIME: "30"
ALLOC_DEADLINE: "60"
extends: .src_build_on_toss4

toss4-gcc_10_3_1-src-no-optional-solvers:
@@ -68,6 +75,7 @@ toss4-gcc_10_3_1-src-no-optional-solvers:
EXTRA_CMAKE_OPTIONS: "-DENABLE_BENCHMARKS=ON -USUNDIALS_DIR -UPETSC_DIR"
ALLOC_NODES: "1"
ALLOC_TIME: "20"
ALLOC_DEADLINE: "60"
extends: .src_build_on_toss4

toss4-clang_14_0_6-full:
@@ -76,7 +84,7 @@ toss4-clang_14_0_6-full:
SPEC: "--spec=%${COMPILER}"
ALLOC_NODES: "1"
ALLOC_TIME: "45"
-    EXTRA_CMAKE_OPTIONS: "-DENABLE_BENCHMARKS=ON"
+    ALLOC_DEADLINE: "60"
extends: .full_build_on_toss4

toss4-gcc_10_3_1-full:
@@ -85,5 +93,23 @@ toss4-gcc_10_3_1-full:
SPEC: "--spec=%${COMPILER}"
ALLOC_NODES: "1"
ALLOC_TIME: "45"
-    EXTRA_CMAKE_OPTIONS: "-DENABLE_BENCHMARKS=ON"
+    ALLOC_DEADLINE: "60"
extends: .full_build_on_toss4

toss4-clang_14_0_6-benchmarks:
variables:
COMPILER: "clang@14.0.6"
HOST_CONFIG: "ruby-toss_4_x86_64_ib-${COMPILER}.cmake"
ALLOC_NODES: "1"
ALLOC_TIME: "120"
ALLOC_DEADLINE: "180"
extends: .benchmarks_build_on_toss4

toss4-gcc_10_3_1-benchmarks:
variables:
COMPILER: "gcc@10.3.1"
HOST_CONFIG: "ruby-toss_4_x86_64_ib-${COMPILER}.cmake"
ALLOC_NODES: "1"
ALLOC_TIME: "120"
ALLOC_DEADLINE: "180"
extends: .benchmarks_build_on_toss4
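
With this change the scheduler deadline is derived from a per-job `ALLOC_DEADLINE` (in minutes) instead of a fixed one hour, so longer benchmark jobs can stay on the queue. The expansion of `SCHEDULER_PARAMETERS` for a benchmarks job can be sketched as follows. This is an illustration only: `scheduler_parameters` is a hypothetical helper, and the `ALLOC_BANK` value is a placeholder because the real bank name does not appear in this diff.

```python
from string import Template

# Same string as SCHEDULER_PARAMETERS in .gitlab/build_toss4.yml.
SCHEDULER_TEMPLATE = Template(
    "--res=ci --exclusive=user --deadline=now+${ALLOC_DEADLINE}minutes "
    "-N ${ALLOC_NODES} -t ${ALLOC_TIME} -A ${ALLOC_BANK}"
)


def scheduler_parameters(job_variables):
    """Expand the template with one job's variables (hypothetical helper)."""
    return SCHEDULER_TEMPLATE.substitute(job_variables)


# Values taken from the toss4 benchmarks jobs; ALLOC_BANK is a placeholder.
benchmarks_job = {
    "ALLOC_NODES": "1",
    "ALLOC_TIME": "120",
    "ALLOC_DEADLINE": "180",
    "ALLOC_BANK": "example_bank",
}

params = scheduler_parameters(benchmarks_job)
print(params)
```

For the src and full jobs the same template yields `--deadline=now+60minutes`, which matches the previous hard-coded `now+1hour`.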
31 changes: 20 additions & 11 deletions .gitlab/build_toss4_cray.yml
@@ -25,25 +25,34 @@
# LC version of pip is ancient
- if [[ $(python3 -c 'import pip; print(pip.__version__ < "19.3")') == "True" ]]; then python3 -m pip install --user --upgrade pip; fi

.benchmarks_build_on_toss4_cray:
extends: [.benchmarks_build_script, .on_toss4_cray, .benchmarks_workflow]
needs: []


####
# Build jobs

-# Only run integration tests on one spec
-#toss4_cray-clang_17_0_0-src:
-#  variables:
-#    COMPILER: "clang@17.0.0"
-#    HOST_CONFIG: "tioga-toss_4_x86_64_ib_cray-${COMPILER}_hip.cmake"
-#    EXTRA_CMAKE_OPTIONS: "-DENABLE_BENCHMARKS=ON"
-#    ALLOC_NODES: "1"
-#    ALLOC_TIME: "30"
-#  extends: .src_build_on_toss4_cray
+toss4_cray-clang_17_0_0-src:
+  variables:
+    COMPILER: "clang@17.0.0"
+    HOST_CONFIG: "tioga-toss_4_x86_64_ib_cray-${COMPILER}_hip.cmake"
+    EXTRA_CMAKE_OPTIONS: "-DENABLE_BENCHMARKS=ON"
+    ALLOC_NODES: "1"
+    ALLOC_TIME: "30"
+  extends: .src_build_on_toss4_cray

toss4_cray-clang_17_0_0-full:
variables:
COMPILER: "clang@17.0.0"
SPEC: "--spec=%${COMPILER}"
ALLOC_NODES: "1"
ALLOC_TIME: "45"
-    EXTRA_CMAKE_OPTIONS: "-DENABLE_BENCHMARKS=ON"
extends: .full_build_on_toss4_cray

toss4_cray-clang_17_0_0-benchmarks:
variables:
COMPILER: "clang@17.0.0"
HOST_CONFIG: "tioga-toss_4_x86_64_ib_cray-${COMPILER}_hip.cmake"
ALLOC_NODES: "1"
ALLOC_TIME: "120"
extends: .benchmarks_build_on_toss4_cray
20 changes: 1 addition & 19 deletions scripts/llnl/build_src.py
@@ -132,25 +132,7 @@ def main():
compiler = compiler.rsplit('-', 1)[0]
hostconfig = "%s-%s-%s.cmake" % (hostname, sys_type, compiler)

-    # First try with where uberenv generates host-configs.
-    hostconfig_path = os.path.join(repo_dir, hostconfig)
-    if not os.path.isfile(hostconfig_path):
-        print("[INFO: Looking for hostconfig at %s]" % hostconfig_path)
-        print("[WARNING: Spack generated host-config not found, trying with predefined]")
-
-        # Then look into project predefined host-configs.
-        hostconfig_path = os.path.join(repo_dir, "host-configs", hostconfig)
-        if not os.path.isfile(hostconfig_path):
-            print("[INFO: Looking for hostconfig at %s]" % hostconfig_path)
-            print("[WARNING: Predefined host-config not found, trying with Docker]")
-
-            # Otherwise look into project predefined Docker host-configs.
-            hostconfig_path = os.path.join(repo_dir, "host-configs", "docker", hostconfig)
-            if not os.path.isfile(hostconfig_path):
-                print("[INFO: Looking for hostconfig at %s]" % hostconfig_path)
-                print("[WARNING: Predefined Docker host-config not found]")
-                print("[ERROR: Could not find any host-configs in any known path. Try giving fully qualified path.]")
-                return 1
+    hostconfig_path = get_host_config_path(repo_dir, hostconfig)

test_root = get_build_and_test_root(repo_dir, timestamp)
os.mkdir(test_root)
24 changes: 24 additions & 0 deletions scripts/llnl/common_build_functions.py
@@ -713,3 +713,27 @@ def convertSecondsToReadableTime(seconds):
m, s = divmod(seconds, 60)
h, m = divmod(m, 60)
return "%d:%02d:%02d" % (h, m, s)


def get_host_config_path(repo_dir, host_config):
    # First try with where uberenv generates host-configs.
    host_config_path = os.path.join(repo_dir, host_config)
    if not os.path.isfile(host_config_path):
        print("[INFO: Looking for host_config at %s]" % host_config_path)
        print("[WARNING: Spack generated host-config not found, trying with predefined]")

        # Then look into project predefined host-configs.
        host_config_path = os.path.join(repo_dir, "host-configs", host_config)
        if not os.path.isfile(host_config_path):
            print("[INFO: Looking for host_config at %s]" % host_config_path)
            print("[WARNING: Predefined host-config not found, trying with Docker]")

            # Otherwise look into project predefined Docker host-configs.
            host_config_path = os.path.join(repo_dir, "host-configs", "docker", host_config)
            if not os.path.isfile(host_config_path):
                print("[INFO: Looking for host_config at %s]" % host_config_path)
                print("[WARNING: Predefined Docker host-config not found]")
                print("[ERROR: Could not find any host-configs in any known path. Try giving fully qualified path.]")
                sys.exit(1)

    return host_config_path
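
The lookup order used by `get_host_config_path` (repository root, then `host-configs`, then `host-configs/docker`) can also be written as a loop over candidate paths. The following is a minimal sketch of that idea, not the code in this PR: `find_host_config` is a hypothetical name, it returns `None` instead of calling `sys.exit(1)`, and the file name used in the demo is made up.

```python
import os
import tempfile


def find_host_config(repo_dir, host_config):
    """Return the first existing host-config path, or None if none is found."""
    candidates = [
        os.path.join(repo_dir, host_config),                            # uberenv-generated
        os.path.join(repo_dir, "host-configs", host_config),            # predefined
        os.path.join(repo_dir, "host-configs", "docker", host_config),  # predefined Docker
    ]
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None


# Demo in a throwaway directory that stands in for the repository.
with tempfile.TemporaryDirectory() as repo_dir:
    name = "example-host-config.cmake"  # hypothetical file name
    os.makedirs(os.path.join(repo_dir, "host-configs"))
    open(os.path.join(repo_dir, "host-configs", name), "w").close()

    found = find_host_config(repo_dir, name)
    found_relative = os.path.relpath(found, repo_dir)
    missing = find_host_config(repo_dir, "missing.cmake")

print(found_relative)
print(missing)
```

Returning `None` leaves the decision to the caller; the version in the PR exits the process directly, which keeps both calling scripts short.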
103 changes: 103 additions & 0 deletions scripts/llnl/run_benchmarks.py
@@ -0,0 +1,103 @@
#!/bin/sh
"exec" "python3" "-u" "-B" "$0" "$@"

# Copyright (c) 2019-2024, Lawrence Livermore National Security, LLC and
# other Serac Project Developers. See the top-level LICENSE file for details.
#
# SPDX-License-Identifier: (BSD-3-Clause)

"""
file: run_benchmarks.py

description:
Run benchmarks and update shared (or any desired) location with new Caliper files

"""

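from common_build_functions import *

from argparse import ArgumentParser

# Imported explicitly rather than relying on the star import above
import glob
import os
import shutil
import sys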


def parse_args():
    "Parses args from command line"
    parser = ArgumentParser()
    parser.add_argument("-e", "--extra-cmake-options",
                        dest="extra_cmake_options",
                        default=os.environ.get("EXTRA_CMAKE_OPTIONS", ""),
                        help="Extra cmake options to add to the cmake configure line. Note that options to enable benchmarks, " +
                             "disable docs, and build as Release are always appended.")
    parser.add_argument("-hc", "--host-config",
                        dest="host_config",
                        default=os.environ.get("HOST_CONFIG", None),
                        help="Specific host-config filename to build (defaults to HOST_CONFIG environment variable)")
    parser.add_argument("-sd", "--spot-directory",
                        dest="spot_dir",
                        default=get_shared_spot_dir(),
                        help="Where to put all resulting caliper files to use for SPOT analysis (defaults to a shared location)")
    parser.add_argument("-t", "--timestamp",
                        dest="timestamp",
                        default=get_timestamp(),
                        help="Set timestamp manually for debugging")

    # Parse args
    args, extra_args = parser.parse_known_args()
    args = vars(args)

    # Verify args
    if args["host_config"] is None:
        print("[ERROR: Both host_config argument and HOST_CONFIG environment variable unset!]")
        sys.exit(1)

    return args


def main():
    # Args
    args = parse_args()
    cmake_options = args["extra_cmake_options"] + " -DENABLE_BENCHMARKS=ON -DENABLE_DOCS=OFF -DCMAKE_BUILD_TYPE=Release"
    host_config = args["host_config"]
    spot_dir = args["spot_dir"]
    timestamp = args["timestamp"]

    # Vars
    repo_dir = get_repo_dir()
    test_root = get_build_and_test_root(repo_dir, timestamp)
    host_config_path = get_host_config_path(repo_dir, host_config)
    host_config_root = get_host_config_root(host_config)
    benchmarks_output_file = os.path.join(test_root, "output.log.%s.benchmarks.txt" % host_config_root)
    build_dir = os.path.join(test_root, "build-%s" % host_config_root)

    # Build Serac
    os.chdir(repo_dir)
    os.makedirs(test_root, exist_ok=True)
    build_and_test_host_config(test_root=test_root, host_config=host_config_path,
                               report_to_stdout=True, extra_cmake_options=cmake_options,
                               skip_install=True, skip_tests=True)

    # Go to build location
    os.chdir(build_dir)

    # Run benchmarks
    result = shell_exec("make run_benchmarks", echo=True, print_output=True, output_file=benchmarks_output_file)

    # Copy resulting .cali files to specified directory
    os.makedirs(spot_dir, exist_ok=True)
    cali_files = glob.glob(pjoin(build_dir, "*.cali"))
    for cali_file in cali_files:
        if os.path.exists(cali_file):
            shutil.copy2(cali_file, spot_dir)

    # Print SPOT url
    if on_rz():
        print("[View SPOT directory here: https://rzlc.llnl.gov/spot2/?sf={0}]".format(spot_dir))
    else:
        print("[View SPOT directory here: https://lc.llnl.gov/spot2/?sf={0}]".format(spot_dir))

    return result


if __name__ == "__main__":
    sys.exit(main())
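
The last step of `main()` publishes the Caliper output by copying every `*.cali` file from the build directory into the SPOT directory. That step can be sketched in isolation as below, assuming only the standard library. `copy_cali_files` is a hypothetical helper and the file names in the demo are invented; the real script does the same work inline.

```python
import glob
import os
import shutil
import tempfile


def copy_cali_files(build_dir, spot_dir):
    """Copy all *.cali files from build_dir into spot_dir; return the copied names."""
    os.makedirs(spot_dir, exist_ok=True)
    copied = []
    for cali_file in sorted(glob.glob(os.path.join(build_dir, "*.cali"))):
        # copy2 keeps file metadata such as the modification time
        shutil.copy2(cali_file, spot_dir)
        copied.append(os.path.basename(cali_file))
    return copied


# Demo with temporary directories standing in for the build and SPOT locations.
with tempfile.TemporaryDirectory() as tmp:
    build_dir = os.path.join(tmp, "build")
    spot_dir = os.path.join(tmp, "spot")
    os.makedirs(build_dir)
    for name in ("a.cali", "b.cali", "notes.txt"):
        open(os.path.join(build_dir, name), "w").close()

    copied = copy_cali_files(build_dir, spot_dir)
    spot_contents = sorted(os.listdir(spot_dir))

print(copied)
print(spot_contents)
```

Files are copied rather than moved, so the `.cali` files remain in the build directory, where the CI job's `artifacts: paths:` patterns also pick them up.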
2 changes: 2 additions & 0 deletions src/docs/sphinx/dev_guide/profiling.rst
@@ -136,6 +136,8 @@ files:
- `SPOT CZ <https://lc.llnl.gov/spot2>`_
- `SPOT RZ <https://rzlc.llnl.gov/spot2>`_

The shared Caliper files for Serac are located here: https://lc.llnl.gov/spot2/?sf=/usr/WS2/smithdev/califiles

.. note::
   There is a bug in SPOT: Caliper files that have been removed from a directory still show up on SPOT if you
   have visualized them previously. The current workaround is to clear the ``llnl.gov`` site cache manually.
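
The SPOT links in this PR all follow one pattern: the SPOT host for the zone (CZ or RZ) followed by an `sf` query parameter holding the Caliper directory. A small sketch of that pattern is shown below; `spot_url` is a hypothetical helper, and the `on_rz` flag stands in for the `on_rz()` check used by `run_benchmarks.py`.

```python
def spot_url(spot_dir, on_rz=False):
    """Build the SPOT link for a directory of Caliper files (hypothetical helper)."""
    host = "rzlc.llnl.gov" if on_rz else "lc.llnl.gov"
    return "https://{0}/spot2/?sf={1}".format(host, spot_dir)


# The shared Serac directory mentioned in the docs, on CZ and on RZ.
print(spot_url("/usr/WS2/smithdev/califiles"))
print(spot_url("/usr/WS2/smithdev/califiles", on_rz=True))
```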