-
Notifications
You must be signed in to change notification settings - Fork 572
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Trilinos: Fix/Update build stats tools #8638
Trilinos: Fix/Update build stats tools #8638
Conversation
@jjellio, thanks for adding this PR! I will review and test on a few systems. I will then enable this in the ATDM Trilinos builds and the PR builds (for this PR) to ensure it works everywhere (which is the initial goal).
We can address that in a follow-up PR. We need to install different wrappers because they need to point to the installed
If we make it only Python 3, then there are some Trilinos PR builds that we can't enable the build wrappers since they use Python 2.7. If you support Python 2.7 and 3.x, then if you always look for 'python' that will either by the default Python 2.7 on RHEL7-based systems or will be some version of Python 3. Either way, that is fine if you write your Python code to work with Python 2.7 and Python 3.x. I will take a look.
That is easy to checks for by looking at the var
Would not be hard to manually determine that. We could, for example, set this to a dummy program and then grep the I will post my initial review next. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jjellio, from my prior extensive portability testing of this as part of PR #7508, I am concerned this will not work in all of those builds due to:
- Moving
usr_bin_time_csv_map
back inside ofclass WrapperOpTimer:
- Changing from
#!/usr/bin/env python
to#!/usr/bin/env python3
Therefore, before this gets merged, we diffidently need to:
- Test this in all of the Trilinos PR builds
- Test this in all of the ATDM Trilinos builds
The former is easy. I will just turn on these build-stats wrappers in all of the Trilinos PR builds on this branch and push.
The latter is more difficult since we don't want to break the ATDM Trilinos builds (even for a day). So that will require going to each major platform and manually running all of the builds and posting to CDash.
Also, I am concerned about the warning message about overwriting existing *.timing
files since this will be common if doing a rebuild (unless all of the *.timing
files get wiped out when configuring with -D Trilinos_REMOVE_BUILD_STATS_TIMING_FILES_ON_FRESH_CONFIGURE=ON
). Can we at least make that an optional debug-level warning that is off by default? (We could enable it if the env var MAGIC_WRAPPER_WARN_ON_OVERWRITING_STATS_FILE=1
is set in the user's env?) Otherwise, developers are going to see a ton of these warnings when doing rebuilds in their local build directory and they are going to hate us.
I will create a new branch and PR against this PR for my suggested changes for all of the above for you @jjellio to review and then merge into this branch and PR.
But first I will enable PR testing of what is here so we can see what happens in the PR builds.
# Generate the build stats compiler wrapper for a ar/ld/ranlib | ||
# | ||
# the only difference between this and lang, is setting the proper | ||
# Cmake variable name, e.g., CMAKE_LANG_COMPILER vs CMAKE_OP | ||
# we can resolve this by taking in name of the variable to set | ||
function(generate_build_stats_wrapper_for_op op_name variable_to_set) | ||
|
||
string(TOLOWER "${op_name}" op_lc) | ||
set(compiler_wrapper | ||
"${${PROJECT_NAME}_BINARY_DIR}/build_stat_${op_lc}_wrapper.sh") | ||
|
||
# Override the compiler with the wrapper but remember the original compiler | ||
if ("${${variable_to_set}_ORIG}" STREQUAL "") | ||
set(${variable_to_set}_ORIG "${op_lc}" | ||
CACHE FILEPATH "Original non-wrappeed ${op_name}" FORCE ) | ||
set(${variable_to_set} "${compiler_wrapper}" | ||
CACHE FILEPATH "Overwritten build stats ${op_name} compiler wrapper" FORCE ) | ||
endif() | ||
|
||
message("-- " "Generate build stats compiler wrapper for ${op_name}") | ||
set(BUILD_STAT_COMPILER_WRAPPER_INNER_COMPILER "${${variable_to_set}_ORIG}") | ||
configure_file("${BUILD_STATS_SRC_DIR}/build_stat_lang_wrapper.sh.in" | ||
"${compiler_wrapper}" @ONLY) | ||
print_var(${variable_to_set}) | ||
|
||
# Use the orginal compiler for the installed <XXX>Config.cmake files | ||
# doubt this works w/AR/LD/RANLIB | ||
set(${variable_to_set}_COMPILER_FOR_CONFIG_FILE_INSTALL_DIR | ||
"${${variable_to_set}_ORIG}" CACHE INTERNAL "") | ||
print_var(${variable_to_set}_COMPILER_FOR_CONFIG_FILE_INSTALL_DIR) | ||
|
||
endfunction() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should definently try to eliminate this duplication. I will take a crack at that and push new commits to this PR branch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(edit - this is bonkers - it put my reply in this comment for no good reason)
@@ -57,6 +93,7 @@ macro(get_base_build_dir_for_python) | |||
WORKING_DIRECTORY ${PROJECT_BINARY_DIR} | |||
OUTPUT_VARIABLE BASE_BUILD_DIR_FOR_PYTHON | |||
OUTPUT_STRIP_TRAILING_WHITESPACE) | |||
print_var(BASE_BUILD_DIR_FOR_PYTHON) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should remove this print statement or at least comment this out since I guess this was just for local debugging?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ugh, yes. That popped up due to python not being found
CACHE FILEPATH "Original non-wrappeed ${op_name}" FORCE ) | ||
set(CMAKE_${op_name} "${compiler_wrapper}" | ||
set(${variable_to_set} "${compiler_wrapper}" | ||
CACHE FILEPATH "Overwritten build stats ${op_name} compiler wrapper" FORCE ) | ||
endif() | ||
|
||
message("-- " "Generate build stats compiler wrapper for ${op_name}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the wrong message if this is not wrapping the compiler. Instead, we should just take "compiler" out of this and other such comments and printed messages.
CACHE FILEPATH "Overwritten build stats ${op_name} compiler wrapper" FORCE ) | ||
endif() | ||
|
||
message("-- " "Generate build stats compiler wrapper for ${op_name}") | ||
set(BUILD_STAT_COMPILER_WRAPPER_INNER_COMPILER "${CMAKE_${op_name}_ORIG}") | ||
set(BUILD_STAT_COMPILER_WRAPPER_INNER_COMPILER "${${variable_to_set}_ORIG}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And COMPILER
needs to be replaced with OPT
since this is not always a compiler now.
generate_build_stats_wrapper_for_op(LD "CMAKE_LD") | ||
generate_build_stats_wrapper_for_op(AR "CMAKE_AR") | ||
generate_build_stats_wrapper_for_op(RANLIB "CMAKE_RANLIB") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you don't really need quotes around the second args.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When there are no spaces, <something>
is identical to `"" when passing in arguments to a CMake command. Putting quotes around things only changes the meaning in an if() statements in certain cases (i.e. if a string is interpreted as a literal vs. the name of a variable). Except for if() statements (and perhaps in a few other special cases), everything is a string in CMake and if there are no spaces, then there is no need for surrounding quotes.
@jjellio, I am testing this locally and it looks like it is broken. I just got:
I will fix and push to your branch. |
Status Flag 'Pull Request AutoTester' - User Requested Retest - Label AT: RETEST will be reset after testing. |
Status Flag 'Pre-Test Inspection' - Auto Inspected - Inspection Is Not Necessary for this Pull Request. |
Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects: Pull Request Auto Testing STARTING (click to expand)Build InformationTest Name: Trilinos_pullrequest_gcc_8.3.0
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_gcc_7.2.0_serial
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_gcc_7.2.0_debug
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_intel_17.0.1
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_cuda_10.1.105
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_clang_10.0.0
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_python_3
Jenkins Parameters
Using Repos:
Pull Request Author: jjellio |
@jjellio, note that we are already seeing errors in the Python
This might be due to the this script using |
Some comments on Python. I had an Ubuntu machine that I worked on in December and a different one in January. In that time, they removed the symlink from Do you have a better than idea than 2 versions of the scripts? Ideally, the 'top level' variable definitions drive the logic (e.g., regex patterns, dicts) - if that got moved to a universal python file (or properties file), then maybe only the logic would get replicated (and changed). |
Status Flag 'Pull Request AutoTester' - Jenkins Testing: 1 or more Jobs FAILED Note: Testing will normally be attempted again in approx. 2 Hrs 30 Mins. If a change to the PR source branch occurs, the testing will be attempted again on next available autotester run. Pull Request Auto Testing has FAILED (click to expand)Build InformationTest Name: Trilinos_pullrequest_gcc_8.3.0
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_gcc_7.2.0_serial
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_gcc_7.2.0_debug
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_intel_17.0.1
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_cuda_10.1.105
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_clang_10.0.0
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_python_3
Jenkins Parameters
Console Output (last 100 lines) : Trilinos_pullrequest_gcc_8.3.0 # 3350 (click to expand)
Console Output (last 100 lines) : Trilinos_pullrequest_gcc_7.2.0_serial # 974 (click to expand)
Console Output (last 100 lines) : Trilinos_pullrequest_gcc_7.2.0_debug # 1468 (click to expand)
Console Output (last 100 lines) : Trilinos_pullrequest_intel_17.0.1 # 8821 (click to expand)
Console Output (last 100 lines) : Trilinos_pullrequest_cuda_10.1.105 # 297 (click to expand)
Console Output (last 100 lines) : Trilinos_pullrequest_clang_10.0.0 # 1663 (click to expand)
Console Output (last 100 lines) : Trilinos_pullrequest_python_3 # 4376 (click to expand)
|
So on your machine there is no In any case, you can fix this for your account by doing:
Then when you set up your env just add:
That is what I am doing to get a Does that work for you? |
Ross, I can make python found - the problem is there seems to be ambiguity as to whether 1) I don't really follow this kind of news, so I have no idea what the broader landscape will do. on SNL machines we can fix it, but those things are usually short term (as production machines outside our reach are added). I fixed all of this by setting
That opens the door to module problems (loaded X later used Y). I guess the wrapper could add another check
This is pretty obnoxious |
Status Flag 'Pre-Test Inspection' - Auto Inspected - Inspection Is Not Necessary for this Pull Request. |
Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects: Pull Request Auto Testing STARTING (click to expand)Build InformationTest Name: Trilinos_pullrequest_gcc_8.3.0
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_gcc_7.2.0_serial
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_gcc_7.2.0_debug
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_intel_17.0.1
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_cuda_10.1.105
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_clang_10.0.0
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_python_3
Jenkins Parameters
Using Repos:
Pull Request Author: jjellio |
As long as we are using RHEL7 at SNL we will always have a 'python' in the path. If we write our Python code to work with Python 2.7 or 3.x, we should be good to go (as long as the envs are not setting For now, let's just do what works on SNL systems and the people can hack their env to get it working on other systems (like creating a symlink Perhaps the best thing to do to make the build wrappers explicitly call Python as:
instead of as:
in build_stat_lang_wrapper.sh.in? I think that solves the problem. (No one should ever call |
Sorry, I closed this on accident. |
Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects: Pull Request Auto Testing STARTING (click to expand)Build InformationTest Name: Trilinos_pullrequest_gcc_8.3.0
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_gcc_7.2.0_serial
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_gcc_7.2.0_debug
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_intel_17.0.1
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_cuda_10.1.105
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_cuda_10.1.105_uvm_off
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_clang_10.0.0
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_python_3
Jenkins Parameters
Using Repos:
Pull Request Author: jjellio |
Status Flag 'Pull Request AutoTester' - Jenkins Testing: 1 or more Jobs FAILED Note: Testing will normally be attempted again in approx. 2 Hrs 30 Mins. If a change to the PR source branch occurs, the testing will be attempted again on next available autotester run. Pull Request Auto Testing has FAILED (click to expand)Build InformationTest Name: Trilinos_pullrequest_gcc_8.3.0
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_gcc_7.2.0_serial
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_gcc_7.2.0_debug
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_intel_17.0.1
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_cuda_10.1.105
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_cuda_10.1.105_uvm_off
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_clang_10.0.0
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_python_3
Jenkins Parameters
Console Output (last 100 lines) : Trilinos_pullrequest_gcc_8.3.0 # 4767 (click to expand)
Console Output (last 100 lines) : Trilinos_pullrequest_gcc_7.2.0_serial # 2297 (click to expand)
Console Output (last 100 lines) : Trilinos_pullrequest_gcc_7.2.0_debug # 2778 (click to expand)
Console Output (last 100 lines) : Trilinos_pullrequest_intel_17.0.1 # 10082 (click to expand)
Console Output (last 100 lines) : Trilinos_pullrequest_cuda_10.1.105 # 1494 (click to expand)
Console Output (last 100 lines) : Trilinos_pullrequest_cuda_10.1.105_uvm_off # 491 (click to expand)
Console Output (last 100 lines) : Trilinos_pullrequest_clang_10.0.0 # 2859 (click to expand)
Console Output (last 100 lines) : Trilinos_pullrequest_python_3 # 5427 (click to expand)
|
Status Flag 'Pull Request AutoTester' - User Requested Retest - Label AT: RETEST will be reset after testing. |
Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects: Pull Request Auto Testing STARTING (click to expand)Build InformationTest Name: Trilinos_pullrequest_gcc_8.3.0
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_gcc_7.2.0_serial
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_gcc_7.2.0_debug
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_intel_17.0.1
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_cuda_10.1.105
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_cuda_10.1.105_uvm_off
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_clang_10.0.0
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_python_3
Jenkins Parameters
Using Repos:
Pull Request Author: jjellio |
CC: @jjellio At the Trilinos Developers meeting last Tuesday on 6/15/2021, @jwillenbring seemed to give his approval for turning on these build stats wrappers for the PR builds. Therefore, I will leave the commit f7f0ff2 that enables the build stats wrappers in all of the PR builds and allow this PR to merge. And with PR testing finally working again, I think this thing will merge! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we are safe to merge and allow these build stats wrappers to run in all ATDM Trilinos and PR builds submitting data to CDash.
I approve!
Status Flag 'Pull Request AutoTester' - Jenkins Testing: 1 or more Jobs FAILED Note: Testing will normally be attempted again in approx. 2 Hrs 30 Mins. If a change to the PR source branch occurs, the testing will be attempted again on next available autotester run. Pull Request Auto Testing has FAILED (click to expand)Build InformationTest Name: Trilinos_pullrequest_gcc_8.3.0
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_gcc_7.2.0_serial
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_gcc_7.2.0_debug
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_intel_17.0.1
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_cuda_10.1.105
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_cuda_10.1.105_uvm_off
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_clang_10.0.0
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_python_3
Jenkins Parameters
Console Output (last 100 lines) : Trilinos_pullrequest_gcc_8.3.0 # 4784 (click to expand)
Console Output (last 100 lines) : Trilinos_pullrequest_gcc_7.2.0_serial # 2314 (click to expand)
Console Output (last 100 lines) : Trilinos_pullrequest_gcc_7.2.0_debug # 2795 (click to expand)
Console Output (last 100 lines) : Trilinos_pullrequest_intel_17.0.1 # 10092 (click to expand)
Console Output (last 100 lines) : Trilinos_pullrequest_cuda_10.1.105 # 1504 (click to expand)
Console Output (last 100 lines) : Trilinos_pullrequest_cuda_10.1.105_uvm_off # 501 (click to expand)
Console Output (last 100 lines) : Trilinos_pullrequest_clang_10.0.0 # 2869 (click to expand)
Console Output (last 100 lines) : Trilinos_pullrequest_python_3 # 5436 (click to expand)
|
Status Flag 'Pull Request AutoTester' - User Requested Retest - Label AT: RETEST will be reset after testing. |
Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects: Pull Request Auto Testing STARTING (click to expand)Build InformationTest Name: Trilinos_pullrequest_gcc_8.3.0
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_gcc_7.2.0_serial
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_gcc_7.2.0_debug
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_intel_17.0.1
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_cuda_10.1.105
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_cuda_10.1.105_uvm_off
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_clang_10.0.0
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_python_3
Jenkins Parameters
Using Repos:
Pull Request Author: jjellio |
Status Flag 'Pull Request AutoTester' - Jenkins Testing: all Jobs PASSED Pull Request Auto Testing has PASSED (click to expand)Build InformationTest Name: Trilinos_pullrequest_gcc_8.3.0
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_gcc_7.2.0_serial
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_gcc_7.2.0_debug
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_intel_17.0.1
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_cuda_10.1.105
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_cuda_10.1.105_uvm_off
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_clang_10.0.0
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_python_3
Jenkins Parameters
|
Status Flag 'Pre-Merge Inspection' - SUCCESS: The last commit to this Pull Request has been INSPECTED AND APPROVED by [ bartlettroscoe ]! |
Status Flag 'Pull Request AutoTester' - Pull Request will be Automerged |
Merge on Pull Request# 8638: IS A SUCCESS - Pull Request successfully merged |
Addresses more of #7376
This patch is rather monolithic, but it isn't easy to decouple things.
The general fixes are:
The cmake changes have some redundant code - the function to generate a 'lang' is nearly identical to what I wrote for 'ar' - I generalized it to the point that we could use the more generic form I have. It effectively takes in an extra string variable to know which
CMAKE_
variable to set.I've tested this with a simple Epetra build that builds some Fortran code.
Issues:
On PopOS, cmake fails to find the python executable, which then leaves the base
BUILD_DIR
undefined at wrapper creation time. This seems to be a PopOS/Cmake problem, so for testing I set-DPYTHON_EXECUTABLE=$(which python3)
The cmake code I added does not install the ar/ld/ranlib wrappers - this should be a simple change in cmake, but testing w/installations should probably be a separate thing.
Trilinos (and this code) will probably have to commit to Python3 (and forget Python2). There could be issues in the future w/ writing the proper 'shebang' for running the python wrappers - they may need to have CMake output PYTHON_EXECUTABLE as input into a configure file so the shebang is correct on various platforms. (and so we know python3 is used)
Generating AR/RANLIB wrappers should be skipped if the build is Shared, but it is should be benign to set them anyway.
Not sure if CMAKE_LD is ever used (by us), but I did generate a wrapper for it.
This code was developed on a recent pull from Develop w/GCC 10 (not tested w/intel)
@bartlettroscoe
Tasks:
build_stats.csv
file due to different fields (or ordering of fields) in the*.timing
files (see here).gather_build_stats.py -d <dir-base> -o <build-stats-csv>
tool to allow*.timing
files with different sets of headers (in different orders) and remove the./
at the beginning of the paths.[ ] Disable build stats wrappers in all of the PR builds on this branch.... @jwillenbring said it was okay to enable these in the PR builds for now