-
Notifications
You must be signed in to change notification settings - Fork 197
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
build.sh: Add option to log nvcc compile times #1262
build.sh: Add option to log nvcc compile times #1262
Conversation
An example of what the output looks like on PR #1254 : |
Okay.. This is causing segfaults in CI.. That is a disappointment.. |
In addition to the segfault, we have other challenges right now trying to extract the ninja_logs from the build for each package in CI (conda build wants to create its own working directory and then copy to a final directory when done, which overwrites the existing files each time). Maybe for now we could add the --time flag as an option in build.sh (I'm thinking maybe build.sh -t to enable it? What do you think?). I'd also like to start capturing these scripts in the codebase. Maybe we could create a new directory inside cpp/scripts specifically for these build analysis scripts? Something like cpp/scripts/stats? (I'm terrible at naming sometimes) |
Since we are specifying the same file for each compiler invocation we are hitting parallel clobber issues due to multiple compiler instances writing to the file at the same time. If you need a high level overview of what compile jobs have taken so long, and not where in nvcc it has occured. You can scrape the ninja build database for that information. libcudf is doing this already: https://github.com/rapidsai/cudf/blob/branch-23.04/build.sh#L303 |
I thought that could be an issue as well, but locally (12 cores) this has not been a problem. To rule it out, is there a way to instruct CMake to make the flag dependent on the output file, e.g., to get |
There is no easy way to do this. CMake doesn't offer any way to get the object location on a per source basis. You would need to do something like this: get_target_property(sources <target_name> SOURCES)
foreach(source IN LISTS sources)
cmake_path(GET source FILENAME source_name)
set_source_files_properties(${source} PROPERTIES COMPILE_FLAGS "--time=${source_name}.csv")
endforeach() |
@robertmaynard Thanks for the snippet! It works locally. Since the |
d2becd2
to
97d5f9e
Compare
@ahendriksen while it may seem cumbersome, I’ve been thinking about @robertmaynard’s previous suggestion and I actually think it could be useful to dump individual csv files for each of the source files that get compiled. Though, I wonder, should we stick them in a common directory (maybe something like cpp/build/cuda_time_logs?) I still like the idea of keeping it as an option in build.sh (which ultimately turns on an option in cmake) and I can’t imagine this would add all that much overhead to the build. What we can do is have the script read through all the csv files in the whatever we decide to name that directory. what do you think? |
@cjnolet, One thing that may not be immediately obvious from the suggestion of @robertmaynard is that:
In summary, with the change you are suggesting, there would still be some translation units that result in writes to the same csv file and we would have to collect all csv files in a directory (as you suggest) to reduce clutter. As such, I think the current design is a local optimum. |
Those are both solvable by computing the relative path of the source file from the root of the project and encoding it into the file name / path. |
I thought this was going to be an easy change. The scope is slowly getting out of hand. Maybe we can close the PR and leave it as is for documentation purposes. What do you think @cjnolet? |
@ahendriksen I'll leave that up to you. I do think it's a very useful change and it would really help debugging compile times. I agree with @robertmaynard, though, that we should probably capture these timings in individual files. It doesn't seem too hard to accomplish this by taking the path for each file and replacing the / for _. Maybe I'm oversimplifying this though? @robertmaynard is there an easy way to get a hold of the relative path (from the repo root) for each source file? @ahendriksen I would honestly be happy even if we just dumped the renamed files to a single directory for now. |
@cjnolet Here is how you can get the relative path for each source file and convert them into allowable file names: get_target_property(sources <TARGET> SOURCES)
foreach(source IN LISTS sources)
cmake_path(IS_ABSOLUTE source is_abs)
if(is_abs)
cmake_path(RELATIVE_PATH source BASE_DIRECTORY ${PROJECT_SOURCE_DIR}) # convert to relative path if not already one
endif()
string(MAKE_C_IDENTIFIER "${source}" filename) #convert to valid filename
set_source_files_properties(${source} PROPERTIES COMPILE_FLAGS "--time=${filename}.csv")
endforeach() |
This will create a csv file in the cpp/build directory that records the compilation of each translation unit and how long each phase of the nvcc compilation took. There does not seem to be a downside to enabling this, and it will be very helpful to diagnose build issues. To analyze the file, the following python code will help. It requires pandas, matplotlib, and seaborn: ------------------------------------------------------------ import pandas as pd import numpy as np import matplotlib.pyplot as plt import seaborn as sns from pathlib import Path from matplotlib import colors df = pd.read_csv("./nvcc_compile_log.csv") df = df.rename(columns=str.strip) df["seconds"] = df["metric"] / 1000 df["file"] = df["source file name"] df["phase"] = df["phase name"].str.strip() def categorize_time(s): if s < 60: return "less than a minute" else: return "more than a minute" dfp = df.query("phase!='nvcc (driver)'").pivot("file", values="seconds", columns="phase") dfp_sum = dfp.sum(axis="columns") df_fraction = dfp.divide(dfp_sum, axis="index") df_fraction["total time"] = dfp_sum df_fraction = df_fraction.melt(ignore_index=False, id_vars="total time", var_name="phase", value_name="fraction") dfp["total time"] = dfp_sum df_absolute = dfp.melt(ignore_index=False, id_vars="total time", var_name="phase", value_name="seconds") df_fraction["time category"] = dfp["total time"].apply(categorize_time) df_absolute["time category"] = dfp["total time"].apply(categorize_time) palette = { "gcc (preprocessing 4)": colors.hsv_to_rgb((0, 1, 1)), 'cudafe++': colors.hsv_to_rgb((0, 1, .75)), 'gcc (compiling)': colors.hsv_to_rgb((0, 1, .4)), "gcc (preprocessing 1)": colors.hsv_to_rgb((.33, 1, 1)), 'cicc': colors.hsv_to_rgb((.33, 1, 0.75)), 'ptxas': colors.hsv_to_rgb((.33, 1, 0.4)), 'fatbinary': "grey", } sns.displot( df_absolute.sort_values("total time"), y="file", hue="phase", hue_order=reversed(["gcc (preprocessing 4)", 'cudafe++', 'gcc (compiling)', "gcc (preprocessing 1)", 'cicc', 'ptxas', 'fatbinary', ]), palette=palette, weights="seconds", multiple="stack", kind="hist", height=20, ) plt.xlabel("seconds"); plt.savefig('absolute_compile_times.png') sns.displot( df_fraction.sort_values('total time'), y="file", hue="phase", hue_order=reversed(["gcc (preprocessing 4)", 'cudafe++', 'gcc (compiling)', "gcc (preprocessing 1)", 'cicc', 'ptxas', 'fatbinary', ]), palette=palette, weights="fraction", multiple="stack", kind="hist", height=15, ) plt.xlabel("fraction"); plt.savefix("relative_compile_times.png")
Hopefully this prevents segfaults in CI.
The time option is disabled by default. When enabled, writes a log of compilation times to cpp/build/nvcc_compile_log.csv. This is not supported in CI, as it leads to seg faults.
f0d31b0
to
2aec639
Compare
Perhaps the file system outside the CMakeFiles/ directory is not writable.
I think it might be more difficult than we thought to get this working in CI. I have tried the suggestion writing to separate csv files, but the compilation continues to segfault in CI. A formatted example from the logs:
I thought it could have to do with long file names that commonly occur in conda builds, but the paths passed to nvcc are all reasonably short. If this cannot be made to work in CI, then maybe we should just document that |
@robertmaynard do you have any other ideas as to why this might still be causing a segfault in CI. Im actually a little surprised myself that it's still doing failing. |
@ahendriksen Im actually on board w/ keeping this feature as-is for now (outputting to individual files that we can analyze separately) and just not having it run in CI. At least the feature exists that we can run locally to profile the build and we can turn it back on in CI once we figure out what's causing the segault. What do you think? I'd love to have this in 23.04. |
That sounds good! I would prefer to revert the scattering of the output files though, as it does not fix CI. Previously, all the information was be centralized in a single csv file. This makes analysis way easier.
…________________________________
From: Corey J. Nolet ***@***.***>
Sent: Tuesday, March 28, 2023 6:19:42 PM
To: rapidsai/raft ***@***.***>
Cc: Allard Hendriksen ***@***.***>; Mention ***@***.***>
Subject: Re: [rapidsai/raft] build.sh: Add option to log nvcc compile times (PR #1262)
@ahendriksen<https://github.com/ahendriksen> Im actually on board w/ keeping this feature as-is for now (outputting to individual files that we can analyze separately) and just not having it run in CI. At least the feature exists that we can run locally to profile the build and we can turn it back on in CI once we figure out what's causing the segault. What do you think? I'd love to have this in 23.04.
—
Reply to this email directly, view it on GitHub<#1262 (comment)>, or unsubscribe<https://github.com/notifications/unsubscribe-auth/AA72YFUZOU5CBO7Z5QOSBELW6MFR5ANCNFSM6AAAAAAUWSRJ5I>.
You are receiving this because you were mentioned.Message ID: ***@***.***>
-----------------------------------------------------------------------------------
NVIDIA GmbH
Wuerselen
Amtsgericht Aachen
HRB 8361
Managing Directors: Michael Ching, Donald Robertson, Rebecca Peters and Ludwig von Reiche
-----------------------------------------------------------------------------------
This email message is for the sole use of the intended recipient(s) and may contain
confidential information. Any unauthorized review, use, disclosure or distribution
is prohibited. If you are not the intended recipient, please contact the sender by
reply email and destroy all copies of the original message.
-----------------------------------------------------------------------------------
|
@ahendriksen are you sure all the data is being written to the file in that case? I wouldn't normally expect multiple threads / processes to be able to independently write to the same file without some sort of locking mechanism. @robertmaynard do you know if this is the case with nvcc? |
@cjnolet : I haven't read the nvcc source code if that is what you mean. I just tested with |
I agree with @ahendriksen in testing this nvcc feature does seem to be multi process safe. It most likely does come at some performance cost as each processes takes an exclusive lock on the file. My only theory on why we are seeing segfaults in CI is that this tracking does incur some memory overhead which is causing a OOM segfault. |
@ahendriksen @robertmaynard sounds good to me. If we aren't losing any information, I do agree that it's easier for analysis just to keep everything in the same file. |
/merge |
Add
--time
option tobuild.sh
that enables compile time logging ofnvcc
.Also, add a script
cpp/scripts/analyze_nvcc_log.py
to find the translation units that take the longest time.Output looks like: