Compile times are growing significantly #581
Comments
Do the clang tools work with nvcc? Much of our template time is in nvcc. You can use |
I'm not sure. You can technically get Clang to compile device code, so that may be a path worth exploring using Clang + these tools. |
Any updates on this? I'd love to use precompiled headers with CUDA projects. |
Compile time continues to grow, but that is largely because our supported feature set and supported types continue to grow. In 0.15 we are aiming to add at least 10 new types (4 unsigned int types, 4 timestamp types, list column type, decimal fixed-point type). Naturally this will increase compile time and binary size. Meanwhile, in 0.14 we dropped all of the legacy APIs that were previously deprecated, which reduced compile time a bit, and significantly reduced binary size. There have been and will continue to be various efforts to reduce compile time of certain components. We are investigating possibly splitting libcudf into multiple libraries. We have not discussed precompiled headers. |
@jrhemstad @harrism, is this still a relevant issue? |
Our compile time is worse than ever, so I guess it's still relevant. We could benefit from someone putting in a concerted effort to eliminate unnecessary includes across the library. |
Out of curiosity I gave this a quick shot. (Unsurprisingly) clang does not currently support the experimental CUDA features that we have enabled ( |
Pretty sure clang supports those features natively without the need for any extra compile flags. I'm guessing the error was caused by clang not recognizing those flags. |
You're right, it does. I removed those and made some progress, but not nearly enough for a working build with clang yet. Here's a list of necessary changes so far:
At this point I start seeing failures like this:
and
I need to track this down a bit further, but it looks like some aspect of how thrust SFINAEs different code paths isn't supported in clang device code yet either. |
I tried to build cuco with clang about a year ago and was blocked by its dependencies like thrust or libcudacxx that cannot be built with clang. To find how much effort is required to build device code with clang, I would suggest starting with a smaller library like cuco and see how it goes from there. Related issues: |
Well then... looks like we've got to work our way all the way up the stack for this. For the purpose of something like clang-tidy we might be able to get some partial results based on the discussion in rapidsai/raft#424, but that's probably only partial support at best and I don't know if that will work with the other tools of interest like IWYU. |
Compile times are an ever-present problem for us. This issue as currently framed isn't clearly actionable, so let's lay out some concrete points.
#379 implemented the
Since #9631 we have been tracking build times in CI. We monitor this and keep an eye on TUs that are particularly slow to compile. Where necessary, we have reacted to slow compilation by reorganizing the code and explicitly instantiating templates.
This seems like the main action item remaining. As discussed above, |
Feature request
As anyone who has built libgdf recently has surely noticed, the time to compile the library from scratch has grown significantly in the last few months. For example, compiling on all 10 cores of my i9-7900X @ 3.30GHz takes 11 minutes as reported by `time make -j`.
Compiling with `-ftime-report` may be a good place to start to see where all the compilation time is being spent.
This is likely due to the large amount of template instantiation required to instantiate functions for all possible types. We should make sure that best practices are being followed so that each template is instantiated only once for a given type, via explicit instantiation.
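To make the explicit-instantiation idea concrete, here is a minimal sketch. `sum_column` is a hypothetical function template invented for illustration, not something from libgdf, and the header/source split is only indicated in comments so the snippet stays self-contained in one file.

```cpp
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical function template standing in for a type-dispatched routine.
template <typename T>
T sum_column(std::vector<T> const& values)
{
  return std::accumulate(values.begin(), values.end(), T{});
}

// Would live in the header: explicit instantiation declarations tell every
// including translation unit not to instantiate these specializations itself.
extern template std::int32_t sum_column<std::int32_t>(std::vector<std::int32_t> const&);
extern template double sum_column<double>(std::vector<double> const&);

// Would live in exactly one source file: explicit instantiation definitions,
// so each specialization is compiled once and linked everywhere else.
template std::int32_t sum_column<std::int32_t>(std::vector<std::int32_t> const&);
template double sum_column<double>(std::vector<double> const&);
```

With this layout, callers of `sum_column<double>` reuse the single compiled instantiation instead of re-instantiating the template in every file that includes the header. Whether this pays off for a given component would need to be measured.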
Much of our code is implemented in headers, which causes it to be recompiled everywhere that header is included. Using pre-compiled headers may help:
https://gcc.gnu.org/onlinedocs/gcc/Precompiled-Headers.html
http://itscompiling.eu/2017/01/12/precompiled-headers-cpp-compilation/
Furthermore, code should be scrubbed of excessive and unnecessary `#include`s. Compiling with `-MD` will show what files are being included.
Here's a Clang tool that ensures you only include the necessary headers: https://github.com/include-what-you-use/include-what-you-use
Here's a Clang tool to profile time spent in template instantiation: https://github.com/mikael-s-persson/templight
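As an illustration of the include cleanup suggested above, one common pattern is to replace an `#include` in a header with a forward declaration when only a pointer or reference to the type is needed. The sketch below uses hypothetical `Widget` and `Engine` classes, invented for this example, with the intended file split shown in comments.

```cpp
#include <memory>

// widget.hpp (hypothetical): a forward declaration replaces
// #include "engine.hpp", so users of Widget never parse Engine's header.
class Engine;

class Widget {
 public:
  Widget();
  ~Widget();  // defined where Engine is complete, as std::unique_ptr requires
  int run() const;

 private:
  std::unique_ptr<Engine> engine_;
};

// widget.cpp (hypothetical): only this file needs the full Engine definition.
class Engine {
 public:
  int compute() const { return 42; }
};

Widget::Widget() : engine_(std::make_unique<Engine>()) {}
Widget::~Widget() = default;
int Widget::run() const { return engine_->compute(); }
```

The trade-off is an extra indirection through the pointer, so this fits best where the included header is expensive to parse and the type is not used in hot paths.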