[Speed up compiling]: reduce the NVCC compiling (some .cu operators can be compiled by G++) #5491
The experiment seems great!
How can we tell which ops should be compiled by G++ and which by NVCC? Is it the .cu files that don't contain CUDA keywords?
The compile times above seem to be in Debug mode. Do you have compile times in Release mode?
@chengduoZH The compiling uses the default flags.
Even though CUDA keywords such as __device__, __global__, and dim3 may not appear in the .cu file itself, they could be used in the header files from Eigen. It seems that many of our operators are in this category (e.g., elementwise_mul_op.cu). Is there any easier way to figure this out?
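As a rough first pass, one could grep each .cu source for CUDA-specific tokens; this is only a heuristic sketch (the keyword list is an assumption), and as noted above it cannot see tokens pulled in from included headers such as Eigen's:

```shell
# classify_cu FILE -> prints "nvcc" or "g++" depending on whether the
# source text itself contains CUDA-specific tokens.
# Heuristic only: tokens that come from included headers (e.g. Eigen's)
# are invisible to this check, which is exactly the problem above.
classify_cu() {
    keywords='__global__|__device__|__shared__|<<<|cudaMalloc|dim3'
    if grep -Eq "$keywords" "$1"; then
        echo "nvcc"
    else
        echo "g++"
    fi
}
```

A file flagged "g++" here is only a candidate; it would still need a trial compile with G++ to confirm.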
Compiling time comparison between NVCC and G++

Conclusion: the more gencode targets (sm_xx) that are specified, the slower NVCC compiles.

Experiment 1: elementwise_mul_op, this op uses Eigen to compute.
(Timing table not recovered: G++ vs. NVCC, with gencode: only sm_35.)
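For context, each -gencode pair passed to NVCC adds another full device-code compilation pass, which is why restricting the build to a single architecture (as in the "only sm_35" setting above) compiles faster. The exact architecture list below is illustrative, not taken from Paddle's build:

```shell
# One target: fastest NVCC compile (matches "gencode: only sm_35").
nvcc -c elementwise_mul_op.cu \
     -gencode arch=compute_35,code=sm_35

# Several targets: each -gencode adds a device compilation pass,
# so compile time grows roughly with the number of architectures.
nvcc -c elementwise_mul_op.cu \
     -gencode arch=compute_35,code=sm_35 \
     -gencode arch=compute_52,code=sm_52 \
     -gencode arch=compute_61,code=sm_61
```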
Experiment 2: mul_op, this op uses math::matmul to compute.

The .cu operators which can be compiled by G++
The following .cu operators can be compiled by G++, since their dependent CUDA kernels have already been compiled into the math libraries (the paddle/operator/math/ files). The cuDNN operators can also be compiled by G++. However, having different compiling rules for different operators is a little confusing for developers.
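One way to route such a .cu file to the host compiler is to override its source language in CMake; this is a sketch of the general CMake mechanism, not necessarily what Paddle's build rules actually do:

```cmake
# Treat this .cu file as ordinary C++ so it is built by g++ instead of
# nvcc; its CUDA kernels already live in the prebuilt math library, so
# the translation unit itself contains no device code.
set_source_files_properties(elementwise_mul_op.cu
                            PROPERTIES LANGUAGE CXX)
# g++ does not infer C++ from the .cu extension, so force it:
set_source_files_properties(elementwise_mul_op.cu
                            PROPERTIES COMPILE_FLAGS "-x c++")
```

This keeps a single source file per operator while letting the build system decide, per file, which compiler to invoke.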
Also related to #5413