the performance of SparseMatrixBuilder on the CUDA arch is significantly lower than on the CPU arch #7080
Comments
@FantasyVR Yes, I solved this by building Taichi from source at exactly the commit of the PR you mentioned. First, I found that the released 1.3.0 version on PyPI (commit tag: rc-v1.3.0, commit id: 0f25b95) does not include the commit of the PR (commit id: 8413bc2):
So I tried building from source on the master branch, which contains the PR. But that build of Taichi could not run the script; the log shows:
I think there may be another bug, but I didn't dig into it. Then I checked out exactly commit 8413bc2 and built from source again, and this time it ran correctly:
And I have another question: does the offline cache really work? I reran the program several times and it costs:
the comparison:
offline_cache is set to True, and I noticed the description in the docs: "offline_cache: Enable/disable offline cache of the compiled kernels", but the values of the sparse matrix are set randomly after execution.
Hi @tmxklzp, if the |
@FantasyVR Okay, I got it. Thank you for helping me!
Describe the bug
The build method of SparseMatrixBuilder takes a long time on the CUDA arch, but performs well on the CPU arch.
To Reproduce
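A minimal sketch of how such a comparison could be timed, not the original script. The helper `time_phases`, the function `run_taichi`, the kernel `fill`, and the matrix size `n` are all hypothetical names chosen for illustration; the Taichi calls assume a version that provides `ti.linalg.SparseMatrixBuilder` and `ti.types.sparse_matrix_builder()`.

```python
import time


def time_phases(phases):
    """Run each (name, callable) pair in order and return {name: seconds}."""
    timings = {}
    for name, fn in phases:
        start = time.perf_counter()
        fn()
        timings[name] = time.perf_counter() - start
    return timings


def run_taichi(arch_name="cpu", n=1000):
    """Time filling and building a diagonal sparse matrix (hypothetical workload).

    Requires Taichi to be installed; arch_name is "cpu" or "cuda".
    """
    import taichi as ti  # imported lazily so time_phases works without Taichi

    ti.init(arch=getattr(ti, arch_name), offline_cache=True)
    builder = ti.linalg.SparseMatrixBuilder(n, n, max_num_triplets=n)

    @ti.kernel
    def fill(K: ti.types.sparse_matrix_builder()):
        # One triplet per row: a simple diagonal matrix.
        for i in range(n):
            K[i, i] += 2.0

    result = {}

    def do_fill():
        fill(builder)
        ti.sync()  # wait for pending device work before the timer stops

    def do_build():
        result["A"] = builder.build()
        ti.sync()

    # The first kernel call includes compilation time, so "fill" mixes
    # compile and run cost unless the offline cache already holds the kernel.
    return time_phases([("fill", do_fill), ("build", do_build)])
```

Calling `run_taichi("cpu")` and `run_taichi("cuda")` in separate processes would give per-phase timings for each arch, which separates the cost of `build()` from kernel compilation.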
Log/Screenshots
The full log of the program:
If I change arch = ti.cuda to arch = ti.cpu:
The build time on CUDA is significantly longer, and the solver compute time is also longer than on CPU: