CUDA performance regression of n-body after LLVM15 upgrade #6957

Closed
turbo0628 opened this issue Dec 22, 2022 · 2 comments · Fixed by #7012
@turbo0628
Member

Charts:

Compare with previous releases
image

Perfmon records
image

@turbo0628
Member Author

turbo0628 commented Dec 29, 2022

After checking the generated NVPTX code, the most obvious difference is that the ftz (flush-to-zero) option is not taking effect under LLVM 15.

Do we need to change the options here?
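For anyone who wants to repeat the check on their own PTX dump, here is a minimal sketch that counts how many f32 arithmetic instructions carry the `.ftz` modifier. The helper name `ftz_coverage`, the opcode subset, and the sample PTX are all made up for illustration; this is not part of Taichi.

```python
import re

# Opcodes that accept an .ftz modifier on f32 operands (a subset, for illustration).
FTZ_CAPABLE = {"add", "sub", "mul", "fma", "div", "rcp", "sqrt", "rsqrt", "min", "max"}

# Optional predicate guard (e.g. "@%p1" or "@!%p1"), then a dotted opcode.
OPCODE = re.compile(r"^\s*(?:@!?%\w+\s+)?([a-z][a-z0-9]*(?:\.[a-z0-9]+)+)\s")


def ftz_coverage(ptx):
    """Return (with_ftz, without_ftz) counts of f32 arithmetic instructions."""
    with_ftz = without_ftz = 0
    for line in ptx.splitlines():
        match = OPCODE.match(line)
        if match is None:
            continue
        parts = match.group(1).split(".")
        # Only look at f32 instructions whose opcode can take .ftz at all.
        if parts[-1] != "f32" or parts[0] not in FTZ_CAPABLE:
            continue
        if "ftz" in parts:
            with_ftz += 1
        else:
            without_ftz += 1
    return with_ftz, without_ftz


# Hand-written sample, not taken from the real n-body kernel.
SAMPLE = """
    fma.rn.ftz.f32  %f4, %f1, %f2, %f3;
    add.f32         %f5, %f4, %f1;
    ld.global.f32   %f6, [%rd1];
    rsqrt.approx.ftz.f32 %f7, %f5;
"""

print(ftz_coverage(SAMPLE))
```

If ftz is being applied, the second number should be at or near zero for a kernel like n-body; a large second number suggests the option was dropped somewhere on the way to the backend.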

@turbo0628
Member Author

turbo0628 commented Dec 29, 2022

Also, rsqrt is now replaced by a slower sqrt + rn combo:

image

We need to fix this.

Update: The rsqrt implementation on our side has not changed. The problem is that LLVM 15 refuses to risk precision by emitting the rsqrt.approx instruction. Are we missing some LLVM math options?
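A quick way to see which lowering a given build produced is to tally the square-root-related opcodes in the PTX text. The sketch below is only an illustration: `sqrt_lowering` is a hypothetical helper, and the two PTX snippets are hand-written stand-ins for the fast and slow cases, not the actual output from either LLVM version.

```python
import re
from collections import Counter

# Dotted opcodes starting with rsqrt, sqrt, rcp or div, e.g. "rsqrt.approx.ftz.f32".
SQRT_RELATED = re.compile(r"\b((?:rsqrt|sqrt|rcp|div)(?:\.[a-z0-9]+)+)")


def sqrt_lowering(ptx):
    """Count each square-root-related opcode that appears in the PTX text."""
    return Counter(SQRT_RELATED.findall(ptx))


# Hand-written examples; the real kernels may differ.
FAST = "rsqrt.approx.ftz.f32 %f2, %f1;"
SLOW = "sqrt.rn.f32 %f2, %f1;\nrcp.rn.f32 %f3, %f2;"

print(sqrt_lowering(FAST))
print(sqrt_lowering(SLOW))
```

Comparing the tallies from the old and new toolchains on the same kernel makes it easy to confirm whether `rsqrt.approx` has disappeared in favor of the precise `.rn` forms.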

@feisuzhu feisuzhu moved this from Untriaged to In Progress in Taichi Lang Dec 30, 2022
bobcao3 pushed a commit that referenced this issue Dec 30, 2022
@github-project-automation github-project-automation bot moved this from In Progress to Done in Taichi Lang Dec 30, 2022
feisuzhu pushed a commit to feisuzhu/taichi that referenced this issue Jan 5, 2023
quadpixels pushed a commit to quadpixels/taichi that referenced this issue May 13, 2023