[BLAS:: SYCL-BLAS backend] Enable sycl blas routines #277
Looks good to me, thank you!
Thanks @mkrainiuk.
Sorry for the late review. This looks fine to me. Can you test the newly supported APIs on NVIDIA hardware to make sure that configuration works before we merge?
No problem! Here are the test logs for the NVIDIA GPU: NVIDIA_GPU_test_results.txt. Thanks for your reviews, @mkrainiuk and @andrewtbarker. We are planning to enable more SYCL-BLAS routines in the future. This will involve other PRs very similar to this one, i.e., each changed line will add a routine. With that in mind, what PR size would make reviewing easiest for you? Do you prefer fewer PRs containing more routines, or more PRs containing fewer routines?
Speaking only for myself, for a PR like this I think it makes sense to include many routines in few PRs. I like to run the tests locally for confirmation, and that will be easier to do this way. If you were changing logic or algorithms it would be a different story; then smaller PRs would likely be better.
Same here, thanks!
I was thinking the same. Thanks for your feedback, @mkrainiuk and @andrewtbarker.
Added rotmg, sbmv, tbmv, spmv, tbsv, trsv and tpmv BLAS operators.
Description
This PR enables the following routines in the SYCL-BLAS backend: sbmv, tbmv, spmv, tpmv, tbsv, trsv and rotmg.
This patch depends on #262 and #264.
Checklist
All Submissions
[*] Do all unit tests pass locally? Logs: test_results.txt
    Failing ones are due to "requiring fp64 support: Intel(R) UHD Graphics 770 [0x4680] is not supported".
[*] Have you formatted the code using clang-format?
New features