-
Notifications
You must be signed in to change notification settings - Fork 197
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improvements in matrix::gather
: test coverage, compilation errors, performance
#1126
Conversation
Codecov ReportBase: 87.99% // Head: 87.99% // No change to project coverage 👍
Additional details and impacted files@@ Coverage Diff @@
## branch-23.02 #1126 +/- ##
=============================================
Coverage 87.99% 87.99%
=============================================
Files 21 21
Lines 483 483
=============================================
Hits 425 425
Misses 58 58 Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi Louis, thanks for the PR, it looks good! While we are here, could you improve a little bit the docstring of gather, I have left a few questions.
I'm not sure what's up with CI but it doesn't seem to be a compiler nor test error. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Louis for updating the doc. The PR looks good to me.
@Nyrio the changes look good here. Before we merge this, can you verify that the changes to the template params don't break cuml downstream? |
@cjnolet There is one call to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
/merge |
…performance (rapidsai#1126) In order to deprecate `copy_selected` from `ann_utils.cuh`, I wanted to make sure that the performance of `matrix::gather` was on par. But in the process I discovered that: - Map transforms and conditional copy were not tested at all. - In fact, most of the API in `gather.cuh` wasn't covered in tests and some of the functions didn't even compile. - The same type `MatrixIteratorT` was used for the input and output iterators, which made it impossible to take advantage of custom iterators, as is needed in `kmeans_balanced` to convert the dataset from `T` to `float` and gather in a single step. - The performance was really poor when `D` is small because the kernel assigns one block per row (so a block could be working on only 2 or 3 elements...) This PR addresses all the aforementioned issues. Authors: - Louis Sugy (https://github.com/Nyrio) Approvers: - Tamas Bela Feher (https://github.com/tfeher) - Corey J. Nolet (https://github.com/cjnolet) URL: rapidsai#1126
In order to deprecate
copy_selected
fromann_utils.cuh
, I wanted to make sure that the performance ofmatrix::gather
was on par. But in the process I discovered that:gather.cuh
wasn't covered in tests and some of the functions didn't even compile.MatrixIteratorT
was used for the input and output iterators, which made it impossible to take advantage of custom iterators, as is needed inkmeans_balanced
to convert the dataset fromT
tofloat
and gather in a single step.D
is small because the kernel assigns one block per row (so a block could be working on only 2 or 3 elements...)This PR addresses all the aforementioned issues.