Fix memory error due to lambda return type deduction limitation #9778
Conversation
Fixes #9703. Adds `__host__` to the lambda so that nvcc return type deduction works, and replaces the generic `auto` lambda parameter with `size_type`.
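For context, a minimal sketch of the kind of change this describes, not the actual cudf code: the enclosing function, the lambda body, and the `size_type` alias are all illustrative assumptions.

#include <thrust/iterator/counting_iterator.h>
#include <thrust/iterator/transform_iterator.h>

using size_type = int;  // stands in for cudf::size_type

void example()  // hypothetical enclosing function
{
  // Before (broken): an extended __device__-only lambda with a generic
  // `auto` parameter. nvcc cannot deduce the return type of an extended
  // __device__ lambda in host code, so the transform_iterator's
  // value_type is deduced incorrectly.
  //
  //   auto broken = thrust::make_transform_iterator(
  //     thrust::make_counting_iterator<size_type>(0),
  //     [] __device__(auto i) { return i / 2; });

  // After: __host__ __device__ plus a concrete parameter type lets the
  // host compiler see a real operator(), so return type deduction works.
  // (Compile with nvcc --extended-lambda.)
  auto fixed = thrust::make_transform_iterator(
    thrust::make_counting_iterator<size_type>(0),
    [] __host__ __device__(size_type i) -> size_type { return i / 2; });
  (void)fixed;
}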
Codecov Report
@@ Coverage Diff @@
## branch-22.02 #9778 +/- ##
================================================
- Coverage 10.49% 10.47% -0.02%
================================================
Files 119 119
Lines 20305 20368 +63
================================================
+ Hits 2130 2133 +3
- Misses 18175 18235 +60
Continue to review full report at Codecov.
Can you add a gtest equivalent to this? #9703 (comment)
This way, the test will fail if someone inadvertently removes the `__host__` in the future.
replace host device lambda with device functor due to device lambda return type deduction limitation
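A minimal sketch of the functor shape this commit describes; the name, member, and body are hypothetical stand-ins for whatever the real rank operator computes.

#include <cudf/types.hpp>  // for cudf::size_type

// A named device functor gives operator() an explicit return type, so no
// return type deduction is needed at all, sidestepping the extended-lambda
// restriction that broke the __device__ lambda version.
struct group_size_fn {  // hypothetical name and body, for illustration only
  cudf::size_type const* group_offsets;

  __device__ cudf::size_type operator()(cudf::size_type i) const
  {
    // e.g. the size of group i, computed from its offsets
    return group_offsets[i + 1] - group_offsets[i];
  }
};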
rerun tests
Looks good. I'd still like to see a comment in the code about this, and also a gtest to prevent someone from turning this back into a lambda.
Could you add the following gtest to rank_test.cpp, so it will fail if the functor ever gets converted back to a lambda?
struct RankLarge : public BaseFixture {
};

TEST_F(RankLarge, test_large)
{
  // 10558 rows is large enough to trigger the bad return type deduction
  // (see #9703); smaller inputs did not reproduce it reliably.
  auto iter = thrust::counting_iterator<int64_t>(0);
  fixed_width_column_wrapper<int64_t> col1(iter, iter + 10558);
  auto result =
    cudf::rank(col1, rank_method::AVERAGE, {}, null_policy::EXCLUDE, null_order::AFTER, false);
  // With all-distinct ascending values, the AVERAGE rank of row i is i + 1.
  fixed_width_column_wrapper<double, int> expected(iter + 1, iter + 10559);
  CUDF_TEST_EXPECT_COLUMNS_EQUAL(result->view(), expected);
}
I was not able to get 1558 to fail on my GPU, but 10558 did. I verified that your fix works on my GPU with this test.
Thank you @davidwendt
Co-authored-by: David Wendt <[email protected]>
@gpucibot merge
Fixes #9703

- Replace the device lambda with a device functor that has an explicit return type (due to restriction 14 of the CUDA "Extended Lambda Restrictions").
- Add `__host__` to the lambda so that nvcc return type deduction works properly.
- Replace `auto` (generic lambda parameter) with `size_type`.
- Fixes the shared memory write error caused in #9703.