Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix memory error due to lambda return type deduction limitation #9778

Merged
merged 5 commits into from
Dec 1, 2021

Conversation

karthikeyann
Copy link
Contributor

@karthikeyann karthikeyann commented Nov 25, 2021

Fixes #9703
replace device lambda with device functor with return type. (due to 14. extended-lambda-restrictions )
add __host__ to lambda for nvcc return type deduction to work properly.
replaced auto (generic lambda) with size_type.
fixes shared memory write error caused in #9703

Fixes #9703
add __host__ for nvcc return type deduction to work. 
replaced `auto` (generic lambda) with size_type.
@karthikeyann karthikeyann added bug Something isn't working 3 - Ready for Review Ready for review by team cuda libcudf Affects libcudf (C++/CUDA) code. 4 - Needs Review Waiting for reviewer to review or respond non-breaking Non-breaking change labels Nov 25, 2021
@karthikeyann karthikeyann requested a review from a team as a code owner November 25, 2021 20:28
@codecov
Copy link

codecov bot commented Nov 25, 2021

Codecov Report

Merging #9778 (eb166db) into branch-22.02 (967a333) will decrease coverage by 0.01%.
The diff coverage is 0.00%.

Impacted file tree graph

@@               Coverage Diff                @@
##           branch-22.02    #9778      +/-   ##
================================================
- Coverage         10.49%   10.47%   -0.02%     
================================================
  Files               119      119              
  Lines             20305    20368      +63     
================================================
+ Hits               2130     2133       +3     
- Misses            18175    18235      +60     
Impacted Files Coverage Δ
python/cudf/cudf/core/column/column.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/frame.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/index.py 0.00% <ø> (ø)
python/cudf/cudf/core/indexed_frame.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/multiindex.py 0.00% <0.00%> (ø)
python/cudf/cudf/utils/utils.py 0.00% <0.00%> (ø)
python/cudf/cudf/__init__.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/column/string.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/groupby/groupby.py 0.00% <0.00%> (ø)
python/dask_cudf/dask_cudf/io/tests/test_csv.py 100.00% <0.00%> (ø)
... and 1 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d1811b5...eb166db. Read the comment docs.

Copy link
Contributor

@davidwendt davidwendt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a gtest equivalent to this?#9703 (comment)

This way, the test will fail if someone inadvertently removes the __host__ in the future.

cpp/src/sort/rank.cu Outdated Show resolved Hide resolved
replace host device lambda with device functor due to device lambda return type deduction limitation
@karthikeyann karthikeyann changed the title add __host__ for nvcc return type deduction Fix memory error due to lambda return type deduction limitation Nov 30, 2021
@karthikeyann
Copy link
Contributor Author

rerun tests

Copy link
Contributor

@davidwendt davidwendt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. I'd still like to see a comment in the code about this. Also a gtest to prevent someone trying to turn this back into a lambda.

Copy link
Contributor

@davidwendt davidwendt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add the following gtest to the rank_test.cpp so this will fail if the functor ever gets converted back to a lambda?

struct RankLarge : public BaseFixture {
};

TEST_F(RankLarge, test_large)
{
  auto iter = thrust::counting_iterator<int64_t>(0);
  fixed_width_column_wrapper<int64_t> col1(iter, iter + 10558);
  auto result =
    cudf::rank(col1, rank_method::AVERAGE, {}, null_policy::EXCLUDE, null_order::AFTER, false);
  fixed_width_column_wrapper<double, int> expected(iter + 1, iter + 10559);
  CUDF_TEST_EXPECT_COLUMNS_EQUAL(result->view(), expected);
}

I was not able to get 1558 to fail on my GPU but 10558 did.
I verified your fix here works for my GPU with this.

@karthikeyann
Copy link
Contributor Author

Thank you @davidwendt
size 1538 has share memory error. it could be caught by compute-sanitizer, but it does not crash or fail the test. Bigger size ~2500 is enough to cause test to throw error.

cpp/src/sort/rank.cu Outdated Show resolved Hide resolved
Co-authored-by: David Wendt <[email protected]>
@karthikeyann
Copy link
Contributor Author

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 5491cc7 into branch-22.02 Dec 1, 2021
@karthikeyann karthikeyann deleted the karthikeyann-patch-3 branch May 14, 2022 20:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3 - Ready for Review Ready for review by team 4 - Needs Review Waiting for reviewer to review or respond bug Something isn't working libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] cudaErrorIllegalAddress while computing ranking...
5 participants