Fix union operation in _is_supported() #7959

Merged 2 commits into rapidsai:branch-0.20 on Apr 17, 2021


charlesbluca
Member

Changes the _global_set union operation happening in _is_supported() to

_global_set = _global_set.union(set(arg[col]))

This is needed because set.union() returns a new set rather than modifying the set in place. Before this PR, passing something like {"a": ["unsupported_agg"]} into _is_supported() would always return True.
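To make the failure mode concrete, here is a minimal sketch of the bug and the fix. The two helper functions and the `SUPPORTED` set are hypothetical stand-ins written for illustration; they are not the actual `_is_supported()` code, whose full body and signature are not shown in this PR description.

```python
# Hypothetical stand-ins for illustration only; not the real _is_supported().
SUPPORTED = {"count", "mean", "sum", "min", "max"}


def is_supported_buggy(arg, supported):
    """Check the aggregations in `arg` against `supported` (with the bug)."""
    global_set = set()
    for col in arg:
        # Bug: union() returns a new set, and that result is discarded here,
        # so global_set stays empty.
        global_set.union(set(arg[col]))
    # The empty set is a subset of every set, so this is always True.
    return global_set.issubset(supported)


def is_supported_fixed(arg, supported):
    """Same check, with the result of union() assigned back."""
    global_set = set()
    for col in arg:
        global_set = global_set.union(set(arg[col]))
    return global_set.issubset(supported)


aggs = {"a": ["unsupported_agg"]}
print(is_supported_buggy(aggs, SUPPORTED))  # True
print(is_supported_fixed(aggs, SUPPORTED))  # False
```

An in-place alternative would be `global_set.update(arg[col])` or `global_set |= set(arg[col])`; this PR uses reassignment instead.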

cc @rjzamora

charlesbluca requested a review from a team as a code owner April 14, 2021 20:05
github-actions bot added the "Python" (Affects Python cuDF API.) label Apr 14, 2021
rjzamora added the "non-breaking" (Non-breaking change) and "dask" (Dask issue) labels Apr 14, 2021
Member

@rjzamora rjzamora left a comment


Thanks @charlesbluca! Looks good.

I would normally ask you to add a test that includes a groupby aggregation that is not supported by the optimized code path. However, it looks like any aggregation that is not supported by that path is broken in upstream dask anyway...

@charlesbluca
Member Author

Thanks @rjzamora! Would a good compromise be to add some tests for _is_supported() alone to make sure it returns what we're expecting?

@rjzamora
Member

> Thanks @rjzamora! Would a good compromise be to add some tests for _is_supported() alone to make sure it returns what we're expecting?

Yeah, although I'd rather not add a (possibly fragile) test for a private function, it's probably better than no test coverage at all. Maybe just test that including a completely made-up aggregation name will return False :)

@charlesbluca
Member Author

Sure! I added a simple test for returning False. If needed, I can extend it / add another to check for True.

@charlesbluca
Member Author

Also I think I need a category label for the rest of the checks to run

rjzamora added the "3 - Ready for Review" (Ready for review by team) label Apr 14, 2021
kkraus14 added the "bug" (Something isn't working) label Apr 14, 2021
@kkraus14
Collaborator

> Also I think I need a category label for the rest of the checks to run

I think gpuci is just currently down due to a network outage.

@charlesbluca
Member Author

rerun tests

@charlesbluca
Member Author

Is gpuCI up and running for cuDF again? Reran the tests yesterday and looks like they might have timed out

@kkraus14
Collaborator

rerun tests

@codecov

codecov bot commented Apr 16, 2021

Codecov Report

Merging #7959 (51560eb) into branch-0.20 (599f62d) will increase coverage by 0.19%.
The diff coverage is 91.71%.

❗ Current head 51560eb differs from pull request most recent head 71f3b26. Consider uploading reports for the commit 71f3b26 to get more accurate results

@@               Coverage Diff               @@
##           branch-0.20    #7959      +/-   ##
===============================================
+ Coverage        82.30%   82.49%   +0.19%     
===============================================
  Files              101      103       +2     
  Lines            17053    17306     +253     
===============================================
+ Hits             14035    14277     +242     
- Misses            3018     3029      +11     
| Impacted Files | Coverage Δ | |
|---|---|---|
| python/cudf/cudf/core/column/__init__.py | 100.00% <ø> (ø) | |
| python/cudf/cudf/utils/cudautils.py | 55.04% <ø> (+4.65%) | ⬆️ |
| python/cudf/cudf/utils/dtypes.py | 83.44% <ø> (-6.45%) | ⬇️ |
| python/cudf/cudf/utils/utils.py | 89.44% <ø> (+4.37%) | ⬆️ |
| python/cudf/cudf/core/column/numerical.py | 94.41% <78.57%> (-0.61%) | ⬇️ |
| python/cudf/cudf/core/column/lists.py | 86.95% <80.00%> (-0.27%) | ⬇️ |
| python/cudf/cudf/core/groupby/groupby.py | 91.27% <80.95%> (-2.18%) | ⬇️ |
| python/cudf/cudf/core/column/column.py | 88.48% <85.36%> (+1.04%) | ⬆️ |
| python/cudf/cudf/core/column/struct.py | 96.15% <86.66%> (-3.85%) | ⬇️ |
| python/cudf/cudf/core/index.py | 92.62% <89.13%> (-0.40%) | ⬇️ |
| ... and 31 more | | |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update d1c3245...71f3b26.

@kkraus14
Collaborator

@gpucibot merge

rapids-bot bot merged commit 4da38a6 into rapidsai:branch-0.20 on Apr 17, 2021
charlesbluca deleted the fix-is-supported branch August 3, 2021 17:52