[REVIEW] Remove **kwargs from string/categorical methods #6750

Merged
21 commits merged into rapidsai:branch-0.18 on Dec 8, 2020

Conversation

shwina
Contributor

@shwina shwina commented Nov 12, 2020

This PR removes **kwargs from the string/categorical accessors where unnecessary, and exposes keyword arguments like inplace to the user directly.
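As a rough illustration of the difference this makes, here is a minimal sketch using a toy accessor. The class and method names (`ToyCategoricalAccessor`, `set_categories_old`, `set_categories_new`) are hypothetical and are not cuDF's actual code; only the `inplace` keyword comes from the description above.

```python
import inspect


class ToyCategoricalAccessor:
    """Toy stand-in for an accessor; not the cuDF class."""

    def __init__(self, categories):
        self.categories = list(categories)

    # Before: options hide inside **kwargs, so misspelled or
    # unsupported keywords are accepted without complaint.
    def set_categories_old(self, new_categories, **kwargs):
        inplace = kwargs.get("inplace", False)
        return self._apply(new_categories, inplace)

    # After: the option is part of the visible signature.
    def set_categories_new(self, new_categories, inplace=False):
        return self._apply(new_categories, inplace)

    def _apply(self, new_categories, inplace):
        if inplace:
            self.categories = list(new_categories)
            return None
        return ToyCategoricalAccessor(new_categories)


acc = ToyCategoricalAccessor(["a", "b"])

# A misspelled keyword is silently ignored by the **kwargs version:
# the call returns a new object and leaves `acc` unchanged.
result = acc.set_categories_old(["a", "b", "c"], inplace_=True)
silently_ignored = result is not None and acc.categories == ["a", "b"]

# The explicit signature rejects the same call with a TypeError.
try:
    acc.set_categories_new(["a", "b", "c"], inplace_=True)
    rejected = False
except TypeError:
    rejected = True

# The explicit keyword is also visible to introspection and type checkers.
old_params = list(inspect.signature(acc.set_categories_old).parameters)
new_params = list(inspect.signature(acc.set_categories_new).parameters)
```

In this sketch `old_params` contains only `new_categories` and `kwargs`, while `new_params` lists `inplace` by name, which is the kind of visibility that documentation tools and type checkers rely on.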

If we want to maintain parity with pandas APIs for Dask/others using cuDF internally, we can consider using the approach described in #6135, which will automatically raise NotImplementedError when unsupported kwargs are passed.
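The details of #6135 are not reproduced in this conversation, so the following is only one possible shape for such a mechanism. The decorator name `not_implemented_for` and the toy function `replace_text` are assumptions made for illustration, not names from cuDF or from #6135.

```python
import functools


def not_implemented_for(*unsupported):
    """Hypothetical decorator: keep a keyword in the signature for API
    parity, but raise if a caller actually passes it by name."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Simplification: only keyword use is checked, not positional use.
            passed = [name for name in unsupported if name in kwargs]
            if passed:
                raise NotImplementedError(
                    f"{func.__name__}() does not support: {', '.join(passed)}"
                )
            return func(*args, **kwargs)

        return wrapper

    return decorator


@not_implemented_for("regex")
def replace_text(values, old, new, regex=None):
    # Toy function: plain substring replacement only.
    return [v.replace(old, new) for v in values]


supported_result = replace_text(["aa", "ba"], "a", "b")

try:
    replace_text(["aa"], "a", "b", regex=True)
    raised = False
except NotImplementedError:
    raised = True
```

The point of a design like this is that callers written against the pandas signature get an explicit error naming the unsupported option, instead of having it swallowed by `**kwargs`.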

@shwina shwina requested a review from a team as a code owner November 12, 2020 18:06
@GPUtester
Collaborator

Please update the changelog in order to start CI tests.

View the gpuCI docs here.

@shwina shwina marked this pull request as draft November 12, 2020 18:10
@codecov

codecov bot commented Nov 12, 2020

Codecov Report

Merging #6750 (4e08bea) into branch-0.18 (f6b16ab) will increase coverage by 0.44%.
The diff coverage is 100.00%.

@@               Coverage Diff               @@
##           branch-0.18    #6750      +/-   ##
===============================================
+ Coverage        81.59%   82.03%   +0.44%     
===============================================
  Files               96       96              
  Lines            15927    16262     +335     
===============================================
+ Hits             12996    13341     +345     
+ Misses            2931     2921      -10     
| Impacted Files | Coverage Δ |
|---|---|
| python/cudf/cudf/core/dataframe.py | 91.08% <ø> (+0.12%) ⬆️ |
| python/cudf/cudf/core/column/categorical.py | 93.33% <100.00%> (+0.28%) ⬆️ |
| python/cudf/cudf/core/column/column.py | 87.92% <100.00%> (+0.10%) ⬆️ |
| python/cudf/cudf/core/column/datetime.py | 88.88% <100.00%> (+0.59%) ⬆️ |
| python/cudf/cudf/core/column/methods.py | 96.00% <100.00%> (-0.43%) ⬇️ |
| python/cudf/cudf/core/column/numerical.py | 94.53% <100.00%> (ø) |
| python/cudf/cudf/core/column/string.py | 86.58% <100.00%> (+0.48%) ⬆️ |
| python/cudf/cudf/core/column/timedelta.py | 89.53% <100.00%> (+0.31%) ⬆️ |
| python/cudf/cudf/io/feather.py | 100.00% <0.00%> (ø) |
| python/cudf/cudf/comm/serialize.py | 0.00% <0.00%> (ø) |

... and 43 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f6b16ab...4e08bea. Read the comment docs.

@shwina shwina changed the title [WIP] Remove **kwargs from string/categorical methods [REVIEW] Remove **kwargs from string/categorical methods Nov 17, 2020
@shwina shwina marked this pull request as ready for review November 17, 2020 04:16
Contributor

@galipremsagar galipremsagar left a comment

The changes LGTM ! 🎉 🎉

Contributor

@isVoid isVoid left a comment

Some general comments and doubts. Otherwise lgtm!

Comment on lines -312 to +311
- new_categories = new_categories.astype(common_dtype, copy=False)
- old_categories = old_categories.astype(common_dtype, copy=False)
+ new_categories = new_categories.astype(common_dtype)
+ old_categories = old_categories.astype(common_dtype)
Contributor

Why remove the explicit copy parameter here?

Contributor Author

Because that parameter never ends up being used. These lines call ColumnBase.astype(), which in turn calls as_categorical_column. I didn't find any use of copy in either of those.

Collaborator

Just a note: Series.astype() will continue to need the copy parameter: https://pandas.pydata.org/docs/reference/api/pandas.Series.astype.html
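One way to read the two points in this thread together is that an internal column conversion can drop `copy` while the user-facing method keeps it. The sketch below is a toy illustration of that layering under assumed pandas-like semantics (where `copy=False` may hand back the original object when no conversion is needed). `ToyColumn` and `ToySeries` are hypothetical classes, not cuDF's `ColumnBase` or `Series`.

```python
class ToyColumn:
    """Toy internal column; not cuDF's ColumnBase."""

    def __init__(self, values):
        self.values = list(values)

    def astype(self, dtype):
        # Always builds a new column, so a `copy` flag would have
        # no effect at this level.
        return ToyColumn(dtype(v) for v in self.values)


class ToySeries:
    """Toy user-facing wrapper that keeps a pandas-style `copy` keyword."""

    def __init__(self, column, dtype):
        self._column = column
        self.dtype = dtype

    def astype(self, dtype, copy=True):
        if not copy and dtype is self.dtype:
            # No conversion needed and the caller allowed sharing.
            return self
        return ToySeries(self._column.astype(dtype), dtype)


s = ToySeries(ToyColumn([1, 2, 3]), int)
same = s.astype(int, copy=False)   # returns the original object
copied = s.astype(int)             # default copy=True builds a new one
converted = s.astype(float)        # conversion always builds a new one
```

Here the `copy` decision lives entirely in the public method, so the internal call never needs to accept or forward it.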

@shwina shwina changed the base branch from branch-0.17 to branch-0.18 December 4, 2020 17:51
@shwina
Contributor Author

shwina commented Dec 4, 2020

@kkraus14 Pushed to 0.18 as this is a QOL thing and more important for typing.

@kkraus14 kkraus14 added 3 - Ready for Review Ready for review by team Python Affects Python cuDF API. labels Dec 4, 2020
@kkraus14
Collaborator

kkraus14 commented Dec 4, 2020

@kkraus14 kkraus14 added improvement Improvement / enhancement to an existing function non-breaking Non-breaking change 5 - Ready to Merge Testing and reviews complete, ready to merge 6 - Okay to Auto-Merge and removed 3 - Ready for Review Ready for review by team labels Dec 4, 2020
@rapids-bot rapids-bot bot merged commit 8a1a6d7 into rapidsai:branch-0.18 Dec 8, 2020
rapids-bot bot pushed a commit to rapidsai/cuml that referenced this pull request Dec 11, 2020
Remove keyword "stops" from call to cudf.core.column.string.slice, which no longer accepts arbitrary keywords.

cuDF change introduced in rapidsai/cudf#6750.

Authors:
  - William Hicks

Approvers:
  - John Zedlewski
  - Micka

URL: #3289
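The downstream fix described in this commit message follows from ordinary Python behavior: once a function stops accepting `**kwargs`, a stale or misspelled keyword raises `TypeError` instead of being ignored. A minimal sketch, using a toy function whose name and parameters are illustrative rather than the real cuDF signature:

```python
def slice_strings(values, start=None, stop=None, step=None):
    """Toy slicer with an explicit signature; parameter names are illustrative."""
    return [v[start:stop:step] for v in values]


def call_with_stale_keyword():
    # `stops` is not a parameter, so Python rejects the call outright.
    try:
        slice_strings(["abcdef"], start=1, stops=3)
    except TypeError as exc:
        return str(exc)
    return None


error_message = call_with_stale_keyword()

# Dropping (or correcting) the stale keyword restores the call.
fixed = slice_strings(["abcdef"], start=1, stop=3)
```

With a `**kwargs` signature the first call would have run and quietly ignored `stops`, which is why this kind of breakage only surfaces after the signature is tightened.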
Labels
5 - Ready to Merge Testing and reviews complete, ready to merge improvement Improvement / enhancement to an existing function non-breaking Non-breaking change Python Affects Python cuDF API.
5 participants