Deprecate Series.hash_encode. #9457

Merged: 3 commits, Oct 19, 2021

bdice (Contributor) commented Oct 16, 2021

Resolves #9381 by deprecating Series.hash_encode. See issue for details.
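
For readers unfamiliar with how a Python method is usually deprecated, here is a minimal sketch of the general pattern. It is not the diff from this PR: `ToySeries`, its placeholder hashing, and the warning text are assumptions made for illustration, and only the method names `hash_encode` and `hash_values` come from this thread.

```python
import warnings


class ToySeries:
    """Hypothetical stand-in for illustration only; this is not cudf.Series."""

    def __init__(self, values):
        self._values = list(values)

    def hash_values(self):
        # Placeholder hash (Python's built-in hash), not cuDF's implementation.
        return [hash(v) for v in self._values]

    def hash_encode(self, stop):
        # Assumed wording: the real warning text may differ.
        warnings.warn(
            "ToySeries.hash_encode is deprecated and will be removed in a "
            "future release. Use ToySeries.hash_values instead.",
            FutureWarning,
            stacklevel=2,  # attribute the warning to the caller's line
        )
        return [h % stop for h in self.hash_values()]
```

Under this pattern, existing callers keep working during the deprecation window but see a `FutureWarning` that points them at the replacement.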

bdice added labels: 3 - Ready for Review, Python, improvement, non-breaking, deprecation (Oct 16, 2021)
bdice requested a review from a team as a code owner on October 16, 2021 01:17
bdice requested review from isVoid and rgsl888prabhu on October 16, 2021 01:17
bdice self-assigned this on Oct 16, 2021
codecov bot commented Oct 16, 2021

Codecov Report

Merging #9457 (6d7aa26) into branch-21.12 (ab4bfaa) will decrease coverage by 0.12%.
The diff coverage is n/a.

❗ Current head 6d7aa26 differs from pull request most recent head 7de125c. Consider uploading reports for the commit 7de125c to get more accurate results

@@               Coverage Diff                @@
##           branch-21.12    #9457      +/-   ##
================================================
- Coverage         10.79%   10.66%   -0.13%     
================================================
  Files               116      117       +1     
  Lines             18869    19753     +884     
================================================
+ Hits               2036     2106      +70     
- Misses            16833    17647     +814     
Impacted Files Coverage Δ
python/cudf/cudf/io/csv.py 0.00% <0.00%> (ø)
python/cudf/cudf/io/hdf.py 0.00% <0.00%> (ø)
python/cudf/cudf/io/orc.py 0.00% <0.00%> (ø)
python/cudf/cudf/__init__.py 0.00% <0.00%> (ø)
python/cudf/cudf/_version.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/abc.py 0.00% <0.00%> (ø)
python/cudf/cudf/api/types.py 0.00% <0.00%> (ø)
python/cudf/cudf/io/dlpack.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/frame.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/index.py 0.00% <0.00%> (ø)
... and 63 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update b66f14e...7de125c.

bdice commented Oct 19, 2021

rerun tests

bdice commented Oct 19, 2021

@gpucibot merge

rapids-bot merged commit 4e04334 into rapidsai:branch-21.12 on Oct 19, 2021
bdice deleted the deprecate-series-hash_encode branch on October 19, 2021 18:55
rapids-bot bot pushed a commit that referenced this pull request Oct 19, 2021
…9458)

This PR implements `DataFrame.hash_values`, which will replace `DataFrame.hash_columns` (which is deprecated in this PR). This proposal was discussed offline with @vyasr and in the weekly cuDF Python dev meeting.

This unifies the method name and signature for `Series.hash_values` and `DataFrame.hash_values`, enabling future internal refactoring by moving the method's implementation to the `Frame` class. (I'm waiting for the removal of `Series.hash_encode` before following up on this so that it can be done in a single pass; see #9381 and #9457.)
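
Code that must run on releases both before and after this rename could use a small shim like the one below. This is a sketch under stated assumptions: `hash_rows` is a hypothetical helper name, and it assumes both `hash_values` and the deprecated `hash_columns` accept a `method` keyword.

```python
def hash_rows(frame, method="murmur3"):
    """Hypothetical shim: prefer hash_values, fall back to hash_columns.

    `frame` is assumed to be a cuDF DataFrame (or any object exposing
    one of the two methods with a `method` keyword).
    """
    if hasattr(frame, "hash_values"):
        # Newer name introduced by the referenced commit.
        return frame.hash_values(method=method)
    # Older releases only expose the deprecated name.
    return frame.hash_columns(method=method)
```

Checking for the new name first avoids triggering the deprecation warning on releases where both methods exist.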

Authors:
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - GALI PREM SAGAR (https://github.com/galipremsagar)
  - Sheilah Kirui (https://github.com/skirui-source)
  - Ram (Ramakrishna Prabhu) (https://github.com/rgsl888prabhu)

URL: #9458
rapids-bot bot pushed a commit that referenced this pull request Dec 23, 2021
This PR removes the deprecated method `Series.hash_encode`. Resolves #9475. Follows up on #9457, #9381.

This PR also removes libcudf code paths used solely for this Python method.

Users may replace code like `series.hash_encode(stop, use_name=False)` with `series.hash_values(method="murmur3") % stop`.
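
The replacement above can be wrapped in a small helper while call sites are migrated. This is only a sketch: `hash_encode_compat` is a hypothetical name, the check that `stop` is positive is an added assumption, and the function assumes `series` is a cuDF Series whose `hash_values` result supports `%`. It covers only the `use_name=False` form named in the commit message and makes no claim that results match the removed method bit for bit.

```python
def hash_encode_compat(series, stop):
    """Hypothetical migration helper for series.hash_encode(stop, use_name=False).

    Applies the replacement suggested in the commit message:
    hash each value with murmur3, then reduce modulo `stop`.
    """
    if stop <= 0:
        # Assumption: a non-positive bucket count is treated as a caller error.
        raise ValueError("stop must be a positive integer")
    return series.hash_values(method="murmur3") % stop
```

With cuDF installed, a call such as `hash_encode_compat(cudf.Series(["a", "b"]), 16)` would be expected to map each value to a bucket in the range 0 to 15.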

Authors:
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - Ram (Ramakrishna Prabhu) (https://github.com/rgsl888prabhu)
  - Conor Hoekstra (https://github.com/codereport)

URL: #9942
Linked issue: [PROPOSAL] Deprecate Series.hash_encode