Add JNI for cudf::drop_duplicates #9841

Merged: 8 commits into rapidsai:branch-22.02, Dec 3, 2021

ttnghia (Contributor) commented Dec 3, 2021

This adds a Java binding for cudf::drop_duplicates.

Note that when choosing which duplicate element to keep, only the KEEP_FIRST or KEEP_LAST option can be selected. In other words, the binding does not support KEEP_NONE, which removes every element that has a duplicate.
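
To make the two supported options concrete, here is a minimal plain-Java sketch of keep-first versus keep-last semantics on a single key column. It is not the cuDF Java API: the class `DropDuplicatesSketch`, the enum `Keep`, and the method `keptRows` are hypothetical names invented for illustration, and returning the surviving row indices in ascending order is an assumption of the sketch, not a statement about the ordering cuDF produces.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; this is NOT the cuDF Java API.
final class DropDuplicatesSketch {
    // Hypothetical stand-in for the two options the binding supports.
    enum Keep { FIRST, LAST }

    // Returns the indices of the rows that survive de-duplication on `keys`.
    // Ascending index order is an assumption made by this sketch.
    static List<Integer> keptRows(List<Integer> keys, Keep keep) {
        Map<Integer, Integer> chosen = new HashMap<>();
        for (int row = 0; row < keys.size(); row++) {
            Integer key = keys.get(row);
            if (keep == Keep.LAST) {
                chosen.put(key, row);          // a later row replaces an earlier one
            } else {
                chosen.putIfAbsent(key, row);  // the first row seen for a key wins
            }
        }
        List<Integer> rows = new ArrayList<>(chosen.values());
        Collections.sort(rows);
        return rows;
    }
}
```

For the keys `[1, 2, 1, 3, 2]`, `Keep.FIRST` retains rows 0, 1, and 3, while `Keep.LAST` retains rows 2, 3, and 4. A KEEP_NONE-style option would instead retain only row 3, the one key with no duplicate, which is the behavior this PR leaves out.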

Closes #9115.

@ttnghia added the labels feature request, 3 - Ready for Review, Java, Spark, and non-breaking on Dec 3, 2021
@ttnghia requested a review from jlowe on December 3, 2021 19:36
@ttnghia requested a review from a team as a code owner on December 3, 2021 19:36
@ttnghia self-assigned this on Dec 3, 2021
@rapidsai deleted a comment from codecov bot on Dec 3, 2021
codecov bot commented Dec 3, 2021

Codecov Report

Merging #9841 (83ed4bf) into branch-22.02 (967a333) will decrease coverage by 0.04%.
The diff coverage is 5.79%.

@@               Coverage Diff                @@
##           branch-22.02    #9841      +/-   ##
================================================
- Coverage         10.49%   10.44%   -0.05%     
================================================
  Files               119      119              
  Lines             20305    20422     +117     
================================================
+ Hits               2130     2133       +3     
- Misses            18175    18289     +114     

Impacted Files                              Coverage Δ
python/cudf/cudf/__init__.py                0.00% <0.00%> (ø)
python/cudf/cudf/core/_base_index.py        0.00% <0.00%> (ø)
python/cudf/cudf/core/column/column.py      0.00% <0.00%> (ø)
python/cudf/cudf/core/column/string.py      0.00% <ø> (ø)
python/cudf/cudf/core/frame.py              0.00% <0.00%> (ø)
python/cudf/cudf/core/groupby/groupby.py    0.00% <0.00%> (ø)
python/cudf/cudf/core/index.py              0.00% <ø> (ø)
python/cudf/cudf/core/indexed_frame.py      0.00% <0.00%> (ø)
python/cudf/cudf/core/multiindex.py         0.00% <0.00%> (ø)
python/cudf/cudf/io/json.py                 0.00% <ø> (ø)
... and 8 more

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 7ac8aac...83ed4bf.

ttnghia (Contributor, Author) commented Dec 3, 2021

@gpucibot merge

@rapids-bot merged commit fdd9bb0 into rapidsai:branch-22.02 on Dec 3, 2021
@ttnghia deleted the jni_drop_duplicates branch on December 3, 2021 23:36
Labels
3 - Ready for Review: Ready for review by team
feature request: New feature or request
Java: Affects Java cuDF API.
non-breaking: Non-breaking change
Spark: Functionality that helps Spark RAPIDS
Development

Successfully merging this pull request may close these issues.

[FEA] Java bindings for drop_duplicates
2 participants