Add str.edit_distance_matrix #8463

Merged
merged 4 commits into from
Jun 10, 2021

isVoid
Contributor

@isVoid isVoid commented Jun 8, 2021

This PR plumbs nvtext's edit_distance_matrix through to the cuDF Python API, with the necessary precondition checks, and adds Python tests.

Closes #6341
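
For readers unfamiliar with the operation, here is a minimal pure-Python sketch of what a pairwise edit distance matrix contains. It assumes Levenshtein distance (insertions, deletions, and substitutions, each costing 1) and runs on the CPU. It is not the nvtext or cuDF implementation, and the helper names `levenshtein` and `pairwise_edit_distances` are illustrative only.

```python
from typing import List, Sequence


def levenshtein(a: str, b: str) -> int:
    """Levenshtein distance using a two-row dynamic program."""
    # prev[j] holds the distance between the processed prefix of `a` and b[:j].
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            curr.append(
                min(
                    prev[j] + 1,  # delete char_a
                    curr[j - 1] + 1,  # insert char_b
                    prev[j - 1] + substitution_cost,  # match or substitute
                )
            )
        prev = curr
    return prev[-1]


def pairwise_edit_distances(strings: Sequence[str]) -> List[List[int]]:
    """Return a symmetric matrix where entry [i][j] is the distance between strings i and j."""
    n = len(strings)
    matrix = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            distance = levenshtein(strings[i], strings[j])
            matrix[i][j] = distance
            matrix[j][i] = distance  # the distance is symmetric
    return matrix


# The same strings used in the test added by this PR.
words = ["rounded", "bounded", "bounce", "trounce", "ounce"]
for row in pairwise_edit_distances(words):
    print(row)
```

The diagonal is always zero and the matrix is symmetric, so only the upper triangle needs to be computed in this sketch.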

@isVoid isVoid requested a review from a team as a code owner June 8, 2021 23:53
@isVoid isVoid requested review from marlenezw and skirui-source June 8, 2021 23:53
@isVoid isVoid self-assigned this Jun 8, 2021
@github-actions github-actions bot added the Python Affects Python cuDF API. label Jun 8, 2021
@isVoid isVoid added non-breaking Non-breaking change Python Affects Python cuDF API. feature request New feature or request 3 - Ready for Review Ready for review by team and removed Python Affects Python cuDF API. labels Jun 8, 2021
@isVoid isVoid changed the title Adds str.edit_distance_matrix Add str.edit_distance_matrix Jun 9, 2021
@codecov

codecov bot commented Jun 9, 2021

Codecov Report

❗ No coverage uploaded for pull request base (branch-21.08@90e29d9).
The diff coverage is n/a.

❗ Current head 030c7c4 differs from pull request most recent head ce21709. Consider uploading reports for the commit ce21709 to get more accurate results

@@               Coverage Diff               @@
##             branch-21.08    #8463   +/-   ##
===============================================
  Coverage                ?   82.85%           
===============================================
  Files                   ?      109           
  Lines                   ?    17919           
  Branches                ?        0           
===============================================
  Hits                    ?    14846           
  Misses                  ?     3073           
  Partials                ?        0           


@@ -787,6 +787,34 @@ def test_edit_distance():
    assert_eq(expected, actual)


def test_edit_distance_matrix():
    # normal
    sr = cudf.Series(["rounded", "bounded", "bounce", "trounce", "ounce"])
Contributor

Just wondering: does this implementation also work for Series containing lists of strings?

Contributor Author

It doesn't. It only accepts string-type Series.
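
To illustrate the kind of restriction described in this reply, here is a hedged sketch of an input check that accepts plain strings and rejects anything else, such as lists of strings. The helper `check_string_inputs` is hypothetical and operates on ordinary Python sequences; the actual precondition checks added in this PR are not shown in this conversation and may differ.

```python
from typing import List, Sequence


def check_string_inputs(values: Sequence[object]) -> List[str]:
    """Hypothetical helper: return the values as a list if every element is a str."""
    for position, value in enumerate(values):
        if not isinstance(value, str):
            # Nested containers such as ["a", "b"] end up here.
            raise TypeError(
                f"element {position} has type {type(value).__name__}; "
                "only plain strings are supported"
            )
    return list(values)


# Plain strings pass through unchanged.
print(check_string_inputs(["rounded", "bounded"]))

# A sequence of lists of strings is rejected.
try:
    check_string_inputs([["rounded", "bounded"], ["bounce"]])
except TypeError as error:
    print(f"rejected: {error}")
```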

Contributor

@marlenezw marlenezw left a comment

Good work as usual Michael. Generally looks good to me!

python/cudf/cudf/core/column/string.py (outdated, resolved)
python/cudf/cudf/tests/test_text.py (resolved)
@galipremsagar galipremsagar added 5 - Ready to Merge Testing and reviews complete, ready to merge and removed 3 - Ready for Review Ready for review by team labels Jun 10, 2021
@galipremsagar
Contributor

@gpucibot merge

@rapids-bot rapids-bot bot merged commit b895396 into rapidsai:branch-21.08 Jun 10, 2021
Labels
5 - Ready to Merge Testing and reviews complete, ready to merge feature request New feature or request non-breaking Non-breaking change Python Affects Python cuDF API.
Development

Successfully merging this pull request may close these issues.

[FEA] Plumb edit_distance_matrix to Python
4 participants