Create a String column from UTF8 String byte arrays [skip ci] #8257

Merged 5 commits on May 20, 2021

firestarman
Contributor

This PR adds support for creating a ColumnVector from the byte arrays of UTF-8 strings.

It also lets struct children be created from UTF-8 strings.
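
For readers unfamiliar with why raw UTF-8 bytes map naturally onto a string column, here is a minimal, self-contained sketch in plain Java. It is not the code from this PR and does not call cuDF; the class `Utf8ColumnSketch` and its method `fromUtf8Bytes` are hypothetical names chosen for illustration. It assumes the usual Arrow-style string layout (an offsets array, one contiguous byte buffer, and a per-row validity flag) and shows that such a column can be assembled from `byte[]` rows without ever constructing a `java.lang.String`.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration only: a host-side stand-in for a string column.
final class Utf8ColumnSketch {
    final int[] offsets;   // offsets[i]..offsets[i + 1] delimit row i in data
    final byte[] data;     // all rows' UTF-8 bytes, concatenated
    final boolean[] valid; // false marks a null row

    private Utf8ColumnSketch(int[] offsets, byte[] data, boolean[] valid) {
        this.offsets = offsets;
        this.data = data;
        this.valid = valid;
    }

    // Packs already-encoded UTF-8 rows; a null element becomes a null row.
    static Utf8ColumnSketch fromUtf8Bytes(byte[]... rows) {
        int[] offsets = new int[rows.length + 1];
        boolean[] valid = new boolean[rows.length];
        int total = 0;
        for (int i = 0; i < rows.length; i++) {
            valid[i] = rows[i] != null;
            if (valid[i]) {
                total += rows[i].length;
            }
            offsets[i + 1] = total;
        }
        byte[] data = new byte[total];
        for (int i = 0; i < rows.length; i++) {
            if (valid[i]) {
                System.arraycopy(rows[i], 0, data, offsets[i], rows[i].length);
            }
        }
        return new Utf8ColumnSketch(offsets, data, valid);
    }

    // Returns a copy of row i's bytes, or null for a null row.
    byte[] row(int i) {
        return valid[i] ? Arrays.copyOfRange(data, offsets[i], offsets[i + 1]) : null;
    }

    public static void main(String[] args) {
        byte[] first = "h\u00e9llo".getBytes(StandardCharsets.UTF_8); // 6 bytes
        byte[] third = "cuDF".getBytes(StandardCharsets.UTF_8);       // 4 bytes
        Utf8ColumnSketch col = fromUtf8Bytes(first, null, third);
        System.out.println(Arrays.toString(col.offsets));
        System.out.println(col.data.length);
    }
}
```

The point of the sketch is only that the byte-level input already matches the column's storage format, so no decode step is needed on the way in.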

Closes #8137

Signed-off-by: Firestarman [email protected]

@firestarman firestarman requested a review from a team as a code owner May 17, 2021 02:21
@github-actions github-actions bot added the Java Affects Java cuDF API. label May 17, 2021
@firestarman firestarman added feature request New feature or request non-breaking Non-breaking change Spark Functionality that helps Spark RAPIDS labels May 17, 2021
@firestarman
Contributor Author

rerun tests

Member

@wjxiz1992 wjxiz1992 left a comment

LGTM

Member

@jlowe jlowe left a comment

I think we can remove the extra check, but otherwise LGTM.

Signed-off-by: Firestarman <[email protected]>
@firestarman firestarman added the 5 - Ready to Merge Testing and reviews complete, ready to merge label May 20, 2021
@firestarman
Contributor Author

@gpucibot merge

@firestarman firestarman self-assigned this May 20, 2021
@firestarman firestarman requested a review from sperlingxx May 20, 2021 01:51
@firestarman
Contributor Author

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 2da8473 into rapidsai:branch-21.06 May 20, 2021
@firestarman firestarman deleted the utf8-string branch May 20, 2021 01:52
rapids-bot bot pushed a commit that referenced this pull request May 20, 2021
This is a small PR to support creating a scalar from an array of UTF-8 bytes.

Since PR #8257 added support for ColumnVector creation, I think we should add the same for scalar creation, to avoid conversions between UTF-8 strings and Java strings when used in Spark.
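
To make concrete what "conversions between UTF-8 strings and Java strings" involves, here is a small standalone sketch using only the JDK. The class `Utf8RoundTrip` and its method `viaJavaString` are hypothetical names for illustration, not part of cuDF or Spark. It shows the decode and re-encode step that passing bytes directly would skip. The note about malformed input is a general property of `java.lang.String` decoding, added here as context rather than something stated in the PR.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration only: the detour through java.lang.String.
final class Utf8RoundTrip {
    // Decodes UTF-8 bytes into a Java String (UTF-16 semantics), then
    // encodes back to UTF-8. Both steps allocate and scan the data.
    static byte[] viaJavaString(byte[] utf8) {
        String decoded = new String(utf8, StandardCharsets.UTF_8);
        return decoded.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Well-formed input survives the round trip unchanged.
        byte[] ok = "cuDF".getBytes(StandardCharsets.UTF_8);
        System.out.println(Arrays.equals(ok, viaJavaString(ok)));

        // Malformed input does not: the invalid byte 0xFF is decoded as
        // U+FFFD, which re-encodes as three bytes.
        byte[] bad = {0x66, (byte) 0xFF};
        System.out.println(viaJavaString(bad).length);
    }
}
```

Handing the original byte array straight to the scalar or column constructor avoids both the extra work and the silent replacement of malformed bytes.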

Signed-off-by: Firestarman <[email protected]>

Authors:
  - Liangcai Li (https://github.com/firestarman)

Approvers:
  - Bobby Wang (https://github.com/wbo4958)

URL: #8294
Labels
5 - Ready to Merge (Testing and reviews complete, ready to merge); feature request (New feature or request); Java (Affects Java cuDF API.); non-breaking (Non-breaking change); Spark (Functionality that helps Spark RAPIDS)
Development

Successfully merging this pull request may close these issues.

[FEA] JNI: Creating a string column from arrays of UTF8 bytes
5 participants