[C++] Support casting to and from utf8_view/binary_view #42247

Closed
Tracked by #43068
felipecrv opened this issue Jun 22, 2024 · 8 comments

Comments

@felipecrv
Contributor

Describe the enhancement requested

There are 3 categories of binary/string-like types in Arrow:

  • Span = utf8 | large_utf8 | binary | large_binary
  • Fixed = fixed_size_binary
  • View = utf8_view | binary_view

They can cast between themselves when a View is not involved:

| Input | Output | CanCast |
|-------|--------|---------|
| Span  | Fixed  | X       |
| Fixed | Span   | X       |
| Span  | View   |         |
| Fixed | View   |         |
| View  | Fixed  |         |
| View  | Span   |         |

String views should behave similarly to strings when it comes to casts.
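
The three categories and the cast table above can be modeled in a few lines of standalone C++. This is only a sketch of the situation described in this issue, not Arrow code: `Category`, `Classify`, and `CanCastBeforeChange` are hypothetical names, and pairs that the table does not list (for example Span to Span) are deliberately left unanswered.

```cpp
#include <optional>
#include <string_view>

// Hypothetical names for illustration only; none of these are Arrow APIs.
enum class Category { Span, Fixed, View, Unknown };

// Map a type name used in this issue to its category.
constexpr Category Classify(std::string_view type_name) {
  if (type_name == "utf8" || type_name == "large_utf8" ||
      type_name == "binary" || type_name == "large_binary") {
    return Category::Span;
  }
  if (type_name == "fixed_size_binary") return Category::Fixed;
  if (type_name == "utf8_view" || type_name == "binary_view") {
    return Category::View;
  }
  return Category::Unknown;
}

// Cast availability as listed in the table when the issue was opened.
// Returns std::nullopt for pairs the table does not cover
// (same-category pairs and unknown types).
constexpr std::optional<bool> CanCastBeforeChange(Category in, Category out) {
  if (in == Category::Unknown || out == Category::Unknown || in == out) {
    return std::nullopt;
  }
  // A cross-category cast is marked available only if no View is involved.
  return in != Category::View && out != Category::View;
}

// Compile-time checks against rows of the table.
static_assert(*CanCastBeforeChange(Classify("utf8"), Classify("fixed_size_binary")));
static_assert(*CanCastBeforeChange(Classify("fixed_size_binary"), Classify("large_binary")));
static_assert(!*CanCastBeforeChange(Classify("utf8"), Classify("utf8_view")));
static_assert(!*CanCastBeforeChange(Classify("binary_view"), Classify("fixed_size_binary")));
```

The enhancement requested here amounts to turning the four empty rows of the table into supported casts, so that a check like the one above would no longer need to special-case `View`.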

Component(s)

C++

@felipecrv
Contributor Author

@llama90 FYI (I'm working on a PR).

@llama90
Contributor

llama90 commented Jun 22, 2024

Thank you for letting me know. I really appreciate it.

@felipecrv
Contributor Author

#43010 needs to be fixed before casts involving dictionary(_, utf8_view()|binary_view()) can be supported.

@felipecrv
Contributor Author

> @llama90 FYI (I'm working on a PR).

I opened a draft PR to make the claim that I'm working on it more believable :)

@llama90
Contributor

llama90 commented Jul 17, 2024

@felipecrv Thank you. I will take a look!

@felipecrv
Contributor Author

> @felipecrv Thank you. I will take a look!

I made the PR green.

@llama90
Contributor

llama90 commented Jul 21, 2024

> I made the PR green.

Thanks!

pitrou pushed a commit that referenced this issue Sep 12, 2024
…3302)

### Rationale for this change

We need casts between string (binary) and string-view (binary-view) types since they are semantically equivalent.

### What changes are included in this PR?

 - Add `is_binary_view_like()` type predicate
 - Add `BinaryViewTypes()` list including `STRING_VIEW/BINARY_VIEW`
 - New cast kernels
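
As a rough idea of what a predicate and type list of this kind look like, here is a standalone sketch. It is not the code from the PR: `TypeId`, `IsBinaryViewLike`, and `kBinaryViewTypeIds` are hypothetical stand-ins defined locally, and only the `STRING_VIEW`/`BINARY_VIEW` spelling is taken from the list above.

```cpp
#include <array>

// Local stand-in for a type-id enum (hypothetical, not Arrow's own enum).
enum class TypeId {
  STRING,
  BINARY,
  LARGE_STRING,
  LARGE_BINARY,
  FIXED_SIZE_BINARY,
  STRING_VIEW,
  BINARY_VIEW,
};

// Sketch of a view-type predicate: true only for the two view type ids.
constexpr bool IsBinaryViewLike(TypeId id) {
  return id == TypeId::STRING_VIEW || id == TypeId::BINARY_VIEW;
}

// Sketch of a view-type list that kernel registration code could iterate over.
constexpr std::array<TypeId, 2> kBinaryViewTypeIds = {TypeId::BINARY_VIEW,
                                                      TypeId::STRING_VIEW};

// Every entry of the list satisfies the predicate; non-view ids do not.
static_assert(IsBinaryViewLike(kBinaryViewTypeIds[0]));
static_assert(IsBinaryViewLike(kBinaryViewTypeIds[1]));
static_assert(!IsBinaryViewLike(TypeId::STRING));
```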

### Are these changes tested?

Yes, but test coverage might be improved.

### Are there any user-facing changes?

More casts are available.
* GitHub Issue: #42247

Lead-authored-by: Felipe Oliveira Carvalho <[email protected]>
Co-authored-by: mwish <[email protected]>
Signed-off-by: Antoine Pitrou <[email protected]>
pitrou added this to the 18.0.0 milestone Sep 12, 2024
@pitrou
Member

pitrou commented Sep 12, 2024

Issue resolved by pull request 43302
#43302

pitrou closed this as completed Sep 12, 2024
khwilson pushed a commit to khwilson/arrow that referenced this issue Sep 14, 2024
…ew (apache#43302)
