add binary support in arrow-string #6926
base: main
I am sorry for the delay in reviewing this PR -- it is hard to find time to review such a large PR
I wonder what the use case is for using LIKE on binary data? I ask because it seems to me that LIKE is mostly useful for character strings.
I can see the use case for starts_with / ends_with and contains for binary data.
Perhaps instead of trying to inject binary arrays into the code for handling strings, we could simply have simpler prefix/suffix matching for binary -- it might have some more repetition but would be simpler to understand and avoid any potential performance issues related to this code 🤔
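To make the suggestion concrete, here is a rough sketch of what standalone prefix/suffix/substring matching over raw bytes could look like. It uses only the Rust standard library and plain slices rather than Arrow arrays; the function names (bytes_starts_with, bytes_ends_with, bytes_contains, apply_binary_predicate) are hypothetical and are not part of arrow-string.

```rust
/// Returns true if `haystack` begins with `prefix` (byte-wise comparison).
pub fn bytes_starts_with(haystack: &[u8], prefix: &[u8]) -> bool {
    haystack.starts_with(prefix)
}

/// Returns true if `haystack` ends with `suffix` (byte-wise comparison).
pub fn bytes_ends_with(haystack: &[u8], suffix: &[u8]) -> bool {
    haystack.ends_with(suffix)
}

/// Returns true if `needle` occurs anywhere in `haystack`.
/// An empty needle matches everything; the guard also avoids calling
/// `windows(0)`, which panics.
pub fn bytes_contains(haystack: &[u8], needle: &[u8]) -> bool {
    needle.is_empty() || haystack.windows(needle.len()).any(|w| w == needle)
}

/// Applies `op` to every value against a scalar `pattern`, keeping nulls
/// (modelled here as `None`) as nulls in the output.
pub fn apply_binary_predicate(
    values: &[Option<&[u8]>],
    pattern: &[u8],
    op: fn(&[u8], &[u8]) -> bool,
) -> Vec<Option<bool>> {
    values
        .iter()
        .copied()
        .map(|v| v.map(|bytes| op(bytes, pattern)))
        .collect()
}
```

Nothing here needs to know about UTF-8 or ASCII, which is the main appeal of keeping the binary path separate from the string path. The naive windows scan in bytes_contains is only for illustration; a real kernel would likely want a faster substring search.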
@@ -59,6 +59,16 @@ pub struct FixedSizeBinaryArray {
 }

 impl FixedSizeBinaryArray {
     /// Returns true if all data within this array is ASCII
     pub fn is_ascii(&self) -> bool {
I don't understand the need to check a binary array for ASCII -- there shouldn't be any optimizations that rely on the data being ASCII
(ignore branch name)
Which issue does this PR close?
Closes #6923
What changes are included in this PR?
- PredicateImpl trait to work with the predicate regardless of string or binary
- PredicateImpl implemented for the old Predicate and the new BinaryPredicate using a macro (I don't really like this as it seems less maintainable, but I'm not sure what's better: duplicating, a macro, or another approach)
Are there any user-facing changes?
Yes, allow users to pass binary arrays to like/starts with/contains and more
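For readers trying to picture the macro approach described in the changes list, the following is a simplified sketch and not the code in this PR. The names PredicateImpl, Predicate and BinaryPredicate come from the description, but the trait shape, the enum variants, the evaluate method, and the helpers identity_bytes and find_subslice are all assumptions made for illustration.

```rust
/// Hypothetical stand-in for the trait in the PR; the real signature may differ.
pub trait PredicateImpl {
    type Item: ?Sized;
    fn evaluate(&self, haystack: &Self::Item) -> bool;
}

/// Assumed variants; the real `Predicate` also covers LIKE patterns, regexes, etc.
pub enum Predicate<'a> {
    StartsWith(&'a str),
    EndsWith(&'a str),
    Contains(&'a str),
}

/// Assumed binary counterpart with the same variants over byte slices.
pub enum BinaryPredicate<'a> {
    StartsWith(&'a [u8]),
    EndsWith(&'a [u8]),
    Contains(&'a [u8]),
}

/// Trivial conversion so both impls can share one macro body.
fn identity_bytes(b: &[u8]) -> &[u8] {
    b
}

/// Naive substring search over bytes (empty needle always matches).
fn find_subslice(haystack: &[u8], needle: &[u8]) -> bool {
    needle.is_empty() || haystack.windows(needle.len()).any(|w| w == needle)
}

/// Generates one `PredicateImpl` per predicate type; `$as_bytes` converts the
/// item type to `&[u8]` so the matching logic is written only once.
macro_rules! impl_predicate {
    ($predicate:ident, $item:ty, $as_bytes:expr) => {
        impl<'a> PredicateImpl for $predicate<'a> {
            type Item = $item;

            fn evaluate(&self, haystack: &Self::Item) -> bool {
                let h: &[u8] = $as_bytes(haystack);
                match self {
                    $predicate::StartsWith(p) => h.starts_with($as_bytes(*p)),
                    $predicate::EndsWith(p) => h.ends_with($as_bytes(*p)),
                    $predicate::Contains(p) => find_subslice(h, $as_bytes(*p)),
                }
            }
        }
    };
}

impl_predicate!(Predicate, str, str::as_bytes);
impl_predicate!(BinaryPredicate, [u8], identity_bytes);
```

With this shape, Predicate::StartsWith("ab").evaluate("abc") and BinaryPredicate::StartsWith(&b"ab"[..]).evaluate(&b"abc"[..]) go through the same generated body. The trade-off raised in the description is visible even at this size: the duplication disappears, but the logic now lives inside a macro, which is harder to read and to debug than two plain impl blocks.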