Fix like regex escaping #1085
@@ -303,7 +303,7 @@ where
 /// use arrow::compute::like_utf8;
 ///
 /// let strings = StringArray::from(vec!["Arrow", "Arrow", "Arrow", "Ar"]);
-/// let patterns = StringArray::from(vec!["A%", "B%", "A.", "A."]);
+/// let patterns = StringArray::from(vec!["A%", "B%", "A.", "A_"]);
Lol - even the doctest here was relying on a non-escaped `.`
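To make the doctest change concrete, here is a minimal sketch of SQL `LIKE` semantics written without any regex at all. It is not the arrow implementation (which builds a regex); `like_match` and `matches_from` are hypothetical names used only for illustration. It shows why `"A."` should not match `"Ar"` while `"A_"` should.

```rust
/// Hypothetical helper: returns true if `text` matches the SQL LIKE `pattern`,
/// where `%` matches any run of characters (including none) and `_` matches
/// exactly one character. Every other character, including `.`, is literal.
fn like_match(text: &str, pattern: &str) -> bool {
    let t: Vec<char> = text.chars().collect();
    let p: Vec<char> = pattern.chars().collect();
    matches_from(&t, &p)
}

/// Recursive matcher over character slices (simple, not optimized).
fn matches_from(t: &[char], p: &[char]) -> bool {
    match p.split_first() {
        // Pattern exhausted: match only if the text is exhausted too.
        None => t.is_empty(),
        // `%`: try every possible split point of the remaining text.
        Some((&'%', rest)) => (0..=t.len()).any(|k| matches_from(&t[k..], rest)),
        // `_`: consume exactly one character.
        Some((&'_', rest)) => !t.is_empty() && matches_from(&t[1..], rest),
        // Anything else must match literally.
        Some((&c, rest)) => t.first() == Some(&c) && matches_from(&t[1..], rest),
    }
}
```

With these semantics, a `.` in the pattern only matches a literal dot, which is the behavior the corrected doctest pattern `"A_"` accounts for.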
Codecov Report
@@ Coverage Diff @@
## master #1085 +/- ##
=======================================
Coverage 82.28% 82.28%
=======================================
Files 168 168
Lines 49281 49289 +8
=======================================
+ Hits 40549 40559 +10
+ Misses 8732 8730 -2
thanks for the PR! looks good!
-let pat = right.value(i);
-let re = if let Some(ref regex) = map.get(pat) {
+let pat = escape(right.value(i));
+let re = if let Some(ref regex) = map.get(&pat) {
     regex
 } else {
     let re_pattern = pat.replace("%", ".*").replace("_", ".");
Do we still need this `pat.replace` call?
Yeah, to make a proper regex out of it (instead of literally matching on the percentage or underscore character).
But we could minimize some borrow / cloning by moving the `escape` ... So now it looks more similar to the other places.
     regex
 } else {
-    let re_pattern = pat.replace("%", ".*").replace("_", ".");
+    let re_pattern = escape(pat).replace("%", ".*").replace("_", ".");
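As a rough illustration of the escape-then-replace order in the hunk above, the sketch below builds the regex source string for a `LIKE` pattern. To stay dependency-free it uses `escape_regex_meta`, a hypothetical stand-in for the `escape` function called in the PR, and `like_to_regex` is likewise a made-up name. Anchoring the result with `^` and `$` is an assumption here, not something shown in the diff.

```rust
/// Hypothetical stand-in for the regex crate's `escape`: prefixes regex
/// metacharacters with a backslash so they match literally.
/// Note that `%` and `_` are not regex metacharacters and pass through.
fn escape_regex_meta(text: &str) -> String {
    let mut out = String::with_capacity(text.len());
    for c in text.chars() {
        if matches!(
            c,
            '\\' | '.' | '+' | '*' | '?' | '(' | ')' | '|' | '[' | ']' | '{' | '}'
                | '^' | '$' | '#' | '&' | '-' | '~'
        ) {
            out.push('\\');
        }
        out.push(c);
    }
    out
}

/// Builds a regex source string from a SQL LIKE pattern:
/// escape first, then turn the LIKE wildcards into regex wildcards.
/// The `^...$` anchoring is an assumption made for this sketch.
fn like_to_regex(pattern: &str) -> String {
    let body = escape_regex_meta(pattern)
        .replace('%', ".*")
        .replace('_', ".");
    format!("^{}$", body)
}
```

The order matters: replacing `%` and `_` before escaping would have the freshly inserted `.*` and `.` escaped as well, turning the wildcards back into literals.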
👍
Less code and fewer bugs -- now that is the sign of a great fix ❤️
Nice work -- looks great to me @Dandandan.
We can file a follow-on ticket, perhaps. If someone can whip up a reproducer that would be awesome 🤞
Filed #1087 to track the possible escaping problems
* Fix like regex escaping * Fix like regex escaping * Fix doctest * Simplify Co-authored-by: Daniël Heres <[email protected]>
It was broken properly, with a bug in test, after commit c296882. Can drop this after rebase on commit e8cc39e "Fix like regex escaping (apache#1085)", first released in 7.0.0.
Which issue does this PR close?
Closes #1069
Rationale for this change
Fixes `like` so that it does not interpret regex syntax such as `.*` and `.` in the pattern, but escapes it instead.
What changes are included in this PR?
Are there any user-facing changes?