Add LIKE operator support #241
Codecov Report
@@            Coverage Diff             @@
##           master     #241      +/-   ##
==========================================
+ Coverage   61.34%   61.47%   +0.13%
==========================================
  Files          68       70       +2
  Lines        6327     6429     +102
==========================================
+ Hits         3881     3952      +71
- Misses       1933     1960      +27
- Partials      513      517       +4
Use glob matcher instead (thanks to @tie for implementing it).
// Case 4.
var r rune
r, s = readRune(s)
if !equalFold(p, r) {
Shouldn't we use direct comparison?
I ran this query on different DBs.
SELECT 'abc' LIKE 'ABC';
Results:
- PostgreSQL 12.3 returns false.
- SQLite 3.27.2 returns 1 (true).
- MySQL 5.7.12 returns 1 (true).
- Oracle Database 11g returns false (query: SELECT * FROM DUAL WHERE 'abc' like 'ABC', which should return a non-empty result if true).
I think we should stick with what MySQL and SQLite do.
In fact, ideally we should be comparing grapheme clusters using e.g. github.com/clipperhouse/uax29/graphemes and golang.org/x/text/collate.
How about we work on collation support in a separate PR? This one is already pretty big and since Genji is not stable yet we can give ourselves time to improve before locking things up.
Unless you think adding it would not take long?
SQLite performs (simple) case folding for character comparison with the LIKE operator, and since that’s what our implementation is based on, I think it’s reasonable to follow this behavior too.
Proper Unicode support would definitely take some time to implement given the current state of Unicode support in Go (scattered across third-party libraries with different Unicode versions, each embedding its own copy of the character database).
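For reference, simple case folding of single runes can be sketched with the standard library's `unicode.SimpleFold`. This is only an illustration under assumptions: the name `equalFoldRune` is hypothetical, and the `equalFold` helper in this PR may be implemented differently.

```go
package main

import (
	"fmt"
	"unicode"
)

// equalFoldRune is a hypothetical helper that reports whether a and b
// are equal under Unicode simple case folding.
func equalFoldRune(a, b rune) bool {
	if a == b {
		return true
	}
	// SimpleFold walks the cycle of runes that are equivalent under
	// simple folding and eventually returns to the starting rune.
	for r := unicode.SimpleFold(a); r != a; r = unicode.SimpleFold(r) {
		if r == b {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(equalFoldRune('a', 'A'))      // true
	fmt.Println(equalFoldRune('k', '\u212A')) // true (Kelvin sign)
	fmt.Println(equalFoldRune('a', 'b'))      // false
}
```

With a comparison like this, 'abc' LIKE 'ABC' would evaluate to true, matching the SQLite and MySQL results listed earlier in the thread.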
Use backtracking in glob.MatchLike
* Increase test and comment coverage
* Add more test cases
* Add doc comment for glob package
* Fix escaping end with non-empty input
* Update like.go file header comment
Thanks! That's some awesome work 👏🏼
The PR is still in draft, what do you think is missing to finalize it?
// May you share freely, never taking more than you give.
//
// This is an optimized Go port of the SQLite’s icuLikeCompare routine using backtracking.
// See https://sqlite.org/src/file?name=ext%2Ficu%2Ficu.c&ln=117-195&ci=54b54f02c66c5aea
Thanks for quoting the source 🙏🏼
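As a rough illustration of LIKE matching with backtracking, here is a minimal sketch. It is not the PR's `glob.MatchLike` and not a port of `icuLikeCompare`; the names `likeMatch` and `foldEq` are hypothetical, and it assumes `%` matches any run of runes, `_` matches exactly one rune, and there is no escape character.

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// foldEq reports whether two runes are equal under simple case folding.
func foldEq(a, b rune) bool {
	return a == b || unicode.ToLower(a) == unicode.ToLower(b)
}

// likeMatch reports whether s matches pattern, remembering the position
// after the most recent '%' so it can backtrack on a mismatch.
func likeMatch(pattern, s string) bool {
	backP, backS := -1, -1 // resume offsets after the last '%'
	p, i := 0, 0           // byte offsets into pattern and s
	for i < len(s) {
		if p < len(pattern) {
			pr, pw := utf8.DecodeRuneInString(pattern[p:])
			sr, sw := utf8.DecodeRuneInString(s[i:])
			switch {
			case pr == '%':
				// First try to let '%' match zero runes.
				p += pw
				backP, backS = p, i
				continue
			case pr == '_' || foldEq(pr, sr):
				p += pw
				i += sw
				continue
			}
		}
		// Mismatch or pattern exhausted: let the last '%' consume one more rune.
		if backP < 0 {
			return false
		}
		_, sw := utf8.DecodeRuneInString(s[backS:])
		backS += sw
		p, i = backP, backS
	}
	// Input exhausted: whatever remains of the pattern must be all '%'.
	for ; p < len(pattern); p++ {
		if pattern[p] != '%' {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(likeMatch("a%c", "ABC"))  // true
	fmt.Println(likeMatch("%b", "abcb"))  // true
	fmt.Println(likeMatch("abc", "abcd")) // false
}
```

Keeping only the latest `%` position is enough here because an earlier `%` never needs to be revisited once a later one has been reached.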
@tdakkota can you please mark outdated discussions as resolved?
@tdakkota Could you give us some input on what's missing?
Thanks!
Adds LIKE operator support.
The implementation is based on the CockroachDB implementation.
Updates #52.