-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fuzzystrmatch: fix crash and compatibility #82761
Conversation
Release note (bug fix): fix behavior of soundex function when passed certain unicode inputs. Previously, certain unicode inputs could result in crashes, errors or incorrect outputs.
cc @mgoddard |
@@ -39,12 +42,25 @@ func TestSoundex(t *testing.T) { | |||
}, | |||
{ | |||
Source: "🌞", | |||
Expected: "000", | |||
Expected: "", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why did this one change? it looks like PG returns 000 for it (binary d0303030
)
}, | ||
{ | ||
Source: "😄 🐃 🐯 🕣 💲 🏜 👞 🔠 🌟 📌", | ||
Expected: "", | ||
}, | ||
{ | ||
Source: "zażółćx", | ||
Expected: "Z200", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PG returns just Z
i think? (binary 5ac5c3c5
)
}, | ||
{ | ||
Source: "K😋", | ||
Expected: "K000", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PG returns K00
(binary 4bd03030
)
The differences you wrote don't materialize on the system I was testing on... what version are you testing on? |
(and i captured the results using wireshark) |
Here's what I get:
It sounds like this functionality just doesn't work consistently for UTF8: |
BTW, I get this same output even in Postgres 12 (I upgraded to Postgres 14 to see if it would repro on your version) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah i see thanks for looking. lgtm then
bors r+ |
Build succeeded: |
Closes #82640
Release note (bug fix): fix behavior of soundex function when passed
certain unicode inputs. Previously, certain unicode inputs could result
in crashes, errors or incorrect outputs.