fix: hash for Unicode will call write_u8 like Ascii does #73

Merged 1 commit on Dec 24, 2024


seanmonstar
Owner

@conradludgate

This still fails on

let k1 = UniCase::new("Maße");
let k2 = UniCase::ascii("maße");

But I will admit that this seems like a misuse of unicase in that situation, rather than being a reliable behaviour.

@seanmonstar
Owner Author

Crazy personal opinion: that probably should yell at you and die.

@marvin-j97

To make it worse, "Maße" means something like "measurements", while "Masse" means "mass" (as in weight or group of things/people), so they aren't even the same word.

@seanmonstar
Owner Author

seanmonstar commented Dec 23, 2024

Even words that are spelled exactly the same can have different meanings based on context. This crate is not trying to say the words mean the same thing. It's saying that according to the Unicode Case Folding algorithm, they are equivalent for matching purposes.
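To make "equivalent for matching purposes" concrete, here is a rough std-only sketch. It assumes Rust's full uppercase mapping as a stand-in for case folding, which is not the same algorithm the crate uses but happens to agree for this pair of words. The helper names `loosely_equal` and `uppercase_demo` are hypothetical.

```rust
/// Rough stand-in for caseless matching using std's full uppercase mapping.
/// Assumption: this is NOT Unicode Case Folding; it only agrees with it
/// for the words used in this example.
fn loosely_equal(a: &str, b: &str) -> bool {
    a.to_uppercase() == b.to_uppercase()
}

fn uppercase_demo() {
    // 'ß' expands to "SS" under full uppercase mapping, so both sides
    // become "MASSE" and compare equal.
    assert!(loosely_equal("Maße", "Masse"));

    // An ASCII-only comparison never touches 'ß', so the same pair differs.
    assert!(!"Maße".eq_ignore_ascii_case("Masse"));
}
```

The point is only that the comparison rule, not the meaning of the words, decides whether two keys match.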

@conradludgate

To put it more formally: it's a logic bug to use UniCase::ascii with non-ASCII text, so any inconsistencies in that case are the fault of the user and not the library. LGTM
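The contract at stake can be sketched with a small std-only example: keys that compare equal must also hash equally. `AsciiKey`, `hash_of`, and `ascii_key_demo` are hypothetical names for illustration and are not unicase's actual types; the sketch only assumes an ASCII-only key that lowercases each byte and feeds it to `write_u8`.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical ASCII-only case-insensitive key (not unicase's real type).
struct AsciiKey<'a>(&'a str);

impl PartialEq for AsciiKey<'_> {
    fn eq(&self, other: &Self) -> bool {
        self.0.eq_ignore_ascii_case(other.0)
    }
}
impl Eq for AsciiKey<'_> {}

impl Hash for AsciiKey<'_> {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Hash one ASCII-lowercased byte at a time.
        for b in self.0.bytes() {
            state.write_u8(b.to_ascii_lowercase());
        }
    }
}

/// Hash a value with a freshly constructed default hasher.
fn hash_of<T: Hash>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

fn ascii_key_demo() {
    let (a, b) = (AsciiKey("Maße"), AsciiKey("maße"));
    // Within one scheme, equal keys hash equally.
    assert!(a == b);
    assert_eq!(hash_of(&a), hash_of(&b));

    // ASCII lowercasing leaves 'ß' alone, so the bytes being hashed can
    // never match a scheme that folds 'ß' to "ss".
    assert_eq!("Maße".to_ascii_lowercase(), "maße");
    assert_ne!("Maße".to_ascii_lowercase().as_bytes(), "masse".as_bytes());
}
```

Under these assumptions, a key built with ASCII-only rules from non-ASCII text feeds different bytes to the hasher than a Unicode-folded key would, which is why mixing the two constructors on such text can give inconsistent lookups.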

@seanmonstar seanmonstar merged commit de5ebf0 into master Dec 24, 2024
5 checks passed
@seanmonstar seanmonstar deleted the unicode-hash-each-byte branch December 24, 2024 14:06

Successfully merging this pull request may close these issues.

incorrect hash impl
3 participants