\pC is not accepted #466
Comments
Interesting! There's probably a silly bug somewhere, since that alias is in the Unicode alias table. Other aliases, like |
This commit fixes a bug where 'isc' was canonicalized to 'c'. 'isc' is an alias for 'ISO_Comment', but the 'is' prefix was being dropped since canonicalization permits ignoring 'is' prefixes when designating property names. This is the root cause of a bug in the regex library: rust-lang/regex#466
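To illustrate the collision described in that commit message, here is a minimal sketch of loose symbolic-name normalization with the 'isc' special case. It is not the library's actual code: the function name `normalize_symbolic_name` is hypothetical, and the rule of leaving a bare "is" untouched is an assumption made for the sketch.

```rust
/// Hypothetical helper (not the library's real API): loosely normalizes a
/// Unicode property name or value so that aliases compare equal.
fn normalize_symbolic_name(name: &str) -> String {
    // Ignore case, whitespace, underscores, and hyphens.
    let cleaned: String = name
        .chars()
        .filter(|&c| !(c == '_' || c == '-' || c.is_whitespace()))
        .map(|c| c.to_ascii_lowercase())
        .collect();

    // Loose matching permits ignoring a leading "is" ...
    if let Some(rest) = cleaned.strip_prefix("is") {
        // ... but "isc" is itself an alias (for ISO_Comment), so stripping
        // the prefix there would wrongly collapse it to "c".
        // Assumption for this sketch: a bare "is" is also left untouched.
        if !rest.is_empty() && rest != "c" {
            return rest.to_string();
        }
    }
    cleaned
}
```

With this special case, `normalize_symbolic_name("isc")` stays `"isc"` while `normalize_symbolic_name("C")` yields `"c"`, so the two names remain distinct keys in an alias lookup.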
....wow. That was, indeed, a silly bug.
Should we write the Unicode Consortium and propose a revision to this text to mention
According to my reading of the spec, the regex First, resolve That the alias

[TANGENT] Actually, what's your expected policy for tracking new Unicode standards as they release? Things like |
I'm not sure what the right way to interpret the Unicode standard is for this particular case. I certainly do not have the bandwidth to follow up with Unicode proper. :-)
My plan is to just update all of the tables. I don't think we've generally considered updates to Unicode to be a breaking change in the Rust ecosystem.
Long form \p{Other} is accepted. C is not what I think of for General_Category=Other, but it's the official short value. Since regex claims to support the short values for the General_Category sets, \pC should probably also be supported.