Fix disallow Unicode bi-directional control characters #11393
String literals, symbol names and comments are no longer allowed to contain bi-directional control characters, in order to prevent Trojan Source vulnerabilities.
@@ -3078,6 +3076,9 @@ module Crystal
if error = @reader.error
::raise InvalidByteSequenceError.new("Unexpected byte 0x#{error.to_s(16)} at position #{@reader.pos}, malformed UTF-8")
end
if current_char.in?('\u202A', '\u202B', '\u202C', '\u202D', '\u202E', '\u2066', '\u2067', '\u2068', '\u2069')
How do we handle nonbreaking space? (U+00A0) That also seems potentially confusing.
Yes, non-breaking space and other invisible characters can be confusing. But at worst they're just invisible by themselves. They don't manipulate the display of other visible characters the way bidi control characters do.
Of course, they should be invalid in most places as well, but I wouldn't necessarily go as far as disallowing them entirely. That could be an option, but this definitely needs more discussion, which should happen in #11216.
After discussing it within the core team, we decided not to jump the gun on this and to give ourselves time to find better solutions. Thus, I'm removing the 1.2.2 milestone.
@beta-ziliani What are the downsides of merging it for the next release?
This patch changes the lexer rules such that reading a bi-directional control character results in a syntax error.
Resolves #11392
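For illustration, the check described above can be sketched outside the compiler as a standalone scan over source text. This is a minimal sketch, not the lexer code from this patch: the constant `BIDI_CONTROL_CHARS` and the helper `first_bidi_control_index` are hypothetical names, and only the list of code points is taken from the diff.

```crystal
# Hypothetical standalone helper, not the compiler's lexer code.
# Code points taken from the diff in this PR:
# U+202A LRE, U+202B RLE, U+202C PDF, U+202D LRO, U+202E RLO,
# U+2066 LRI, U+2067 RLI, U+2068 FSI, U+2069 PDI.
BIDI_CONTROL_CHARS = {'\u202A', '\u202B', '\u202C', '\u202D', '\u202E', '\u2066', '\u2067', '\u2068', '\u2069'}

# Returns the character index of the first bi-directional control
# character in `source`, or nil if there is none.
def first_bidi_control_index(source : String) : Int32?
  source.each_char_with_index do |char, index|
    return index if BIDI_CONTROL_CHARS.includes?(char)
  end
  nil
end

# The escape sequence \u202E keeps this example itself free of
# literal control characters.
clean = "puts \"hello\""
tainted = "puts \"hel\u202Elo\""

p first_bidi_control_index(clean)   # => nil
p first_bidi_control_index(tainted) # => 9
```

A scan like this only reports the position. The patch itself raises a syntax error from inside the lexer as soon as such a character is read.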