Fix disallow Unicode bi-directional control characters #11393

Closed

straight-shoota
Member

This patch changes the lexer rules so that reading a bi-directional control character results in a syntax error.

Resolves #11392

String literals, symbol names and comments are no longer allowed to contain
bi-directional control characters, in order to prevent the Trojan Source
vulnerability.
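
For illustration only, here is a minimal standalone sketch of the kind of check described above. The names `bidi_control?` and `first_bidi_control_index` are hypothetical and do not appear in the patch; the actual change lives inside the compiler's lexer and raises a syntax error rather than returning an index. The nine code points are the ones listed in the diff.

```crystal
# Hypothetical helper (not part of the patch): true if `char` is one of the
# Unicode bi-directional control characters rejected by this PR.
def bidi_control?(char : Char) : Bool
  char.in?('\u202A', '\u202B', '\u202C', '\u202D', '\u202E', # LRE, RLE, PDF, LRO, RLO
           '\u2066', '\u2067', '\u2068', '\u2069')           # LRI, RLI, FSI, PDI
end

# Hypothetical helper (not part of the patch): returns the character index of
# the first bidi control character in `source`, or nil if there is none.
def first_bidi_control_index(source : String) : Int32?
  source.each_char_with_index do |char, index|
    return index if bidi_control?(char)
  end
  nil
end

# Usage: a plain string passes, a string containing U+202E is flagged.
first_bidi_control_index("puts 1")
first_bidi_control_index("# comment \u202E reversed")
```

A check like this treats all nine characters uniformly, which matches the diff; it says nothing about other invisible characters such as U+00A0.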
@straight-shoota straight-shoota added kind:bug A bug in the code. Does not apply to documentation, specs, etc. topic:compiler:parser security labels Nov 1, 2021
@@ -3078,6 +3076,9 @@ module Crystal
if error = @reader.error
::raise InvalidByteSequenceError.new("Unexpected byte 0x#{error.to_s(16)} at position #{@reader.pos}, malformed UTF-8")
end
if current_char.in?('\u202A', '\u202B', '\u202C', '\u202D', '\u202E', '\u2066', '\u2067', '\u2068', '\u2069')
Contributor

How do we handle nonbreaking space? (U+00A0) That also seems potentially confusing.

Member Author

Yes, non-breaking space and other invisible characters can be confusing. But at worst they're just invisible themselves. They don't manipulate the display of other visible characters the way bidi control characters do.
Of course, they should be invalid in most places as well, but I wouldn't necessarily go as far as disallowing them entirely. That could be an option, but it definitely needs more discussion, which should happen in #11216.

@beta-ziliani beta-ziliani modified the milestone: 1.2.2 Nov 4, 2021
@beta-ziliani
Member

After discussing it within the core team, we decided not to jump the gun on this and to give ourselves time to find better solutions. Thus, I'm removing the 1.2.2 milestone.

@Sija
Contributor

Sija commented Nov 4, 2021

@beta-ziliani What are the downsides of merging it for the next release?

@beta-ziliani beta-ziliani added the DON'T MERGE Don't merge yet! This needs further discussion. label Jan 28, 2022
@straight-shoota
Member Author

I'm closing this. It's unlikely to get merged. The general issue is still tracked in #11392.

I've extracted a small refactoring into #12590.

Labels
DON'T MERGE Don't merge yet! This needs further discussion. kind:bug A bug in the code. Does not apply to documentation, specs, etc. security topic:compiler:parser
Development

Successfully merging this pull request may close these issues.

Parser vulnerable to Trojan Source attack
5 participants