fix: Support arbitrary escapes until 0.4.25 (#923)
See #896 (comment)

Arbitrary here means any escape that is neither `\u` (Unicode escape
sequence) nor `\x` (hex escape sequence). Before 0.4.25, arbitrary
escapes like `\a` are allowed in a string and evaluate to the escaped
character itself (`a`); from 0.4.25 onward they are a parse error in
solc. Known/valid escapes like `\n` should still work correctly across
all versions.
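The behaviour described above can be sketched in Rust roughly as follows. This is an illustration only, not code from solc or Slang: `Escape` and `eval_escape` are hypothetical names, the list of known escapes is a deliberately small assumed subset, and `\x` / `\u` escapes are left out because the commit message treats them separately.

```rust
/// Result of evaluating the character that follows a backslash.
/// Hypothetical type, for illustration only.
#[derive(Debug, PartialEq)]
enum Escape {
    /// The escape is accepted and evaluates to this character.
    Char(char),
    /// The escape is rejected by the parser.
    ParseError,
}

/// Evaluates a single-character escape such as `\n` or `\a`.
///
/// `arbitrary_allowed` stands for "the language version is below 0.4.25".
/// The known escapes listed here are an assumed, incomplete subset.
fn eval_escape(c: char, arbitrary_allowed: bool) -> Escape {
    match c {
        // Known escapes evaluate the same way in every version.
        'n' => Escape::Char('\n'),
        'r' => Escape::Char('\r'),
        't' => Escape::Char('\t'),
        '\\' | '\'' | '"' => Escape::Char(c),
        // Arbitrary escapes: accepted before 0.4.25, evaluating to the
        // escaped character itself (`\a` becomes `a`).
        _ if arbitrary_allowed => Escape::Char(c),
        // From 0.4.25 onward the same input is a parse error.
        _ => Escape::ParseError,
    }
}
```

Under these assumptions, `eval_escape('a', true)` yields `Escape::Char('a')`, `eval_escape('a', false)` yields `Escape::ParseError`, and `eval_escape('n', _)` yields a newline regardless of the flag.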
Xanewok authored Apr 10, 2024
1 parent 2d698eb commit bb30fc1
Showing 24 changed files with 300 additions and 13 deletions.
5 changes: 5 additions & 0 deletions .changeset/poor-lemons-hammer.md
@@ -0,0 +1,5 @@
+---
+"@nomicfoundation/slang": patch
+---
+
+Support arbitrary ASCII escape sequences in string literals until 0.4.25
46 changes: 42 additions & 4 deletions crates/solidity/inputs/language/src/definition.rs
@@ -3854,9 +3854,21 @@ codegen_language_macros::compile!(Language(
 Token(
     name = SingleQuotedStringLiteral,
     definitions = [
-        // Allows unicode characters:
+        // Allows unicode characters and arbitrary escape sequences:
         TokenDefinition(
-            enabled = Till("0.7.0"),
+            enabled = Till("0.4.25"),
+            scanner = Sequence([
+                Atom("'"),
+                ZeroOrMore(Choice([
+                    Fragment(EscapeSequenceArbitrary),
+                    Not(['\'', '\\', '\r', '\n'])
+                ])),
+                Atom("'")
+            ])
+        ),
+        // Allows unicode characters but allows only known ASCII escape sequences:
+        TokenDefinition(
+            enabled = Range(from = "0.4.25", till = "0.7.0"),
             scanner = Sequence([
                 Atom("'"),
                 ZeroOrMore(Choice([
@@ -3884,9 +3896,21 @@ codegen_language_macros::compile!(Language(
 Token(
     name = DoubleQuotedStringLiteral,
     definitions = [
-        // Allows unicode characters:
+        // Allows unicode characters and arbitrary escape sequences:
         TokenDefinition(
-            enabled = Till("0.7.0"),
+            enabled = Till("0.4.25"),
+            scanner = Sequence([
+                Atom("\""),
+                ZeroOrMore(Choice([
+                    Fragment(EscapeSequenceArbitrary),
+                    Not(['"', '\\', '\r', '\n'])
+                ])),
+                Atom("\"")
+            ])
+        ),
+        // Allows unicode characters but allows only known ASCII escape sequences:
+        TokenDefinition(
+            enabled = Range(from = "0.4.25", till = "0.7.0"),
             scanner = Sequence([
                 Atom("\""),
                 ZeroOrMore(Choice([
@@ -4015,6 +4039,20 @@ codegen_language_macros::compile!(Language(
         ])
     ])
 ),
+Fragment(
+    name = EscapeSequenceArbitrary,
+    enabled = Till("0.4.25"),
+    scanner = Sequence([
+        Atom("\\"),
+        Choice([
+            // Prior to 0.4.25, it was legal to "escape" any character (incl. unicode),
+            // however only the ones from `AsciiEscape` were escaped in practice.
+            Not(['x', 'u']),
+            Fragment(HexByteEscape),
+            Fragment(UnicodeEscape)
+        ])
+    ])
+),
 Fragment(
     name = AsciiEscape,
     scanner = Choice([
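To make the shape of the `EscapeSequenceArbitrary` scanner easier to follow, here is a hand-written Rust sketch of what it might accept. It is not generated by or taken from Slang: `match_escape_arbitrary` is a hypothetical helper, and because `HexByteEscape` and `UnicodeEscape` are not shown in this diff, the sketch assumes they are `x` followed by two hex digits and `u` followed by four hex digits respectively.

```rust
/// Tries to match one escape sequence at the start of `input`, mirroring
/// the `Sequence([Atom("\\"), Choice([...])])` shape of the fragment.
/// Returns the number of characters consumed, or `None` if nothing matches.
///
/// Assumption: hex escapes are `\xHH` and Unicode escapes are `\uHHHH`.
fn match_escape_arbitrary(input: &str) -> Option<usize> {
    let mut chars = input.chars();

    // Atom("\\"): the sequence must start with a backslash.
    if chars.next()? != '\\' {
        return None;
    }

    // Checks that exactly `n` hex digits follow the two-character prefix.
    let hex_digits = |n: usize| {
        let rest: Vec<char> = input.chars().skip(2).take(n).collect();
        rest.len() == n && rest.iter().all(|c| c.is_ascii_hexdigit())
    };

    match chars.next()? {
        // Fragment(HexByteEscape), under the assumption stated above.
        'x' => hex_digits(2).then_some(4),
        // Fragment(UnicodeEscape), under the assumption stated above.
        'u' => hex_digits(4).then_some(6),
        // Not(['x', 'u']): any other single character, including non-ASCII.
        _ => Some(2),
    }
}
```

With this sketch, `\a` and `\é` both match as two-character escapes, while `\xZZ` and `\u12` do not match at all, since `x` and `u` are excluded from the arbitrary branch and must form a complete hex or Unicode escape.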

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
13 changes: 11 additions & 2 deletions crates/solidity/outputs/spec/generated/grammar.ebnf