It seems like otherwise-portable lexers in the ANTLR community frequently use something like this (from LexBasic.g4):
// covers all characters above 0xFF which are not a surrogate
// and UTF-16 surrogate pairs encodings for U+10000 to U+10FFFF
fragment JavaUnicodeChars
: ~[\u0000-\u00FF\uD800-\uDBFF] {Character.isJavaIdentifierPart(_input.LA(-1))}?
| [\uD800-\uDBFF] [\uDC00-\uDFFF] {Character.isJavaIdentifierPart(Character.toCodePoint((char)_input.LA(-2), (char)_input.LA(-1)))}?
;
Unfortunately this generates code that doesn't work in antlr4ts. There are three problems:
1. References to _input need the this. prefix (this._input).
2. The class Character doesn't exist in ECMAScript/TypeScript.
3. The cast (char) doesn't work in ECMAScript/TypeScript.
Issue #1 can be fixed simply with a local variable _input in the code emitted for semantic predicates.
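A minimal sketch of what such emitted code could look like. Everything here is an assumption made for illustration: CharStreamLike, ArrayStream and SketchLexer are hypothetical stand-ins, not antlr4ts classes, and the real generated method has a different signature. The point is only that a local alias lets the grammar author's unprefixed `_input.LA(-1)` compile unchanged.

```typescript
// Hypothetical stand-in for the runtime's character stream; only LA() is modeled.
interface CharStreamLike {
  LA(offset: number): number;
}

// Hypothetical array-backed stream used to exercise the sketch.
class ArrayStream implements CharStreamLike {
  private pos = 0; // index of the next unread symbol

  constructor(private readonly units: number[]) {}

  consume(): void {
    this.pos++;
  }

  LA(offset: number): number {
    // Negative offsets look back at consumed symbols, positive ones look ahead.
    const i = offset < 0 ? this.pos + offset : this.pos + offset - 1;
    return i >= 0 && i < this.units.length ? this.units[i] : -1;
  }
}

// Hypothetical shape of a generated lexer with one semantic predicate.
class SketchLexer {
  constructor(protected _input: CharStreamLike) {}

  sempred(): boolean {
    const _input = this._input; // local alias the generator would emit
    return _input.LA(-1) > 0xFF; // predicate text copied from the grammar as-is
  }
}
```

The alias costs one line per predicate method and leaves the predicate text itself untouched.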
Issue #2 can be fixed by defining a Character class with isJavaIdentifierPart (and toCodePoint, which the second alternative also calls) in the antlr4ts runtime.
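A rough sketch of such a class, assuming it may approximate Java's behavior with Unicode general categories. This is a hypothetical shim, not an existing antlr4ts API, and its results can differ from the JVM's wherever the two sides ship different Unicode versions.

```typescript
// Hypothetical shim, not an existing antlr4ts class.
// Unicode property escapes need a runtime with ES2018 regex support; the
// RegExp constructor is used so older compile targets still accept the file.
const JAVA_IDENTIFIER_PART = new RegExp(
  "^[\\p{L}\\p{Nl}\\p{Nd}\\p{Mn}\\p{Mc}\\p{Sc}\\p{Pc}\\p{Cf}]$",
  "u"
);

// Encode a code point as a UTF-16 string without relying on String.fromCodePoint.
function codePointToString(cp: number): string {
  if (cp <= 0xFFFF) {
    return String.fromCharCode(cp);
  }
  const offset = cp - 0x10000;
  return String.fromCharCode(0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF));
}

class Character {
  // Same arithmetic as java.lang.Character.toCodePoint(char high, char low).
  static toCodePoint(high: number, low: number): number {
    return ((high - 0xD800) << 10) + (low - 0xDC00) + 0x10000;
  }

  // Approximation of java.lang.Character.isJavaIdentifierPart(int).
  static isJavaIdentifierPart(cp: number): boolean {
    if (cp < 0 || cp > 0x10FFFF) {
      return false;
    }
    // Ignorable control characters that Java also accepts.
    if (cp <= 0x08 || (cp >= 0x0E && cp <= 0x1B) || (cp >= 0x7F && cp <= 0x9F)) {
      return true;
    }
    return JAVA_IDENTIFIER_PART.test(codePointToString(cp));
  }
}
```

With this in scope, `Character.isJavaIdentifierPart(_input.LA(-1))` from the first alternative would resolve without edits; the second alternative still fails on the (char) casts.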
Thus a lowly cast is the remaining issue. Unfortunately, the problem is not limited to the lack of a char type in ECMAScript; the syntax of casts in TypeScript is also different!
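To make the difference concrete, here is a small comparison. The helper name toChar is hypothetical. Note that a TypeScript type assertion only changes the static type, whereas Java's (char) also narrows the value to 16 bits.

```typescript
// Java:        (char) _input.LA(-2)       converts int to a 16-bit char
// TypeScript:  <number>x  or  x as number  only changes the static type

// Hypothetical helper emulating Java's narrowing int-to-char conversion.
function toChar(value: number): number {
  return value & 0xFFFF;
}

const codePoint: number = 0x1D400;
const asserted = <number>codePoint;     // angle-bracket assertion (not usable in .tsx files)
const assertedAs = codePoint as number; // equivalent "as" form
const narrowed = toChar(codePoint);     // 0xD400: high bits dropped, as Java's (char) would
```

For the predicate in question the narrowing is harmless, since LA(-1) and LA(-2) already return UTF-16 code units in the alternative that uses the casts.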
One possible solution for #3 might be to have the code generation tool apply some simple transforms to any code it encounters. A simple regex s/\(char\)/<number>/ might do a world of good.
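As a sketch only, the substitution could look like this when applied to the predicate text. The function name rewriteCharCasts is hypothetical, and the ANTLR tool performs no such rewrite today.

```typescript
// Hypothetical post-processing step applied to action/predicate text.
function rewriteCharCasts(action: string): string {
  // Global form of s/\(char\)/<number>/
  return action.replace(/\(char\)/g, "<number>");
}

const javaPredicate =
  "Character.isJavaIdentifierPart(Character.toCodePoint((char)_input.LA(-2), (char)_input.LA(-1)))";
const rewritten = rewriteCharCasts(javaPredicate);
// rewritten is:
// Character.isJavaIdentifierPart(Character.toCodePoint(<number>_input.LA(-2), <number>_input.LA(-1)))
```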
This should either be implemented as a target-language-agnostic DSL for grammar predicates/actions, or not be implemented at all. The proposed implementation introduces risk of breaking actions written in the correct target language, and may or may not work for any given action (a maintenance and usability nightmare).
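One way to see the risk, as a hedged illustration: apply the regex proposed earlier in the thread to a predicate that is already valid TypeScript. The method names isVowel and isDigit and the variable char are made up for this example.

```typescript
// The textual substitution proposed earlier in the thread.
function applyProposedRewrite(action: string): string {
  return action.replace(/\(char\)/g, "<number>");
}

// A predicate already written in TypeScript, using a variable named `char`.
const validTypeScript = "this.isVowel(char) || this.isDigit(char)";
const broken = applyProposedRewrite(validTypeScript);
// broken is: this.isVowel<number> || this.isDigit<number>
```

The call arguments have been replaced by type arguments, so a correct action no longer means what its author wrote.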
The current implementation doesn't support rewriting actions automatically into the target language, but at least the behavior is consistent across all the targets. 😄