-
-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Lexer: use Crystal::Token::Kind
enum instead of symbols
#11616
Lexer: use Crystal::Token::Kind
enum instead of symbols
#11616
Conversation
src/compiler/crystal/syntax/token.cr
Outdated
{% for member in @type.constants %} | ||
when {{ @type.constant(member) }} | ||
{% if member.starts_with?("OP_") %} | ||
{% parts = member.split("_") %} | ||
{{ parts.map { |ch| operator1[ch] || "" }.join("") }} | ||
{% elsif member.starts_with?("MAGIC_") %} | ||
{{ "__#{member[6..-1].id}__" }} | ||
{% else %} | ||
{{ member.stringify }} | ||
{% end %} | ||
{% end %} | ||
else | ||
value.to_s | ||
end | ||
{% end %} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like this generative implementation. I would prefer to spell everything out.
This is much more complex and harder to understand than a simple list of when OP_CARET then "^"
. It ties constant names closely to their string representation which reduces flexibility. I'd like more flexibility and to consider some improvements to operator names.
And in the end it only saves a couple of LOC, where the saved LOCs would be super trivial.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like ¯\_(ツ)_/¯
The result is compile-time and can be inspected via {% debug %}
if wanted.
I'd like more flexibility and to consider some improvements to operator names.
I even think that operator names shouldn't be "clever". This approach ensures that there is no ambiguity or bikeshedding (beyond the names of each character, anyway)
{% debug %}
case self
in EOF
"EOF"
in SPACE
"SPACE"
in NEWLINE
"NEWLINE"
in IDENT
"IDENT"
in CONST
"CONST"
in INSTANCE_VAR
"INSTANCE_VAR"
in CLASS_VAR
"CLASS_VAR"
in CHAR
"CHAR"
in STRING
"STRING"
in SYMBOL
"SYMBOL"
in NUMBER
"NUMBER"
in UNDERSCORE
"UNDERSCORE"
in COMMENT
"COMMENT"
in DELIMITER_START
"DELIMITER_START"
in DELIMITER_END
"DELIMITER_END"
in STRING_ARRAY_START
"STRING_ARRAY_START"
in INTERPOLATION_START
"INTERPOLATION_START"
in SYMBOL_ARRAY_START
"SYMBOL_ARRAY_START"
in STRING_ARRAY_END
"STRING_ARRAY_END"
in GLOBAL
"GLOBAL"
in GLOBAL_MATCH_DATA_INDEX
"GLOBAL_MATCH_DATA_INDEX"
in MAGIC_DIR
"__DIR__"
in MAGIC_END_LINE
"__END_LINE__"
in MAGIC_FILE
"__FILE__"
in MAGIC_LINE
"__LINE__"
in MACRO_LITERAL
"MACRO_LITERAL"
in MACRO_EXPRESSION_START
"MACRO_EXPRESSION_START"
in MACRO_CONTROL_START
"MACRO_CONTROL_START"
in MACRO_VAR
"MACRO_VAR"
in MACRO_END
"MACRO_END"
in OP_BANG
"!"
in OP_BANG_EQ
"!="
in OP_BANG_TILDE
"!~"
in OP_DOLLAR_QUESTION
"$?"
in OP_DOLLAR_TILDE
"$~"
in OP_PERCENT
"%"
in OP_PERCENT_EQ
"%="
in OP_PERCENT_CURLYR
"%}"
in OP_AMP
"&"
in OP_AMP_AMP
"&&"
in OP_AMP_AMP_EQ
"&&="
in OP_AMP_STAR
"&*"
in OP_AMP_STAR_STAR
"&**"
in OP_AMP_STAR_EQ
"&*="
in OP_AMP_PLUS
"&+"
in OP_AMP_PLUS_EQ
"&+="
in OP_AMP_MINUS
"&-"
in OP_AMP_MINUS_EQ
"&-="
in OP_AMP_EQ
"&="
in OP_PARENL
"("
in OP_PARENR
")"
in OP_STAR
"*"
in OP_STAR_STAR
"**"
in OP_STAR_STAR_EQ
"**="
in OP_STAR_EQ
"*="
in OP_PLUS
"+"
in OP_PLUS_EQ
"+="
in OP_COMMA
","
in OP_MINUS
"-"
in OP_MINUS_EQ
"-="
in OP_MINUS_GT
"->"
in OP_PERIOD
"."
in OP_PERIOD_PERIOD
".."
in OP_PERIOD_PERIOD_PERIOD
"..."
in OP_SLASH
"/"
in OP_SLASH_SLASH
"//"
in OP_SLASH_SLASH_EQ
"//="
in OP_SLASH_EQ
"/="
in OP_COLON
":"
in OP_COLON_COLON
"::"
in OP_SEMICOLON
";"
in OP_LT
"<"
in OP_LT_LT
"<<"
in OP_LT_LT_EQ
"<<="
in OP_LT_EQ
"<="
in OP_LT_EQ_GT
"<=>"
in OP_EQ
"="
in OP_EQ_EQ
"=="
in OP_EQ_EQ_EQ
"==="
in OP_EQ_GT
"=>"
in OP_EQ_TILDE
"=~"
in OP_GT
">"
in OP_GT_EQ
">="
in OP_GT_GT
">>"
in OP_GT_GT_EQ
">>="
in OP_QUESTION
"?"
in OP_AT_SQUAREL
"@["
in OP_SQUAREL
"["
in OP_SQUAREL_SQUARER
"[]"
in OP_SQUAREL_SQUARER_EQ
"[]="
in OP_SQUAREL_SQUARER_QUESTION
"[]?"
in OP_SQUARER
"]"
in OP_CARET
"^"
in OP_CARET_EQ
"^="
in OP_GRAVE
"`"
in OP_CURLYL
"{"
in OP_CURLYL_PERCENT
"{%"
in OP_CURLYL_CURLYL
"{{"
in OP_BAR
"|"
in OP_BAR_EQ
"|="
in OP_BAR_BAR
"||"
in OP_BAR_BAR_EQ
"||="
in OP_CURLYR
"}"
in OP_TILDE
"~"
end
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am on board with this change. Only left small comments.
Just making sure to cc @asterite here: what is your opinion, does this make it less fun to hack on the compiler?
src/compiler/crystal/syntax/token.cr
Outdated
{% for member in @type.constants %} | ||
when {{ @type.constant(member) }} | ||
{% if member.starts_with?("OP_") %} | ||
{% parts = member.split("_") %} | ||
{{ parts.map { |ch| operator1[ch] || "" }.join("") }} | ||
{% elsif member.starts_with?("MAGIC_") %} | ||
{{ "__#{member[6..-1].id}__" }} | ||
{% else %} | ||
{{ member.stringify }} | ||
{% end %} | ||
{% end %} | ||
else | ||
value.to_s | ||
end | ||
{% end %} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like ¯\_(ツ)_/¯
The result is compile-time and can be inspected via {% debug %}
if wanted.
I'd like more flexibility and to consider some improvements to operator names.
I even think that operator names shouldn't be "clever". This approach ensures that there is no ambiguity or bikeshedding (beyond the names of each character, anyway)
{% debug %}
case self
in EOF
"EOF"
in SPACE
"SPACE"
in NEWLINE
"NEWLINE"
in IDENT
"IDENT"
in CONST
"CONST"
in INSTANCE_VAR
"INSTANCE_VAR"
in CLASS_VAR
"CLASS_VAR"
in CHAR
"CHAR"
in STRING
"STRING"
in SYMBOL
"SYMBOL"
in NUMBER
"NUMBER"
in UNDERSCORE
"UNDERSCORE"
in COMMENT
"COMMENT"
in DELIMITER_START
"DELIMITER_START"
in DELIMITER_END
"DELIMITER_END"
in STRING_ARRAY_START
"STRING_ARRAY_START"
in INTERPOLATION_START
"INTERPOLATION_START"
in SYMBOL_ARRAY_START
"SYMBOL_ARRAY_START"
in STRING_ARRAY_END
"STRING_ARRAY_END"
in GLOBAL
"GLOBAL"
in GLOBAL_MATCH_DATA_INDEX
"GLOBAL_MATCH_DATA_INDEX"
in MAGIC_DIR
"__DIR__"
in MAGIC_END_LINE
"__END_LINE__"
in MAGIC_FILE
"__FILE__"
in MAGIC_LINE
"__LINE__"
in MACRO_LITERAL
"MACRO_LITERAL"
in MACRO_EXPRESSION_START
"MACRO_EXPRESSION_START"
in MACRO_CONTROL_START
"MACRO_CONTROL_START"
in MACRO_VAR
"MACRO_VAR"
in MACRO_END
"MACRO_END"
in OP_BANG
"!"
in OP_BANG_EQ
"!="
in OP_BANG_TILDE
"!~"
in OP_DOLLAR_QUESTION
"$?"
in OP_DOLLAR_TILDE
"$~"
in OP_PERCENT
"%"
in OP_PERCENT_EQ
"%="
in OP_PERCENT_CURLYR
"%}"
in OP_AMP
"&"
in OP_AMP_AMP
"&&"
in OP_AMP_AMP_EQ
"&&="
in OP_AMP_STAR
"&*"
in OP_AMP_STAR_STAR
"&**"
in OP_AMP_STAR_EQ
"&*="
in OP_AMP_PLUS
"&+"
in OP_AMP_PLUS_EQ
"&+="
in OP_AMP_MINUS
"&-"
in OP_AMP_MINUS_EQ
"&-="
in OP_AMP_EQ
"&="
in OP_PARENL
"("
in OP_PARENR
")"
in OP_STAR
"*"
in OP_STAR_STAR
"**"
in OP_STAR_STAR_EQ
"**="
in OP_STAR_EQ
"*="
in OP_PLUS
"+"
in OP_PLUS_EQ
"+="
in OP_COMMA
","
in OP_MINUS
"-"
in OP_MINUS_EQ
"-="
in OP_MINUS_GT
"->"
in OP_PERIOD
"."
in OP_PERIOD_PERIOD
".."
in OP_PERIOD_PERIOD_PERIOD
"..."
in OP_SLASH
"/"
in OP_SLASH_SLASH
"//"
in OP_SLASH_SLASH_EQ
"//="
in OP_SLASH_EQ
"/="
in OP_COLON
":"
in OP_COLON_COLON
"::"
in OP_SEMICOLON
";"
in OP_LT
"<"
in OP_LT_LT
"<<"
in OP_LT_LT_EQ
"<<="
in OP_LT_EQ
"<="
in OP_LT_EQ_GT
"<=>"
in OP_EQ
"="
in OP_EQ_EQ
"=="
in OP_EQ_EQ_EQ
"==="
in OP_EQ_GT
"=>"
in OP_EQ_TILDE
"=~"
in OP_GT
">"
in OP_GT_EQ
">="
in OP_GT_GT
">>"
in OP_GT_GT_EQ
">>="
in OP_QUESTION
"?"
in OP_AT_SQUAREL
"@["
in OP_SQUAREL
"["
in OP_SQUAREL_SQUARER
"[]"
in OP_SQUAREL_SQUARER_EQ
"[]="
in OP_SQUAREL_SQUARER_QUESTION
"[]?"
in OP_SQUARER
"]"
in OP_CARET
"^"
in OP_CARET_EQ
"^="
in OP_GRAVE
"`"
in OP_CURLYL
"{"
in OP_CURLYL_PERCENT
"{%"
in OP_CURLYL_CURLYL
"{{"
in OP_BAR
"|"
in OP_BAR_EQ
"|="
in OP_BAR_BAR
"||"
in OP_BAR_BAR_EQ
"||="
in OP_CURLYR
"}"
in OP_TILDE
"~"
end
@oprypin Thanks for checking! I'm all on board with replacing symbols with enums. Ideally Symbol won't exist in Crystal 2.0, but that's just my wish. Symbols have no use cases, one can use enums or strings in every case where a symbol could be used. |
Hm, for some reason this ends up including the playground server in |
That is introduced here: b0d3c7a |
There's |
This PR removes one of the biggest uses of symbol variables in the compiler.
The symbol solution is attractive, but ultimately I believe it is better if we can see all possible token kinds at a glance. We also gain the ability to define instance methods on
Crystal::Token::Kind
, e.g.#assignment_operator?
, as opposed to global methods withSymbol
arguments.