This repository has been archived by the owner on Jun 3, 2021. It is now read-only.
[ICE] unput doesn't play well with consume_whitespace #361
Labels
fuzz: Found via fuzz testing
ICE: Internal Compiler Error (panic)
lexer: Issue dealing with parsing the lexical tokens of a program
preprocessor: Issue in the preprocessor (probably cycle detection)
Code
Note: the error message is from a local copy; I plan to make a PR soon.
Merged in #362
Expected behavior
The lexer should output the tokens Hash, Id("i"), Str(""), Slash, and Id("b"). Additionally, seen_line_token should be set appropriately at all times.
This is hard because I only want to peek ahead 2 characters, but at the same time the preprocessor needs to know where newlines occur. I think the real fix will be to implement \n as a token (#356).
Extended description (from discord):
The lexer is streaming: it looks at one byte at a time.
Sometimes it needs to look at multiple bytes, but it doesn't have a buffer to store them in, so instead it uses current for one byte ahead and lookahead for 2 bytes ahead.
The issue is that it's trying to remember 3 bytes when it only has space for 2.
The culprit is consume_whitespace: it calls peek_next when it sees /, which sets lookahead.
The preprocessor cares about newlines, even though the lexer doesn't, so the lexer still needs to keep track of '\n', which involves setting seen_line_token.
As a result, parse_string has to unput a newline after consume_whitespace gets rid of it, in order for seen_line_token to get set appropriately.
See https://github.com/jyn514/rcc/blob/master/src/lex/mod.rs#L610 for more details.
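To make the description above concrete, here is a minimal sketch of a lexer with two pushback slots. It is not the actual rcc code: the struct layout, the `Overflow` error type, the method bodies, and the helper `skip_then_unput_newline` are all assumptions made for illustration. Only the names current, lookahead, peek_next, unput and consume_whitespace come from the issue text.

```rust
/// Hypothetical error: there was no free slot to push `byte` back into.
#[derive(Debug, PartialEq)]
struct Overflow(u8);

/// Simplified streaming lexer with exactly two slots of pushback
/// (an assumption modelled on the description, not the real rcc struct).
struct Lexer<'a> {
    bytes: std::slice::Iter<'a, u8>,
    current: Option<u8>,   // one byte ahead
    lookahead: Option<u8>, // two bytes ahead; only filled when `current` is
}

impl<'a> Lexer<'a> {
    fn new(src: &'a str) -> Self {
        Lexer { bytes: src.as_bytes().iter(), current: None, lookahead: None }
    }

    /// Consume one byte, draining the pushback slots before the input.
    fn next_byte(&mut self) -> Option<u8> {
        match self.current.take() {
            Some(b) => {
                self.current = self.lookahead.take();
                Some(b)
            }
            None => self.bytes.next().copied(),
        }
    }

    /// Look one byte ahead without consuming it.
    fn peek(&mut self) -> Option<u8> {
        if self.current.is_none() {
            self.current = self.bytes.next().copied();
        }
        self.current
    }

    /// Look two bytes ahead; this is the call that fills `lookahead`.
    fn peek_next(&mut self) -> Option<u8> {
        self.peek()?;
        if self.lookahead.is_none() {
            self.lookahead = self.bytes.next().copied();
        }
        self.lookahead
    }

    /// Push a byte back. Fails when both slots are already occupied,
    /// which is the "3 bytes in space for 2" situation.
    fn unput(&mut self, byte: u8) -> Result<(), Overflow> {
        if self.lookahead.is_some() {
            return Err(Overflow(byte));
        }
        self.lookahead = self.current.take();
        self.current = Some(byte);
        Ok(())
    }

    /// Stand-in for consume_whitespace: skips whitespace and reports whether
    /// a newline was skipped. On `/` it peeks two bytes ahead (to tell a
    /// comment from a plain slash), leaving both slots full.
    fn consume_whitespace(&mut self) -> bool {
        let mut saw_newline = false;
        while let Some(b) = self.peek() {
            match b {
                b'\n' => {
                    saw_newline = true;
                    self.next_byte();
                }
                b' ' | b'\t' | b'\r' => {
                    self.next_byte();
                }
                b'/' => {
                    let _ = self.peek_next();
                    break;
                }
                _ => break,
            }
        }
        saw_newline
    }
}

/// Hypothetical helper mimicking the caller: skip whitespace, then try to
/// unput the newline so that line tracking can still observe it.
fn skip_then_unput_newline(src: &str) -> Result<(), Overflow> {
    let mut lexer = Lexer::new(src);
    if lexer.consume_whitespace() {
        lexer.unput(b'\n')?;
    }
    Ok(())
}
```

In this sketch, `skip_then_unput_newline("\nb")` succeeds because only one slot is occupied when the newline is pushed back. `skip_then_unput_newline("\n/b")` returns an `Overflow` error: `/` and `b` already fill both slots, so the newline has nowhere to go. Emitting the newline as a token, as proposed in #356, would presumably avoid the unput altogether.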
Backtrace