This repository has been archived by the owner on Jun 3, 2021. It is now read-only.

[ICE] unput doesn't play well with consume_whitespace #361

Open
jyn514 opened this issue Mar 30, 2020 · 1 comment
Labels
fuzz: Found via fuzz testing; ICE: Internal Compiler Error (panic); lexer: Issue dealing with parsing the lexical tokens of a program; preprocessor: Issue in the preprocessor (probably cycle detection)

Comments

@jyn514
Owner

jyn514 commented Mar 30, 2020

Code

Note: the error message is from a local copy; I plan to make a PR soon. (Merged in #362.)

#i ""
/b
The application panicked (crashed).
Message:  unputting '\n' would cause the lexer to forget it saw 'b' (current is '/')
Location: src/lex/mod.rs:153

Expected behavior

The lexer should output the tokens Hash, Id("i"), Str(""), Slash, and Id("b"). Additionally, seen_line_token should be set appropriately at all times.
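The expected output can be written down as a small, self-contained sketch. This is not rcc's lexer: `toy_lex` and this `Token` enum are hypothetical, the input is fully buffered rather than streamed, and comments and escapes are ignored. It also assumes that seen_line_token means "a token has already been seen on the current line", which is a reading of the name rather than something stated in this issue.

```rust
#[derive(Debug, PartialEq)]
enum Token {
    Hash,
    Slash,
    Id(String),
    Str(String),
}

/// Toy tokenizer over a fully buffered input (the real lexer is streaming).
/// Returns each token paired with whether it was the first token on its line.
fn toy_lex(src: &str) -> Vec<(Token, bool)> {
    let chars: Vec<char> = src.chars().collect();
    let mut out = Vec::new();
    // Assumed meaning: true once any token has been produced on this line.
    let mut seen_line_token = false;
    let mut i = 0;
    while i < chars.len() {
        let c = chars[i];
        if c == '\n' {
            // A newline starts a fresh line, so the flag is reset.
            seen_line_token = false;
            i += 1;
            continue;
        }
        if c.is_whitespace() {
            i += 1;
            continue;
        }
        let tok = match c {
            '#' => { i += 1; Token::Hash }
            // Comments are not handled here: '/' is always a Slash.
            '/' => { i += 1; Token::Slash }
            '"' => {
                // No escape handling: read up to the closing quote.
                let start = i + 1;
                let mut end = start;
                while end < chars.len() && chars[end] != '"' {
                    end += 1;
                }
                i = end + 1;
                Token::Str(chars[start..end].iter().collect())
            }
            _ if c.is_alphabetic() || c == '_' => {
                let start = i;
                while i < chars.len() && (chars[i].is_alphanumeric() || chars[i] == '_') {
                    i += 1;
                }
                Token::Id(chars[start..i].iter().collect())
            }
            // Anything else is skipped in this sketch.
            _ => { i += 1; continue; }
        };
        out.push((tok, !seen_line_token));
        seen_line_token = true;
    }
    out
}
```

For the input in this report, `toy_lex("#i \"\"\n/b")` yields Hash, Id("i"), Str(""), Slash, Id("b"), with only Hash and Slash marked as the first token on their line.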

This is hard because I only want to peek ahead 2 characters, but at the same time the preprocessor needs to know where newlines occur. I think the real fix will be to implement \n as a token (#356).

Extended description (from discord):

The lexer is streaming: it looks at one byte at a time.
Sometimes it needs to look at several bytes, but it doesn't have a buffer to store them in, so instead it uses current for one byte ahead and lookahead for 2 bytes ahead.
The issue is that it's trying to remember 3 bytes when it only has space for 2.
The culprit is consume_whitespace: when it sees / it calls peek_next, which sets lookahead.
The preprocessor cares about newlines even though the lexer doesn't,
so the lexer still needs to keep track of '\n', which involves setting seen_line_token.
As a result, parse_string has to unput a newline after consume_whitespace gets rid of it in order for seen_line_token to be set appropriately.
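The "3 bytes in 2 slots" problem can be reproduced in isolation. The sketch below is a hypothetical model, not the code in src/lex/mod.rs: the `TwoSlot` struct, `next_char`, `peek`, and `demo` are invented for illustration, and `unput` returns an error where the real lexer panics. Only the names current, lookahead, peek_next, and unput, plus the wording of the message, come from this issue.

```rust
use std::str::Chars;

/// Hypothetical model of a streaming lexer with exactly two pushback slots.
struct TwoSlot<'a> {
    chars: Chars<'a>,
    current: Option<char>,   // one character ahead
    lookahead: Option<char>, // two characters ahead
}

impl<'a> TwoSlot<'a> {
    fn new(src: &'a str) -> Self {
        TwoSlot { chars: src.chars(), current: None, lookahead: None }
    }

    /// Consume and return the next character, draining the slots first.
    fn next_char(&mut self) -> Option<char> {
        let c = self.current.take().or_else(|| self.chars.next());
        self.current = self.lookahead.take();
        c
    }

    /// Look one character ahead without consuming it.
    fn peek(&mut self) -> Option<char> {
        if self.current.is_none() {
            self.current = self.chars.next();
        }
        self.current
    }

    /// Look two characters ahead; this is what fills `lookahead`.
    fn peek_next(&mut self) -> Option<char> {
        self.peek();
        if self.lookahead.is_none() {
            self.lookahead = self.chars.next();
        }
        self.lookahead
    }

    /// Push a character back. With both slots full there is nowhere to put it.
    fn unput(&mut self, c: char) -> Result<(), String> {
        if let (Some(cur), Some(la)) = (self.current, self.lookahead) {
            return Err(format!(
                "unputting {:?} would cause the lexer to forget it saw {:?} (current is {:?})",
                c, la, cur
            ));
        }
        self.lookahead = self.current.take();
        self.current = Some(c);
        Ok(())
    }
}

/// Replays the tail of the failing input: the newline and "/b" after the string.
fn demo() -> Result<(), String> {
    let mut lx = TwoSlot::new("\n/b");
    assert_eq!(lx.next_char(), Some('\n')); // whitespace is consumed
    assert_eq!(lx.peek(), Some('/'));       // '/' is now in `current`
    assert_eq!(lx.peek_next(), Some('b'));  // peek_next fills `lookahead`
    lx.unput('\n')                          // a third character has no slot
}
```

Calling `demo()` returns an `Err` in this model, because `'\n'`, `'/'`, and `'b'` would all have to be remembered at once.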

See https://github.com/jyn514/rcc/blob/master/src/lex/mod.rs#L610 for more details.

Backtrace
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ BACKTRACE ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
                          (5 post panic frames hidden)                          
 5: rcc::lex::Lexer::unput::h70bde51f6a37bfb0
    at src/lex/mod.rs:153
 6: rcc::lex::Lexer::parse_string::hce2d12999d596f68
    at src/lex/mod.rs:619
 7: <rcc::lex::Lexer as core::iter::traits::iterator::Iterator>::next::{{closure}}::hb2227ef3fa992171
    at src/lex/mod.rs:845
 8: core::option::Option<T>::and_then::hce18bbfb15e478ce
    at /rustc/b8cedc00407a4c56a3bda1ed605c6fc166655447/src/libcore/option.rs:658
 9: <rcc::lex::Lexer as core::iter::traits::iterator::Iterator>::next::hc5dcf9be976562d5
    at src/lex/mod.rs:663
10: rcc::lex::cpp::PreProcessor::next_cpp_token::heb246ff3504f916b
    at src/lex/cpp.rs:431
11: <rcc::lex::cpp::PreProcessor as core::iter::traits::iterator::Iterator>::next::h3eb29428f711ff56
    at src/lex/cpp.rs:204
12: rcc::check_semantics::h763a0909e720b40a
    at src/lib.rs:169
13: rcc::compile::h39be05ac3eaedfb7
    at /home/joshua/src/rust/rcc/rcc/src/lib.rs:217
14: rcc::aot_main::h79a0fdb40de30e55
    at src/main.rs:137
15: rcc::real_main::hb7d5482550942e15
    at src/main.rs:125
16: rcc::main::hd4a33063d9febd1f
    at src/main.rs:203
                        (12 runtime init frames hidden)
@jyn514 added the ICE, fuzz, lexer, and preprocessor labels on Mar 30, 2020
@hdamron17
Collaborator

This is not fixed by #437, which closes #356.
