Line directive emitted in middle of multi-line string which contains an ifdef or ifndef #225
Comments
I'm wondering why the tokenizer doesn't catch that one... Thanks for the report! |
Unfortunately changes to test code aren't properly propagated in the build system - try touching test.cfg to rerun. |
I did notice that occurring and figured out that trick. This particular test is run as default_hooks.exe. I was able to trigger it to rerun by removing the default_hooks.run and default_hooks.test files. I also ran it directly to make sure it was actually running. Seems like it passed in the PR automation as well. |
Interestingly it seems that multi-line string literals are not allowed in C - at least not like that. You can have raw string literals with a different syntax. Not that Wave produces an error or anything... but gcc does, in my experiments. What's your use case? |
Some additional insight into what is happening here: in #140 I made handling of The big question for me now is what the correct behavior should be given that multi-line string constants as you have them may be errors. Can you use raw string literals instead? Or string concatenation? Should Wave actually throw an exception? |
I'm not a language lawyer, however I'd say this is ill-formed code. https://eel.is/c++draft/lex.pptoken#2 describes what preprocessing tokens are and string literals are explicitly listed as one. OTOH, a string literal is not allowed to contain newline characters. Those can consist of "any member of the translation character set except the U+0022 quotation mark, U+005c reverse solidus, or new-line character" (see https://eel.is/c++draft/lex.string#nt:basic-s-char). |
One of my teammates recently posted an issue and fix, so I'll borrow his language for our general setup: Essentially, we have a DSL that we've created and we have several hundred repos coded against that DSL. Our clients have learned to abuse the preprocessor a bit to work around some of the shortcomings of our DSL. We do have some ability to fix our clients' code, but tracking down where and how often they use the preprocessor like this isn't exactly easy. We do a test build of a handful of our clients' repos before we roll out changes to them and we caught at least one instance of them using the preprocessor like this. Fixing this issue would certainly make my life easier 😅 But I can also understand that a 20 year old MSVC preprocessor might not have been the most spec-compliant thing 😆 |
Honestly sounds like "throw exception" might be a better action for Wave to do here - we do try to be standard compliant, and to match gcc (especially when they are standard compliant). Furthermore it would make it easier for you to track down "where and how often they use the preprocessor like this". String concatenation should give users what they need here despite the extra work. And finally, what other bugs might Wave have w.r.t. multi-line string literals? We'd be buying into a maintenance headache. |
I'd suggest converting your multi-line strings either into raw string literals or to convert the newlines into |
I think that's fair and reasonable. I would agree it's better to stay standard compliant. |
@hkaiser should we add a |
I tend towards reporting this as a problem, if possible. This would have to be reported after macro expansion, however. So we would have to go back and reparse if we want to report things. |
I feel like the problem is upstream of macro expansion, namely, that we accept multi-line strings at all. On top of that, of course, is the fact that we are currently interpreting directives inside such strings. But if the bad "string literals" themselves were errors we would never get to the expansion and generating the line directives. i.e.:
and
are equal errors and should produce the same kind of "unterminated string literal" exception on the first line. In short, I think we have a pure lexing issue. We lex the first case like this:
and the second like this:
but a proper single-line string with the same contents:
is just:
So the lexer is getting lost right away. |
@jefftrull I assumed it is possible to construct a string from macros piece by piece, i.e.
but other compilers don't accept that. So I think you're right, string constants must be valid before macro expansion. |
Closing this and creating a different issue about how we accept unterminated string literals as multiple tokens |
Hi, I'm running into an issue where a line directive is being emitted in the middle of a multi-line string if there's an ifdef or ifndef directive in the middle of the string. The line directive was not emitted in Boost 1.77.0.
I suspect this issue was introduced in this PR: #140
After reverting that change, I no longer see line directives emitted in the middle of the multi-line string.
Here's an example to illustrate the issue:
test.cpp
The following are using default_preprocessing_hooks.
Boost 1.77.0 preprocessed output:
Boost 1.86.0 preprocessed output:
Boost 1.86.0 output is identical to Boost 1.77.0 output if I revert #140
And for comparison, here's output from Visual Studio's cl.exe preprocessor: