build error: character too large for enclosing character literal type #1

Closed
patrickt opened this issue Jan 7, 2020 · 4 comments

Comments

@patrickt

patrickt commented Jan 7, 2020

Running tree-sitter test produces the following error when compiling scanner.cc:

emcc command failed - src/scanner.cc:88:32: error: character too large for enclosing character literal type
           lexer->lookahead == '\uFEFF' ||
                               ^
src/scanner.cc:89:32: error: character too large for enclosing character literal type
           lexer->lookahead == '\u2060' ||
                               ^
src/scanner.cc:90:32: error: character too large for enclosing character literal type
           lexer->lookahead == '\u200B';

@simonrepp
Member

I've encountered this myself - if I remember correctly:

  • You can comment out the affected lines for the time being if you just want to work on stuff (that's what I've done so far). These are more exotic whitespace cases, which of course matter but are irrelevant for the overall implementation and general testing.
  • It only occurs in certain compilation scenarios (when building for the wasm target?)
  • I don't have a solution yet!
  • If you find one I'd be much obliged :)

In any case, thanks for the report! Glad about your interest in this, let me know if you have more questions and/or issues!

@patrickt
Author

patrickt commented Jan 7, 2020

I fixed it by removing the character syntax:

  inline bool is_horizontal_whitespace(TSLexer *lexer) {
    return lexer->lookahead == ' ' ||
           lexer->lookahead == '\t' ||
           lexer->lookahead == 0xFEFF ||
           lexer->lookahead == 0x2060 ||
           lexer->lookahead == 0x200B;
  }

@maxbrunsfeld

maxbrunsfeld commented Jan 7, 2020

Yeah, I don't think '\uFEFF' is a valid C/C++ expression. Single quoted character literals are of type char, which is almost always an 8-bit value. The largest numerical char value is 0xFF.

In more recent versions of C++, you can use a U prefix (e.g. U'\uFEFF'). I think the normal way to write numbers like this is to just use integer syntax, like @patrickt said ☝️ .
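
For anyone who prefers to keep the character syntax, here is a minimal sketch of the U-prefix variant mentioned above. It assumes a C++11 (or later) compiler and that the lookahead value is a 32-bit integer. The function name is hypothetical, and it takes the code point directly rather than a TSLexer * so that it compiles on its own:

```cpp
#include <cstdint>

// Sketch only: `is_horizontal_whitespace_sketch` is a hypothetical name, and it
// takes the code point directly (assumed 32-bit) instead of a TSLexer*.
// char32_t literals (U'...') require C++11 or later.
inline bool is_horizontal_whitespace_sketch(std::int32_t lookahead) {
  // Cast once so every comparison is char32_t against char32_t.
  const char32_t c = static_cast<char32_t>(lookahead);
  return c == U' ' ||
         c == U'\t' ||
         c == U'\uFEFF' ||  // zero width no-break space (BOM)
         c == U'\u2060' ||  // word joiner
         c == U'\u200B';    // zero width space
}

// The prefixed literals carry the same values as the plain integer form.
static_assert(U'\uFEFF' == 0xFEFF, "U+FEFF");
static_assert(U'\u2060' == 0x2060, "U+2060");
static_assert(U'\u200B' == 0x200B, "U+200B");
```

Both spellings compare against the same numbers, so the integer form from the fix above remains the simpler choice when the build has to work across compilers and language standards.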

@simonrepp
Member

Very cool, thanks patrick for the fix and max for the additional insight, appreciate it!

Committed in c059967
