Add SPARQL lexer #872
Ping :)
@dblessing any chance you could take a look at this? I would also like to see support for SPARQL. (By the way, if anyone is looking for a workaround in the meantime, Erlang and Swift seem to do a reasonable job of highlighting a couple of queries I tried. You might also take a look at Ceylon, Eiffel, Elm, Haskell and HyLang.)
Hi, thanks for your submission. I'm trying to help triage pull requests and drive down the backlog. Just one minor nit (which I guess is down to the age of this pull request)... Thanks.
17d2b6a to 08e9562
@noniq I'm sorry it's taken so long to get to this. My apologies also for force pushing changes to your branch. I initially planned to make a handful of changes, but after comparing the lexer to those in Pygments and Chroma, it looked to me like there were a few areas of the spec that weren't covered. Hopefully this provides better coverage. Let me know what you think!
Thanks! One thing I noticed: The new version does not highlight This is because the rule looking for keywords greedily matches exactly one or two uppercase words, so it finds This was not a problem in the original version because it matched only the keywords explicitly defined in Not sure how to solve this.
@noniq Both the original lexer and my revised version would tokenise as an error text that didn't match the Oh, and I should have explained this in the earlier commit, but I moved away from having a large regular expression pattern for performance reasons. We try to minimise memory allocation and so avoid using large interpolated patterns where possible.
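To illustrate the trade-off described above, here is a minimal sketch in plain Ruby, not the lexer's actual code. It assumes a short, hypothetical keyword list and a hypothetical `tokenize` helper built on `StringScanner` rather than Rouge's rule DSL. Instead of interpolating every keyword into one large pattern, it matches a small generic word pattern and then looks the word up in a `Set`; a word that is not in the set is tokenised as an error.

```ruby
require 'set'
require 'strscan'

# Hypothetical, abbreviated keyword list for illustration only.
# A real lexer would take its list from the SPARQL specification.
KEYWORDS = Set.new(%w[SELECT WHERE FILTER OPTIONAL LIMIT ORDER BY]).freeze

# Hypothetical helper: returns an array of [type, text] pairs.
def tokenize(source)
  scanner = StringScanner.new(source)
  tokens = []

  until scanner.eos?
    if scanner.skip(/\s+/)
      next # whitespace is dropped in this sketch
    elsif (word = scanner.scan(/[A-Za-z]+/))
      # One small pattern plus a set lookup, instead of one big
      # interpolated alternation of all keywords.
      # SPARQL keywords are case-insensitive, hence the upcase.
      type = KEYWORDS.include?(word.upcase) ? :keyword : :error
      tokens << [type, word]
    elsif (var = scanner.scan(/\?\w+/))
      tokens << [:variable, var]
    else
      tokens << [:punctuation, scanner.getch]
    end
  end

  tokens
end

tokenize('SELECT ?item WHERE { ?item FOO }').each do |type, text|
  puts "#{type}: #{text}"
end
```

Because each word is matched on its own, a two-word keyword such as `ORDER BY` comes out as two keyword tokens in this sketch, which avoids a single rule that has to decide greedily between one and two words.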
@noniq I pushed a version of the lexer doing what I described in the previous message. Judging by the visual sample, it looks to me like it works fine, but let me know what you think.
Looks great! Ready to merge? 🚀
@noniq Thanks for the submission! We're updating Rouge on a two-week cadence as we get through the backlog of outstanding PRs. The next release is scheduled for Tuesday. It will be part of that :)
This implements a lexer for SPARQL.
I only use SPARQL in the context of Wikidata, so I used (valid) Wikidata queries for all examples and tests.
Lists of keywords and builtins were extracted from the spec (but I may have missed some).
I think the lexer is fairly complete – multiline strings are not implemented, though.
This is my first attempt at implementing a lexer for Rouge – looking forward to feedback! 😃