Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add SPARQL lexer #872

Merged
merged 7 commits into from
Aug 30, 2019
Merged

Add SPARQL lexer #872

merged 7 commits into from
Aug 30, 2019

Conversation

noniq
Copy link
Contributor

@noniq noniq commented Feb 6, 2018

This implements a lexer for SPARQL.

I only use SPARQL in the context of Wikidata, so I used (valid) Wikidata queries for all examples and tests.

Lists of keywords and builtins were extracted from the spec (but I may have missed some).

I think the lexer is fairly complete – multiline strings are not implemented, though.

This is my first try on implementing a lexer for Rouge – looking forward to feedback! 😃

@noniq
Copy link
Contributor Author

noniq commented Mar 9, 2018

Ping :)

@waldyrious
Copy link

waldyrious commented Jul 10, 2018

@dblessing any chance you could take a look at this? I would also like to see support for SPARQL.

(By the way, if anyone is looking for a workaround in the meantime, Erlang and Swift seem to do a reasonable job at highlighting a couple queries I tried. You might also take a look at Ceylon, Eiffel, Elm, Haskell and HyLang.)

Copy link
Contributor

@vidarh vidarh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, thanks for your submission. I'm trying to help triage pull requests and drive down the backlog. Just one minor nit (which I guess is down to the age of this pull request)... Thanks.

lib/rouge/lexers/sparql.rb Show resolved Hide resolved
spec/lexers/sparql_spec.rb Show resolved Hide resolved
@pyrmont pyrmont added the needs-review The PR needs to be reviewed label Aug 2, 2019
@pyrmont
Copy link
Contributor

pyrmont commented Aug 21, 2019

@noniq I'm sorry it's taken so long to get to this. My apologies also for force pushing changes to your branch. I initially planned to make a handful of changes but after comparing the lexer to that in Pygments and Chroma, it looked to me like there were a few areas of the spec that weren't covered. Hopefully this provides better coverage.

Let me know what you think!

@pyrmont pyrmont added author-action The PR has been reviewed but action by the author is needed and removed needs-review The PR needs to be reviewed labels Aug 21, 2019
@pyrmont pyrmont self-assigned this Aug 26, 2019
@noniq
Copy link
Contributor Author

noniq commented Aug 28, 2019

Thanks! One thing I noticed: The new version does not highlight FILTER NOT EXISTS correctly.

Bildschirmfoto 2019-08-29 um 00 39 11

This is because the rule looking for keywords greedily matches exactly one or two uppercase words, so it finds FILTER NOT which indeed is not a valid keyword.

This was not a problem in the original version because it matched only the keywords explicitly defined in KEYWORDS (self.keywords in the new version), so it would find FILTER first, and then NOT EXISTS.

Not sure how to solve this.

@pyrmont
Copy link
Contributor

pyrmont commented Aug 29, 2019

@noniq Both the original lexer and my revised version would tokenise as an error text that didn't match the self.builtins or the self.keywords sets. Given that, and given that not and by only appear in the self.keywords set, could we simplify things by just matching those words separately? That would allow for syntactically incorrect SPARQL to be lexed without errors but parsing is a non-goal of Rouge. What do you think?

Oh, and I should have explained in the earlier commit but the reason I moved away from having a large regular expression pattern was for performance reasons. We try to minimise memory allocation and so avoid using large interpolated patterns where possible.

@pyrmont
Copy link
Contributor

pyrmont commented Aug 30, 2019

@noniq I pushed a version of the lexer doing what I described in the previous message. Just looking at the visual sample made it look to me like it worked fine but let me know what you think.

@noniq
Copy link
Contributor Author

noniq commented Aug 30, 2019

Looks great! Ready to merge? 🚀

@pyrmont pyrmont merged commit 0cc4313 into rouge-ruby:master Aug 30, 2019
@pyrmont
Copy link
Contributor

pyrmont commented Aug 30, 2019

@noniq Thanks for the submission! We're updating Rouge on a two-week cadence as we get through the backlog of outstanding PRs. The next release is scheduled for Tuesday. It will be part of that :)

@pyrmont pyrmont removed the author-action The PR has been reviewed but action by the author is needed label Aug 30, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants