Supporting Tree-sitter grammars #4342

Closed
shaunlebron opened this issue Dec 5, 2018 · 11 comments

Comments

@shaunlebron
Contributor

I'm not sure if GitHub is using tree-sitter for syntax-highlighting, but I saw in #4013 that the grammars are not supported in some way.

I created a syntax-highlighter using tree-sitter for my own purposes, and thought it might be helpful to share here:
https://github.com/shaunlebron/highlight-tree-sitter

@pchaigno
Contributor

pchaigno commented Dec 5, 2018

As far as I know, GitHub doesn't support tree-sitter grammars. This is not something that depends on Linguist anyway, so you should probably mention it to GitHub support if you'd like them to support tree-sitter grammars.

@Alhadis
Collaborator

Alhadis commented Dec 5, 2018

Background: Atom uses tree-sitter since it is a fast way to use proper grammars in an editor, removing the need for hacky regexes.

Just an FYI: those "hacky regexes" are precisely the reason for the flexibility and power of TextMate-based grammars. 😉 One can use them to write structured grammars a la tree-sitter, or to highlight some ad-hoc format which lacks conventional or defined structure.

Having said that, supporting tree-sitter grammars won't be as simple as flicking on a light switch, so to speak.

@pchaigno
Contributor

pchaigno commented Dec 5, 2018

@Alhadis Oh, Atom supports tree-sitter? If it does, it might be in GitHub's plans to support it as well...

@Alhadis
Collaborator

Alhadis commented Dec 5, 2018

The Atom developers started the tree-sitter project, so yes, it's only natural that Atom supports it. 😉

@pchaigno
Contributor

pchaigno commented Dec 5, 2018

Ahah! @vmg might know if there's planned support for tree-sitter in GitHub's syntax highlighter then.

@shaunlebron
Contributor Author

shaunlebron commented Dec 6, 2018

@Alhadis thanks for the note on "hacky regexes", I reworded it to remove the snarkiness since regexes have their place 👍

I also realized that whatever GitHub uses to do its syntax-highlighting is probably private? Linguist only identifies which external grammars to use, and the grammar repos have nothing to perform the actual highlighting as far as I know:

Linguist detects the language of a file but the actual syntax-highlighting is powered by a set of language grammars which are included in this project as a set of submodules as listed here.

GitHub already diffs syntax trees created by tree-sitter to display the table of contents on pull requests, but doesn't seem to be using them for syntax-highlighting.

@Alhadis
Collaborator

Alhadis commented Dec 6, 2018

since regexes have their place

It's actually more than just regular expressions. 😉 TextMate's strongest feature is its unassuming simplicity, and the ease with which structured grammars can be built from composing groups of smaller expressions.

It's also cheap and fast to syntax highlight a flat file in a top-down pass, whereas Tree Sitter obviously has to parse and pull an entire AST into memory before it can highlight regions of source code. For an interactive text-editor, it makes sense… but for the millions of static files being viewed across GitHub, the added overhead is wasted.
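
To make the "composing groups of smaller expressions" and "top-down pass" points concrete, here is a minimal sketch in Python. It is not how TextMate or GitHub actually highlight code: the rule names, the tiny made-up language, and the helper functions `highlight_line` and `highlight` are all assumptions for illustration, and real TextMate grammars also support multi-line begin/end rules that this sketch leaves out.

```python
import re

# Hypothetical token rules for a tiny made-up language. Each entry is a
# small expression that gets composed into one larger pattern below.
RULES = [
    ("comment", r"#.*"),
    ("string", r'"(?:\\.|[^"\\])*"'),
    ("number", r"\b\d+(?:\.\d+)?\b"),
    ("keyword", r"\b(?:def|return|if|else)\b"),
]

# Compose the small expressions into a single alternation of named groups.
GRAMMAR = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in RULES))


def highlight_line(line):
    """Return (scope, text) pairs for one line; unmatched text is skipped."""
    return [(m.lastgroup, m.group()) for m in GRAMMAR.finditer(line)]


def highlight(source):
    """Yield (line number, scope, text) in one top-down pass; no tree is built."""
    for lineno, line in enumerate(source.splitlines(), start=1):
        for scope, text in highlight_line(line):
            yield lineno, scope, text
```

Because `highlight` is a generator that looks at one line at a time, it can start emitting scopes before it has read the whole file, which is the cheapness being described. A tree-based highlighter would instead parse the full source first and then walk the resulting tree.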

@shaunlebron
Contributor Author

@Alhadis thanks for the extra context. I suppose server-side rendered files would make it a better fit.

@vmg
Contributor

vmg commented Dec 10, 2018

Thanks for the contribution @shaunlebron! We've been exploring using Tree Sitter for syntax highlighting on the website, but there are many technical challenges to overcome. We'll keep y'all posted.

@pchaigno
Contributor

Thanks @vmg for the info!

I think we should close this in the meantime. As long as the backend doesn't support Tree Sitter, there is nothing we can do on the Linguist side.

@github-linguist github-linguist locked as resolved and limited conversation to collaborators Jun 17, 2024