use tree-sitter #174
Comments
@FelipeLema: since you wrote an indentation solution using tree-sitter, I would appreciate your comments. I recall you mentioning it a while ago. |
I agree that maintaining the regex parser is currently taking a toll on code readability and maintainability, even for people who are well versed in (Emacs) Lisp. I agree that tree-sitter is a tool that has many eyes on it ("it is production quality") and that end users have only good things to say about it. However, I'd say it gets complicated for Julia. There are two big aspects to consider: how well the parser is maintained, and how well it can be maintained (by the Julia community).

First, not only is the Julia tree-sitter parser incomplete, but its pace may not fit the requirements of the Julia community (incl. tooling people & Julia devs). It didn't fit me in particular; I ran into this problem rather quickly. When I started using the tree-sitter indent tool I found a bug. I reported it, debugged it, proposed a fix, and I'm still waiting for the problem to be addressed. Can the tree-sitter parser be hijacked by the Julia community so we can improve it at a quicker pace? I honestly don't know. After that, I ended up writing a Julia code formatter because I thought it would be easier than pushing for changes upstream in the Julia tree-sitter parser. The name of this tool is actually a misnomer, because it parses the code in a Julia buffer and operates on its AST just as tree-sitter does.

This brings up the second concern about tree-sitter: it is written in another language (and right now we need to write fixes in JS for the Julia parser). Correct me if I'm wrong, but I believe most of the Julia community understands the long-term problems of maintaining several languages for their workflows. I would personally rather avoid writing JS if possible (which is what I ended up doing in the paragraph above). My recommendation is to use CSTParser (or even the parser that comes with the Julia binaries) to parse the AST and to copy code from tree-sitter.el to handle the items mentioned in the top entry of this discussion.
From my experience with the Julia code formatter, I'd recommend using DaemonMode for Emacs-Julia communication (to have little-to-no response delay), since using JSON-RPC, as LSP does, may cause problems on Windows. Using CSTParser may have a positive effect, as it lowers the barrier for Julia end users to participate in maintaining this package (kinda what the Racket community was betting on when they switched to ChezScheme). All this being said, I want to note that I use tree-sitter in neovim on an everyday basis and that everything (except for the Julia tree-sitter parser) works A-OK. |
@FelipeLema: thanks for the detailed explanation (incidentally, would you consider giving the PR you mentioned a friendly ping? Perhaps it was just overlooked --- that happens). Conceptually, I can think of the following components for the features we need:
Currently in julia-mode we pretty much do everything above with hacks using regexps. CSTParser.jl does 1+2. We would have to maintain part 3 ourselves, most likely using the daemon-based approach you outline. The advantage is indeed doing a lot in Julia. The disadvantage is all the framework associated with maintaining a running instance of Julia --- doable but somewhat heavyweight. Tree-sitter would allow us to combine effort for 1, 2, and 3 with other projects (editors other than Emacs, languages other than Julia). That said, I realize that a lot of layers can introduce problems, too. Interestingly, the LSP spec has included semantic tokens since 3.16. I wonder if that's supported in practice for Julia with Emacs. @gdkrmr and @non-Jedi, it would be great if you could share your thoughts about this. If we could make that work, it would cover pretty much everything for us. |
Note that julia-snail already includes an interface to CSTParser.jl. It still relies on julia-mode for syntax highlighting and formatting, though. |
LanguageServer.jl doesn't support the semantictokens set of capabilities at this time unfortunately. I'm also really not sure if the architecture of the LSP would give you sufficient responsiveness for indentation and syntax highlighting. Any time you press enter, emacs would have to make a round-trip with the language server (plaintext over pipes) before deciding on indentation level. |
worth noting: there's an active (very active? somewhat active?) support for parsing Julia in Scintilla
using Scintilla would have the benefit of active support, but we would have to integrate it into Emacs ourselves |
I did not dig into the details, but I am still under the impression that tree-sitter would be the path of least resistance, because of:
|
I'm generally on board with integrating with tree-sitter. Even if support for julia syntax isn't perfect, it's probably better than what we have now especially wrt indentation. We would need to update our lowest supported emacs version to 25 for dynamic module support. |
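For anyone unsure whether their Emacs can load dynamic modules at all, here is a minimal sketch of a check. It only uses `module-load` and `module-file-suffix`, which exist in Emacs 25 and later; whether a given build was compiled with module support is what the check reports, and the messages are just illustrative.

```elisp
;; Sketch: report whether this Emacs can load dynamic modules.
;; `module-load' is only defined when Emacs was built with module support,
;; and `module-file-suffix' is nil (or unbound before Emacs 25) otherwise.
(if (and (fboundp 'module-load)
         (bound-and-true-p module-file-suffix))
    (message "Dynamic modules are available (suffix %s)" module-file-suffix)
  (message "This Emacs has no dynamic module support"))
```

Evaluating this with `M-:` or in `*scratch*` should be enough to tell whether the elisp-tree-sitter bindings could be loaded on a given installation.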
That's fine with me, Emacs 25 was released almost 6 years ago. |
I don't really have any experience with syntax highlighting in Emacs, please correct me if I am wrong
Questions:
|
Is there any way to enable tree-sitter in Julia mode? I installed and enabled it, but it does not seem to have any effect. |
can you paste or point to the code you're dealing with? |
Nothing in particular. I am just trying to see if we can get better performance. For example, scrolling this file is somewhat slow for me: https://github.com/ronisbr/PrettyTables.jl/blob/master/src/backends/text/print.jl |
emacs 29 will add native tree sitter support! Does anyone know how it works? Will there be an extra process or is it going to be a dynamic module? |
To use tree-sitter, you need to rewrite the major mode. I am doing some experiment in a |
Here is the major mode: https://github.com/ronisbr/julia-ts-mode You need to add the file I have to say that I am really amazed how easy it was to setup everything and the speed is definitely much faster than the current mode. Now, I need to work on navigation and imenu support. |
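For readers who want to try this, the following is a rough sketch of an init-file fragment, not the setup the author describes. It assumes Emacs 29 built with tree-sitter, a C compiler on the PATH, and that `julia-ts-mode.el` is already on `load-path` and provides the feature `julia-ts-mode` (the feature name is an assumption; the grammar repository is the one whose issues are linked in this thread).

```elisp
;; Sketch only: guarded so that it does nothing on Emacs builds
;; without native tree-sitter support.
(when (and (fboundp 'treesit-available-p) (treesit-available-p))
  (require 'treesit)

  ;; Tell Emacs where to fetch the Julia grammar from.
  (add-to-list 'treesit-language-source-alist
               '(julia "https://github.com/tree-sitter/tree-sitter-julia"))

  ;; Compile and install the grammar once, if it cannot be loaded yet.
  (unless (treesit-language-available-p 'julia)
    (treesit-install-language-grammar 'julia))

  ;; Assumption: the package file is on `load-path' under this feature name.
  (when (require 'julia-ts-mode nil t)
    (add-to-list 'auto-mode-alist '("\\.jl\\'" . julia-ts-mode))))
```

The guard on `treesit-available-p` keeps the same init file usable on older Emacs versions, where the regexp-based julia-mode would simply stay in effect.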
This is awesome. Thanks @ronisbr. I'll need to compile emacs 29 for myself to try this. Maybe you could instead define @tpapp would we be willing to make future releases of |
Hi @non-Jedi !
The idea is to test the tree-sitter integration and then commit to this repository. I will not register
Yes, we will probably need to require Emacs 29 to make this integration work. |
Perfect! I will ping this thread when I finish the initial version so that you can help me to integrate everything :) |
Just an update: I have been using Julia tree-sitter grammar for almost 3 weeks now. My doom configuration with this mode is here: https://github.com/ronisbr/doom.d/tree/emacs-29 Everything is working wonderfully! I found just two minor issues reported here: tree-sitter/tree-sitter-julia#88 tree-sitter/tree-sitter-julia#73 The experience so far has been amazing. |
eglot-jl currently explicitly lists julia-mode as a dependency, but it seems like it'd be straightforward to give julia-ts-mode a try. I'm not sure what I'd be looking for -- just see if I don't encounter problems, and maybe if it feels snappier? function foo()
#mark begins
end
function bar()
# mark ends
end Or maybe I need to look into documentation, and I'm supposed to replace functions like |
Yes!
There is some support for navigation, but I did not change anything related to |
@chriselrod Looking at @ronisbr I haven't had a chance to build emacs 29 and test this, but would you mind skimming through the open issues when you get a chance and seeing which ones would be solved by your |
With this, eglot-jl should be compatible with both julia-mode and julia-ts-mode: non-Jedi/eglot-jl#36 Building emacs is fairly straightforward.
sudo dnf install -y dnf-utils libgccjit-devel libtree-sitter-devel stow
sudo yum-builddeps emacs # install a million deps
Then to build
# cd somewhere/so/you/do/not/clutter
git clone git://git.savannah.gnu.org/emacs.git
cd emacs
./autogen.sh
mkdir build
cd build
CFLAGS="-O3 -march=native -fno-semantic-interposition" CXXFLAGS="-O3 -march=native -fno-semantic-interposition" ../configure --with-native-compilation --with-wide-int --with-json --with-tree-sitter
time make NATIVE_FULL_AOT=1 -j(nproc) # if using fish
# time make NATIVE_FULL_AOT=1 -j$(nproc) # if not using fish
sudo make install prefix=/usr/local/stow/emacs
cd /usr/local/stow
sudo stow emacs
You may want to change flags/configuration options/etc. |
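After a build like the one above, a quick sanity check inside the new Emacs can confirm that the configure flags took effect. This is only a sketch: it relies on the predicates `treesit-available-p` (Emacs 29), `native-comp-available-p` and `json-available-p` (both Emacs 28 or later), each guarded with `fboundp` so that it also evaluates on older versions.

```elisp
;; Sketch: evaluate in the freshly built Emacs (M-: or *scratch*).
;; Each predicate is guarded so the form is safe on older Emacs versions,
;; where a missing predicate is simply reported as nil.
(message "tree-sitter: %s, native-comp: %s, json: %s"
         (and (fboundp 'treesit-available-p) (treesit-available-p))
         (and (fboundp 'native-comp-available-p) (native-comp-available-p))
         (and (fboundp 'json-available-p) (json-available-p)))
```

If the first value is nil, the `--with-tree-sitter` flag most likely did not find the tree-sitter development headers at configure time.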
#118 #111 map(1:3) do x
x
end
f(map(1:3) do x
x
end) mark and tab: map(1:3) do x
x
end
f(map(1:3) do x
x
end) x .|>
f It initially didn't have the indent, but as soon as I made another line, it auto-indented. module A
import Base: *
a = 1
b = *
c = 2
end is what I get typing it out (no spurious indent). module A
import Base: *
a = 1
b = *
c = 2
end #11 Typing it out: function1(a, b, c
d, e, f)
function2(
a, b, c
d, e, f)
for i in Float64[1, 2, 3, 4
5, 6, 7, 8]
end
for i in Float64[
1, 2, 3, 4
5, 6, 7, 8]
end
a = function3(function()
return 1
end)
a = function4(
function ()
return 1
end) Neither leading function1(a, b, c
d, e, f)
function2(
a, b, c
d, e, f)
for i in Float64[1, 2, 3, 4
5, 6, 7, 8]
end
for i in Float64[
1, 2, 3, 4
5, 6, 7, 8]
end
a = function3(function()
return 1
end)
a = function4(
function ()
return 1
end) So, still some problems. I'd like As for speed -- I'm not sure. |
Just to confirm, with this file:
Thankfully, we have 5k and 6k long otherwise more typical |
Thanks for the amazing investigation @chriselrod !
Ooops :D I forgot to add anything related to interpolation expressions. Now we have:
I have also seen problems like this in other languages. It seems to be a limitation of either the Emacs implementation or tree-sitter itself.
And press enter after
This happens because it is a syntax error. We can highlight errors. However, the grammar is not 100% and some errors are false positives.
Done! I added the support for string interpolations. The font face is the same as in the constant, but bold. (The underlines are LSP errors)
I think I did not fully understand what the desired behavior is. Can you please explain it to me?
Yes! If you set |
The By the way, I will add the error in the last font lock level together with the operators. Thus, the user can decide. Now, if you set |
Any time a variable is assigned to, the variable name should be highlighted with
But there are similar forms which should not be highlighted which makes this difficult to do without the full parser we get with tree-sitter, for example, calling a function with a keyword argument, named tuples, and |
Hi @non-Jedi, thanks! Everything that is not arguable was implemented. However, we might have corner cases. I added the variable highlighting to level 3 (the default). This is the specification for each level:
Now we have: |
Thanks for all the great work!
For now I went in the other direction, and am trying (defun treesit-font-lock-level-4 ()
(setq-local treesit-font-lock-level 4)
(treesit-font-lock-recompute-features))
(add-hook 'julia-ts-mode-hook #'treesit-font-lock-level-4) Something else I noticed: macros aren't highlighted. |
You're welcome! I also want to point out the AMAZING work of Julia tree-sitter grammar developers (@maxbrunsfeld, @savq, and others). In all this time, I only found very minor issues! It is amazing!
Me too! However, it can slow down. I noticed that the problem occurs when there is a lot of highlighting on the screen. It seems that it can handle big files pretty well (I did not test this deeply).
I did not understand, it seems to be working here: By the way, until |
Hmm -- I'm on a different computer with the same config as before, and I now see the same thing you showed. |
I undid this given the amazing advice of @non-Jedi to make |
Update! After a lot of problems, I managed to make an option to select which kind of indentation after assignment the user wants. Hence, we can now select: var = a + b + c +
d + e +
f or var = a + b + c +
d + e +
f |
tree-sitter is a framework for incremental parsing of source code. The Julia implementation is now, in my opinion, fairly mature. I am asking for comments about replacing our ad-hoc regexp-based parsing mechanisms with it. Specifically, it would help with
resolving a host of issues.
I am aware that it isn't perfect, as nothing is, but at least improvements would go to a repository that helps all Julia users, not just those who use Emacs.
EDIT Some links: