-
-
Notifications
You must be signed in to change notification settings - Fork 2.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature Request: Support stack-graphs to provide code nav in the absence of LSP, building on tree-sitter #1252
Comments
I think some people want to avoid auto-generating folders. This might be better solved by #401 |
There's |
We also want to use this together with the tree-sitter integration: https://github.com/tree-sitter/tree-sitter-graph Some work needs to be done so that we also parse and construct these trees for not just the current project, but also the standard library and any imported libraries. |
This is already possible with |
Good to know! Does it do workspace symbols though? |
Currently the crate takes a set of queries specifying which nodes are symbols and an input file and spits out the position of the symbols (so document symbols in essence) and if we do this for every file in the tree we would get workspace symbols. |
I think this is a good idea, in some cases I might even prefer to turn off an LSP and just use tags-based navigation. As @sudormrfbin says, stack-graphs is not necessary, it's only used by GitHub to power precise navigation for python. Basic (less accurate) navigation can be done with a I think basic tags are not accurate enough to be used reliably for go-to-definition, so IMO it's not worth parsing dependencies or the standard lib of a language. Being an alternate source of data for (workspace) symbols seems feasible though. See also elixir-lang/tree-sitter-elixir#18 |
Maybe you've already found them, but here is a link to the draft Typescript graph definitions, in case helpful 😊 |
I took a look at stack-graphs for Helix, and I don't feel like it's a good addition for Helix because of how complex it is to define queries while not providing a reasonable amount of accuracy. I particularly dislike the fact that We could refer to GitHub's implementation of the stack-graphs algorithm, though I feel it may need to be modified a bit to better suit Helix. For one, I don't like how it utilizes insert-only arenas for the stack. This could be mitigated by creating partial graphs per file, writing them to disk, and then using memory-mapping to merge them into a single graph on the fly. That itself could be further optimized by utilizing the algorithm described in https://blog.burntsushi.net/transducers/#fsa-construction, though the improvement may be negligible because of the difference in scales. |
I think we should come up with a list of benefits from supporting stack-graphs. I feel that people should use the LSP for complex languages like TypeScript or Python.
|
I have one additional argument for offering an LSP alternative to code navigation that hasn't yet been mentioned in this issue which is codebases that are difficult for LSP to process. As my day job I'm working as a developer on an embedded C codebase that has:
I've been fiddling with Helix lately since I'm interested in switching to a terminal based editor and I very much like the selection -> action over the action -> selection that vim uses. However, I have been unable to generate a proper compile_commands.json (with tools such as bear and compiledb) for even one of our supported build targets so far. Also, I assume that even if I'd manage to generate a full compile_commands.json file for the entire build it would be lacking for my purposes since 'go to definition' for example would just go to the definition that LSP sees as the correct definition for the build target whose compile_commands.json is active at that point. Instead, I'd prefer to be able to choose myself which definition out of the different target/simulation/UT functions I would like to go to. Another (C++, not embedded this time but hardware testing related) codebase that I'm occasionally working on uses a proprietary compiler along with a proprietary build system. I don't think tools exist for generating compile_commands.json for this one. Fortunately the 'go to definition' of sublime text usually works here and when it fails there's always good old grep. As a final disclaimer I don't have experience on how well stack-graphs in particular would work for my purposes but I know that Cscope, Ctags and the like work decently. For me personally the option of being able to use tag based code navigation over LSP would be the final push to allow me to use helix professionally. |
This exact thing I have achieved on neovim. I want to select which implementation/reference I would like to go to. And I wish if this degree of integration is present for ctags in helix, really willing to try! |
this is the killer feature for embedded code development targeting different hardwares with different compile configurations. |
What are stack-graphs
Stack-graphs were developed by Github to provide code navigation features to any of the ~500 programming languages hosted in Github repositories (although they have only defined it for python currently). Examples of such features are "go to definition" and "go to references". It builds on the information obtained from tree-sitter grammars to create incrementally-updated look-up graphs that allow sub-100ms navigation of named symbols.
Why would we want them in Helix?
Helix at this time relies exclusively on tree-sitter to provide basic highlighting and other more advanced features (such as selection by scope, which is still being worked into the default keymap). The conceit is that sourcing or writing a tree-sitter grammar for a language should be the preferred way of implementing support for a language for Helix. If we are to keep with this preference, we should endeavour to reap the maximum benefits of having a tree-sitter parsed syntax tree available for editor functions. This seems to be part of what sets helix apart from other editors.
Helix supports LSP for those features, why not rely on that?
There are a number of reasons:
Resource constraints
Many LSP servers are notoriously memory and cpu intensive. Depending on the kind of development machine you are running on, that may or may not be an issue even for one running LSP server. However, in heterogenous code environments (consider a website that uses typescript on the frontend, PHP on the backend and ruby for it's build system or to provide certain endpoints), or in environments where you are working on a number of projects throughout the day, whether or not all in the same language, it can quickly become limiting to rely on LSP in every case. It will likely be a concious decision to limit to a single LSP for the most used language simply to free up resources.
Not all languages have functional LSPs
Fairly recently, there was a lot of effort to utilize Julia's LSP server in the default configuration of Helix, but it was found that not only was the server excruciatingly slow to start up, but it wasn't entirely conformant or complete. There are certainly other examples of LSPs that only provide partial support for the standard. In those cases, it would be simpler to create a ruleset for stack-graphs to at least provide code navigation, rather than write or improve an entire LSP implementation.
Not all languages should have an LSP
Especially in the web sphere, there are many small domain-specific languages that are less a complete language and more of a configuration format, but which nonetheless support definitions and references of discrete portions of the configuration, and/or drop-in files to organize configuration across a number of files. Apache, nginx and varnish configuration files in particular come to mind. These will likely never get LSP servers written, as the benefit is too small, but it is certainly possible to write a tree-sitter grammer and stack-graph ruleset to make editing those files in Helix more ergonomic.
What else could be derived?
gd
) and go to references (gr
), it seems conceivable that a complete stack-graph can be used to produce a full list of symbols in file, and even workspace symbols.gy
) and go to implementation (gi
) are notably not talked about in the stack-graphs intro talk, but seem like they could be added down the road if not already available.<space>k
) could theoretically utilize definition information and provide that as the popup information, similar to what github does with stack-graphs (try it on a python repo like flask to see what i mean).What's involved in bringing it to Helix
First of all, the code behind the github feature is already open sourced as a Rust crate! Perusing the docs.rs for it, it seems only to provide the core functionality of incrementally creating the graph and querying it. This means we will need to take care of crawling the project (github uses the git trees to discover these) for all files of that type and generating the intial partial graphs. This should be cached somewhere (perhaps in a
.helix
in the project root directory, or in something like$XDG_RUNTIME_DIR/heix/path
). As files are worked on in the editor, the partial graphs would be updated for that file in real time, then as functionality was invoked, we could query the full graph set to get the answer (this is the part that is always within 100ms in Github's testing).Once functionality is present, we would need only find or implement support for each language with a ruleset. I'm not sure where Github stores the ruleset for python or if they intend to share what they develop for each language. I would assume they will, as they specifically called out taking advantage of the tree-sitter parses ecosystem. This would live in the runtime directory alongside the tree-sitter queries.
The text was updated successfully, but these errors were encountered: