significantly improve treesitter performance while editing large files #4716
This is cool, nice work diving into this!
I'm pretty well out of my depth on the content of the changes; I left some small readability and documentation nits
Co-authored-by: Michael Davis <[email protected]>
The test failure on Windows looks spurious to me. It's a problem with saving files, which this PR does not touch at all.
Ah yep, looks like a flake: https://github.com/helix-editor/helix/actions/runs/3451158551/jobs/5760200072 I reran it and now it's green.
I wish we could reduce the number of empty injections.
I think with this PR the overhead of these empty injections is pretty small.
let mut layers_table = RawTable::with_capacity(self.layers.len());
let layers_hasher = RandomState::new();
Why do we need to pull in `RawTable`? Why not just `HashMap::with_capacity()`?
The point of the `RawTable` is not the capacity/`insert_no_grow`. That's just a nice bonus.
A `HashMap<K, V>` is essentially a `RawTable<(K, V)>` internally. However, we want a `LanguageLayer -> LayerId` map here, so using a normal `HashMap<LanguageLayer, LayerId>` would require that we clone all `LanguageLayer`s (we cannot store references into `Syntax::layers` because it is mutated later).
With a `RawTable` we can just store the `LayerId` as `RawTable<LayerId>`.
When we need access to the original keys to resolve hash collisions during lookup later, we can just use the `LayerId` to index back into `self.layers`.
`IndexMap` does essentially the same thing internally (it just doesn't fit well here because we want to keep using a `HopSlotMap`, whereas `indexmap` forces us to use essentially a `Vec`).
I also use the exact same strategy with `RawTable` for the interning in `imara-diff` (to avoid pulling in `indexmap`, which offers a lot more functionality). So we will depend on `hashbrown` anyway (in fact we already do, as a dependency of `unicode-linebreak`).
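The idea of storing only ids and indexing back into the key collection can be sketched with the standard library alone. This is not the PR's code: `IndexTable`, `hash_of`, and the `HashMap<u64, Vec<usize>>` bucket layout are hypothetical stand-ins for `RawTable<LayerId>`, and a plain slice stands in for `self.layers`.

```rust
use std::collections::hash_map::RandomState;
use std::collections::HashMap;
use std::hash::{BuildHasher, Hash, Hasher};

/// Hash table that stores only indices into an external slice of keys.
/// (Hypothetical stand-in for `RawTable<LayerId>`; not the PR's type.)
struct IndexTable {
    hasher: RandomState,
    /// hash -> indices of every key that produced this hash
    buckets: HashMap<u64, Vec<usize>>,
}

impl IndexTable {
    /// Hashes every key once and records its index; the keys are not cloned.
    fn build<K: Hash>(keys: &[K]) -> Self {
        let hasher = RandomState::new();
        let mut buckets: HashMap<u64, Vec<usize>> = HashMap::with_capacity(keys.len());
        for (i, key) in keys.iter().enumerate() {
            buckets.entry(hash_of(&hasher, key)).or_default().push(i);
        }
        IndexTable { hasher, buckets }
    }

    /// Looks up `needle`; hash collisions are resolved by indexing back
    /// into `keys` and comparing the real values.
    fn find<K: Hash + Eq>(&self, keys: &[K], needle: &K) -> Option<usize> {
        let candidates = self.buckets.get(&hash_of(&self.hasher, needle))?;
        candidates.iter().copied().find(|&i| keys[i] == *needle)
    }
}

/// Computes the hash of `key` with the table's hasher state.
fn hash_of<K: Hash>(state: &RandomState, key: &K) -> u64 {
    let mut h = state.build_hasher();
    key.hash(&mut h);
    h.finish()
}
```

Because the table holds no borrow of the keys, the key collection stays free to be mutated between lookups, as long as the stored indices remain valid for the entries they refer to.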
// Safety: insert_no_grow is unsafe because it assumes that the table
// has enough capacity to hold additional elements.
// This is always the case as we reserved enough capacity above.
unsafe { layers_table.insert_no_grow(hash, layer_id) };
Why not just `insert()`? It's guaranteed to avoid allocation if there's enough capacity, but it won't panic if that's not the case.
I usually use `insert_no_grow` with `RawTable` if the capacity is trivially large enough, for two reasons:
- It's faster: Swiss tables need to check an additional byte to determine whether a bucket is empty, so the overhead is larger than for a `Vec` because you need to read from a random location in the table. The complexity of this bounds check also means it's basically never optimized out.
- `insert` requires a closure for rehashing existing entries in case the table needs to be resized (which is impossible here because we have enough capacity). That adds either an unnecessary closure that calculates a hash or a weird-looking `unreachable!`, as shown below, so I usually prefer the `insert_no_grow` variant over the other options:
layers_table.insert(hash, layer_id, |_| unreachable!("table should be large enough"));
layers_table.insert(hash, layer_id, |rehashed_layer| hasher.hash_one(&self.layers[*rehashed_layer]));
// as `Language` (which `Grammar` is an alias for) is just a newtype wrapper around a (thin) pointer.
// This is also compatible with the PartialEq implementation of language
// as that is just a pointer comparison.
let language: *const () = unsafe { transmute(self.config.language) };
`&self.config.language as *const _ as *const ()` avoids the transmute:
fn main() {
    let a = vec![1];
    let ptr = &a as *const _ as *const ();
    println!("{:?}", ptr);
}
Though hmm, since it's a wrapper type it might not be the same thing
Yeah, it's not the same, because `Grammar` is a newtype wrapper around a pointer:
struct Grammar(*const ());
You can definitely implement this with a pointer cast, but at that point you are re-implementing `transmute_copy`:
let language_ptr: *const Grammar = &self.config.language as *const Grammar;
let raw_language_ptr: *const *const () = language_ptr as *const *const ();
let language = unsafe { *raw_language_ptr };
I think this is one of those rare cases where transmute is the cleanest option. I use it only to get to the inner value of the newtype wrapper here (which is `repr(transparent)`).
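To make the difference between the two approaches concrete, here is a small standalone sketch. The `Grammar` type below is a hypothetical stand-in declared locally (it is not the tree-sitter type), and `inner_ptr` and `wrapper_addr` are illustrative names. It assumes the wrapper is `repr(transparent)`, as stated above.

```rust
use std::mem::transmute;

/// Hypothetical stand-in for a newtype wrapper around a thin pointer.
#[repr(transparent)]
#[derive(Clone, Copy)]
struct Grammar(*const ());

/// Extracts the wrapped pointer. This is sound only because `Grammar`
/// is `repr(transparent)` over `*const ()`.
fn inner_ptr(grammar: Grammar) -> *const () {
    unsafe { transmute::<Grammar, *const ()>(grammar) }
}

/// Casting a reference yields the address of the wrapper itself,
/// not the pointer stored inside it.
fn wrapper_addr(grammar: &Grammar) -> *const () {
    grammar as *const Grammar as *const ()
}
```

With `let g = Grammar(p)`, `inner_ptr(g)` gives back `p`, while `wrapper_addr(&g)` gives the location where `g` itself is stored, so the two would hash and compare differently.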
[[package]]
name = "hashbrown"
version = "0.13.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "33ff8ae62cd3a9102e5637afc8452c55acf3844001bd5374e0b0bd7b6616c038"
dependencies = [
 "ahash 0.8.2",
]
Looks like this is pulling in an older version of hashbrown
This PR actually pulls in the newest version of `hashbrown`.
Both `textwrap` and `unicode-linebreak` still depend on an old version.
`imara-diff` uses the newest version as well, so once the new ropey version lands and my other two PRs are merged we depend on that version anyway.
I do not want to downgrade `hashbrown` in `imara-diff`, because other crates that depend on it (`gitoxide`, for example) already use the newest version (and in the future crates will upgrade rather than downgrade).
It's once again one of those unfortunate ecosystem splits where `unicode-linebreak` and `textwrap` want to maintain an MSRV of 1.56 but `hashbrown` has raised its MSRV to 1.61.
It's a bit unfortunate that these ecosystem splits happen, but I think depending on the newest version and waiting for other dependencies to bump is the correct way to go.
Rework the pkgsrc infrastructure to simplify tree-sitter-depends.mk, rewrite the awk script to simplify things and support regular awk, and put it in the usual place. Also add support for Darwin (where this was tested). # 22.12 (2022-12-06) This is a great big release filled with changes from a 99 contributors. A big _thank you_ to you all! As usual, the following is a summary of each of the changes since the last release. For the full log, check out the [git log](https://github.com/helix-editor/helix/compare/22.08.1..22.12). Breaking changes: - Remove readline-like navigation bindings from the default insert mode keymap ([e12690e](helix-editor/helix@e12690e), [#3811](helix-editor/helix#3811), [#3827](helix-editor/helix#3827), [#3915](helix-editor/helix#3915), [#4088](helix-editor/helix#4088)) - Rename `append_to_line` as `insert_at_line_end` and `prepend_to_line` as `insert_at_line_start` ([#3753](helix-editor/helix#3753)) - Swap diagnostic picker and debug mode bindings in the space keymap ([#4229](helix-editor/helix#4229)) - Select newly inserted text on paste or from shell commands ([#4458](helix-editor/helix#4458), [#4608](helix-editor/helix#4608), [#4619](helix-editor/helix#4619), [#4824](helix-editor/helix#4824)) - Select newly inserted surrounding characters on `ms<char>` ([#4752](helix-editor/helix#4752)) - Exit select-mode after executing `replace_*` commands ([#4554](helix-editor/helix#4554)) - Exit select-mode after executing surround commands ([#4858](helix-editor/helix#4858)) - Change tree-sitter text-object keys ([#3782](helix-editor/helix#3782)) - Rename `fleetish` theme to `fleet_dark` ([#4997](helix-editor/helix#4997)) Features: - Bufferline ([#2759](helix-editor/helix#2759)) - Support underline styles and colors ([#4061](helix-editor/helix#4061), [98c121c](helix-editor/helix@98c121c)) - Inheritance for themes ([#3067](helix-editor/helix#3067), [#4096](helix-editor/helix#4096)) - Cursorcolumn ([#4084](helix-editor/helix#4084)) - Overhauled system 
for writing files and quiting ([#2267](helix-editor/helix#2267), [#4397](helix-editor/helix#4397)) - Autosave when terminal loses focus ([#3178](helix-editor/helix#3178)) - Use OSC52 as a fallback for the system clipboard ([#3220](helix-editor/helix#3220)) - Show git diffs in the gutter ([#3890](helix-editor/helix#3890), [#5012](helix-editor/helix#5012), [#4995](helix-editor/helix#4995)) - Add a logo ([dc1ec56](helix-editor/helix@dc1ec56)) - Multi-cursor completion ([#4496](helix-editor/helix#4496)) Commands: - `file_picker_in_current_directory` (`<space>F`) ([#3701](helix-editor/helix#3701)) - `:lsp-restart` to restart the current document's language server ([#3435](helix-editor/helix#3435), [#3972](helix-editor/helix#3972)) - `join_selections_space` (`A-j`) which joins selections and selects the joining whitespace ([#3549](helix-editor/helix#3549)) - `:update` to write the current file if it is modified ([#4426](helix-editor/helix#4426)) - `:lsp-workspace-command` for picking LSP commands to execute ([#3140](helix-editor/helix#3140)) - `extend_prev_word_end` - the extend variant for `move_prev_word_end` ([7468fa2](helix-editor/helix@7468fa2)) - `make_search_word_bounded` which adds regex word boundaries to the current search register value ([#4322](helix-editor/helix#4322)) - `:reload-all` - `:reload` for all open buffers ([#4663](helix-editor/helix#4663), [#4901](helix-editor/helix#4901)) - `goto_next_change` (`]g`), `goto_prev_change` (`[g`), `goto_first_change` (`[G`), `goto_last_change` (`]G`) textobjects for jumping between VCS changes ([#4650](helix-editor/helix#4650)) Usability improvements and fixes: - Don't log 'LSP not defined' errors in the logfile ([1caba2d](helix-editor/helix@1caba2d)) - Look for the external formatter program before invoking it ([#3670](helix-editor/helix#3670)) - Don't send LSP didOpen events for documents without URLs ([44b4479](helix-editor/helix@44b4479)) - Fix off-by-one in `extend_line_above` command 
([#3689](helix-editor/helix#3689)) - Use the original scroll offset when opening a split ([1acdfaa](helix-editor/helix@1acdfaa)) - Handle auto-formatting failures and save the file anyway ([#3684](helix-editor/helix#3684)) - Ensure the cursor is in view after `:reflow` ([#3733](helix-editor/helix#3733)) - Add default rulers and reflow config for git commit messages ([#3738](helix-editor/helix#3738)) - Improve grammar fetching and building output ([#3773](helix-editor/helix#3773)) - Add a `text` language to language completion ([cc47d3f](helix-editor/helix@cc47d3f)) - Improve error handling for `:set-language` ([e8add6f](helix-editor/helix@e8add6f)) - Improve error handling for `:config-reload` ([#3668](helix-editor/helix#3668)) - Improve error handling when passing improper ranges to syntax highlighting ([#3826](helix-editor/helix#3826)) - Render `<code>` tags as raw markup in markdown ([#3425](helix-editor/helix#3425)) - Remove border around the LSP code-actions popup ([#3444](helix-editor/helix#3444)) - Canonicalize the path to the runtime directory ([#3794](helix-editor/helix#3794)) - Add a `themelint` xtask for linting themes ([#3234](helix-editor/helix#3234)) - Re-sort LSP diagnostics after applying transactions ([#3895](helix-editor/helix#3895), [#4319](helix-editor/helix#4319)) - Add a command-line flag to specify the log file ([#3807](helix-editor/helix#3807)) - Track source and tag information in LSP diagnostics ([#3898](helix-editor/helix#3898), [1df32c9](helix-editor/helix@1df32c9)) - Fix theme returning to normal when exiting the `:theme` completion ([#3644](helix-editor/helix#3644)) - Improve error messages for invalid commands in the keymap ([#3931](helix-editor/helix#3931)) - Deduplicate regexs in `search_selection` command ([#3941](helix-editor/helix#3941)) - Split the finding of LSP root and config roots ([#3929](helix-editor/helix#3929)) - Ensure that the cursor is within view after auto-formatting ([#4047](helix-editor/helix#4047)) - Add 
pseudo-pending to commands with on-next-key callbacks ([#4062](helix-editor/helix#4062), [#4077](helix-editor/helix#4077)) - Add live preview to `:goto` ([#2982](helix-editor/helix#2982)) - Show regex compilation failure in a popup ([#3049](helix-editor/helix#3049)) - Add 'cycled to end' and 'no more matches' for search ([#3176](helix-editor/helix#3176), [#4101](helix-editor/helix#4101)) - Add extending behavior to tree-sitter textobjects ([#3266](helix-editor/helix#3266)) - Add `ui.gutter.selected` option for themes ([#3303](helix-editor/helix#3303)) - Make statusline mode names configurable ([#3311](helix-editor/helix#3311)) - Add a statusline element for total line count ([#3960](helix-editor/helix#3960)) - Add extending behavior to `goto_window_*` commands ([#3985](helix-editor/helix#3985)) - Fix a panic in signature help when the preview is too large ([#4030](helix-editor/helix#4030)) - Add command names to the command palette ([#4071](helix-editor/helix#4071), [#4223](helix-editor/helix#4223), [#4495](helix-editor/helix#4495)) - Find the LSP workspace root from the current document's path ([#3553](helix-editor/helix#3553)) - Add an option to skip indent-guide levels ([#3819](helix-editor/helix#3819), [2c36e33](helix-editor/helix@2c36e33)) - Change focus to modified docs on quit ([#3872](helix-editor/helix#3872)) - Respond to `USR1` signal by reloading config ([#3952](helix-editor/helix#3952)) - Exit gracefully when the close operation fails ([#4081](helix-editor/helix#4081)) - Fix goto/view center mismatch ([#4135](helix-editor/helix#4135)) - Highlight the current file picker document on idle-timeout ([#3172](helix-editor/helix#3172), [a85e386](helix-editor/helix@a85e386)) - Apply transactions to jumplist selections ([#4186](helix-editor/helix#4186), [#4227](helix-editor/helix#4227), [#4733](helix-editor/helix#4733), [#4865](helix-editor/helix#4865), [#4912](helix-editor/helix#4912), [#4965](helix-editor/helix#4965), [#4981](helix-editor/helix#4981)) - Use 
space as a separator for fuzzy matcher ([#3969](helix-editor/helix#3969)) - Overlay all diagnostics with highest severity on top ([#4113](helix-editor/helix#4113)) - Avoid re-parsing unmodified tree-sitter injections ([#4146](helix-editor/helix#4146)) - Add extending captures for indentation, re-enable python indentation ([#3382](helix-editor/helix#3382), [3e84434](helix-editor/helix@3e84434)) - Only allow either `--vsplit` or `--hsplit` CLI flags at once ([#4202](helix-editor/helix#4202)) - Fix append cursor location when selection anchor is at the end of the document ([#4147](helix-editor/helix#4147)) - Improve selection yanking message ([#4275](helix-editor/helix#4275)) - Log failures to load tree-sitter grammars as errors ([#4315](helix-editor/helix#4315)) - Fix rendering of lines longer than 65,536 columns ([#4172](helix-editor/helix#4172)) - Skip searching `.git` in `global_search` ([#4334](helix-editor/helix#4334)) - Display tree-sitter scopes in a popup ([#4337](helix-editor/helix#4337)) - Fix deleting a word from the end of the buffer ([#4328](helix-editor/helix#4328)) - Pretty print the syntax tree in `:tree-sitter-subtree` ([#4295](helix-editor/helix#4295), [#4606](helix-editor/helix#4606)) - Allow specifying suffixes for file-type detection ([#2455](helix-editor/helix#2455), [#4414](helix-editor/helix#4414)) - Fix multi-byte auto-pairs ([#4024](helix-editor/helix#4024)) - Improve sort scoring for LSP code-actions and completions ([#4134](helix-editor/helix#4134)) - Fix the handling of quotes within shellwords ([#4098](helix-editor/helix#4098)) - Fix `delete_word_backward` and `delete_word_forward` on newlines ([#4392](helix-editor/helix#4392)) - Fix 'no entry found for key' crash on `:write-all` ([#4384](helix-editor/helix#4384)) - Remove lowercase requirement for tree-sitter grammars ([#4346](helix-editor/helix#4346)) - Resolve LSP completion items on idle-timeout ([#4406](helix-editor/helix#4406), [#4797](helix-editor/helix#4797)) - Render diagnostics 
in the file picker preview ([#4324](helix-editor/helix#4324)) - Fix terminal freezing on `shell_insert_output` ([#4156](helix-editor/helix#4156)) - Allow use of the count in the repeat operator (`.`) ([#4450](helix-editor/helix#4450)) - Show the current theme name on `:theme` with no arguments ([#3740](helix-editor/helix#3740)) - Fix rendering in very large terminals ([#4318](helix-editor/helix#4318)) - Sort LSP preselected items to the top of the completion menu ([#4480](helix-editor/helix#4480)) - Trim braces and quotes from paths in goto-file ([#4370](helix-editor/helix#4370)) - Prevent automatic signature help outside of insert mode ([#4456](helix-editor/helix#4456)) - Fix freezes with external programs that process stdin and stdout concurrently ([#4180](helix-editor/helix#4180)) - Make `scroll` aware of tabs and wide characters ([#4519](helix-editor/helix#4519)) - Correctly handle escaping in `command_mode` completion ([#4316](helix-editor/helix#4316), [#4587](helix-editor/helix#4587), [#4632](helix-editor/helix#4632)) - Fix `delete_char_backward` for paired characters ([#4558](helix-editor/helix#4558)) - Fix crash from two windows editing the same document ([#4570](helix-editor/helix#4570)) - Fix pasting from the blackhole register ([#4497](helix-editor/helix#4497)) - Support LSP insertReplace completion items ([1312682](helix-editor/helix@1312682)) - Dynamically resize the line number gutter width ([#3469](helix-editor/helix#3469)) - Fix crash for unknown completion item kinds ([#4658](helix-editor/helix#4658)) - Re-enable `format_selections` for single selection ranges ([d4f5cab](helix-editor/helix@d4f5cab)) - Limit the number of in-progress tree-sitter query matches ([#4707](helix-editor/helix#4707), [#4830](helix-editor/helix#4830)) - Use the special `#` register with `increment`/`decrement` to change by range number ([#4418](helix-editor/helix#4418)) - Add a statusline element to show number of selected chars ([#4682](helix-editor/helix#4682)) - Add a 
statusline element showing global LSP diagnostic warning and error counts ([#4569](helix-editor/helix#4569)) - Add a scrollbar to popups ([#4449](helix-editor/helix#4449)) - Prefer shorter matches in fuzzy matcher scoring ([#4698](helix-editor/helix#4698)) - Use key-sequence format for command palette keybinds ([#4712](helix-editor/helix#4712)) - Remove prefix filtering from autocompletion menu ([#4578](helix-editor/helix#4578)) - Focus on the parent buffer when closing a split ([#4766](helix-editor/helix#4766)) - Handle language server termination ([#4797](helix-editor/helix#4797), [#4852](helix-editor/helix#4852)) - Allow `r`/`t`/`f` to work on tab characters ([#4817](helix-editor/helix#4817)) - Show a preview for scratch buffers in the buffer picker ([#3454](helix-editor/helix#3454)) - Set a limit of entries in the jumplist ([#4750](helix-editor/helix#4750)) - Re-use shell outputs when inserting or appending shell output ([#3465](helix-editor/helix#3465)) - Check LSP server provider capabilities ([#3554](helix-editor/helix#3554)) - Improve tree-sitter parsing performance on files with many language layers ([#4716](helix-editor/helix#4716)) - Move indentation to the next line when using `<ret>` on a line with only whitespace ([#4854](helix-editor/helix#4854)) - Remove selections for closed views from all documents ([#4888](helix-editor/helix#4888)) - Improve performance of the `:reload` command ([#4457](helix-editor/helix#4457)) - Properly handle media keys ([#4887](helix-editor/helix#4887)) - Support LSP diagnostic data field ([#4935](helix-editor/helix#4935)) - Handle C-i keycode as tab ([#4961](helix-editor/helix#4961)) - Fix view alignment for jumplist picker jumps ([#3743](helix-editor/helix#3743)) - Use OSC52 for tmux clipboard provider ([#5027](helix-editor/helix#5027)) Themes: - Add `varua` ([#3610](helix-editor/helix#3610), [#4964](helix-editor/helix#4964)) - Update `boo_berry` ([#3653](helix-editor/helix#3653)) - Add `rasmus` 
([#3728](helix-editor/helix#3728)) - Add `papercolor_dark` ([#3742](helix-editor/helix#3742)) - Update `monokai_pro_spectrum` ([#3814](helix-editor/helix#3814)) - Update `nord` ([#3792](helix-editor/helix#3792)) - Update `fleetish` ([#3844](helix-editor/helix#3844), [#4487](helix-editor/helix#4487), [#4813](helix-editor/helix#4813)) - Update `flatwhite` ([#3843](helix-editor/helix#3843)) - Add `darcula` ([#3739](helix-editor/helix#3739)) - Update `papercolor` ([#3938](helix-editor/helix#3938), [#4317](helix-editor/helix#4317)) - Add bufferline colors to multiple themes ([#3881](helix-editor/helix#3881)) - Add `gruvbox_dark_hard` ([#3948](helix-editor/helix#3948)) - Add `onedarker` ([#3980](helix-editor/helix#3980), [#4060](helix-editor/helix#4060)) - Add `dark_high_contrast` ([#3312](helix-editor/helix#3312)) - Update `bogster` ([#4121](helix-editor/helix#4121), [#4264](helix-editor/helix#4264)) - Update `sonokai` ([#4089](helix-editor/helix#4089)) - Update `ayu_*` themes ([#4140](helix-editor/helix#4140), [#4109](helix-editor/helix#4109), [#4662](helix-editor/helix#4662), [#4764](helix-editor/helix#4764)) - Update `everforest` ([#3998](helix-editor/helix#3998)) - Update `monokai_pro_octagon` ([#4247](helix-editor/helix#4247)) - Add `heisenberg` ([#4209](helix-editor/helix#4209)) - Add `bogster_light` ([#4265](helix-editor/helix#4265)) - Update `pop-dark` ([#4323](helix-editor/helix#4323)) - Update `rose_pine` ([#4221](helix-editor/helix#4221)) - Add `kanagawa` ([#4300](helix-editor/helix#4300)) - Add `hex_steel`, `hex_toxic` and `hex_lavendar` ([#4367](helix-editor/helix#4367), [#4990](helix-editor/helix#4990)) - Update `tokyonight` and `tokyonight_storm` ([#4415](helix-editor/helix#4415)) - Update `gruvbox` ([#4626](helix-editor/helix#4626)) - Update `dark_plus` ([#4661](helix-editor/helix#4661), [#4678](helix-editor/helix#4678)) - Add `zenburn` ([#4613](helix-editor/helix#4613), [#4977](helix-editor/helix#4977)) - Update `monokai_pro` 
([#4789](helix-editor/helix#4789)) - Add `mellow` ([#4770](helix-editor/helix#4770)) - Add `nightfox` ([#4769](helix-editor/helix#4769), [#4966](helix-editor/helix#4966)) - Update `doom_acario_dark` ([#4979](helix-editor/helix#4979)) - Update `autumn` ([#4996](helix-editor/helix#4996)) - Update `acme` ([#4999](helix-editor/helix#4999)) - Update `nord_light` ([#4999](helix-editor/helix#4999)) - Update `serika_*` ([#5015](helix-editor/helix#5015)) LSP configurations: - Switch to `openscad-lsp` for OpenScad ([#3750](helix-editor/helix#3750)) - Support Jsonnet ([#3748](helix-editor/helix#3748)) - Support Markdown ([#3499](helix-editor/helix#3499)) - Support Bass ([#3771](helix-editor/helix#3771)) - Set roots configuration for Elixir and HEEx ([#3917](helix-editor/helix#3917), [#3959](helix-editor/helix#3959)) - Support Purescript ([#4242](helix-editor/helix#4242)) - Set roots configuration for Julia ([#4361](helix-editor/helix#4361)) - Support D ([#4372](helix-editor/helix#4372)) - Increase default language server timeout for Julia ([#4575](helix-editor/helix#4575)) - Use ElixirLS for HEEx ([#4679](helix-editor/helix#4679)) - Support Bicep ([#4403](helix-editor/helix#4403)) - Switch to `nil` for Nix ([433ccef](helix-editor/helix@433ccef)) - Support QML ([#4842](helix-editor/helix#4842)) - Enable auto-format for CSS ([#4987](helix-editor/helix#4987)) - Support CommonLisp ([4176769](helix-editor/helix@4176769)) New languages: - SML ([#3692](helix-editor/helix#3692)) - Jsonnet ([#3714](helix-editor/helix#3714)) - Godot resource ([#3759](helix-editor/helix#3759)) - Astro ([#3829](helix-editor/helix#3829)) - SSH config ([#2455](helix-editor/helix#2455), [#4538](helix-editor/helix#4538)) - Bass ([#3771](helix-editor/helix#3771)) - WAT (WebAssembly text format) ([#4040](helix-editor/helix#4040), [#4542](helix-editor/helix#4542)) - Purescript ([#4242](helix-editor/helix#4242)) - D ([#4372](helix-editor/helix#4372), [#4562](helix-editor/helix#4562)) - VHS 
([#4486](helix-editor/helix#4486)) - KDL ([#4481](helix-editor/helix#4481)) - XML ([#4518](helix-editor/helix#4518)) - WIT ([#4525](helix-editor/helix#4525)) - ENV ([#4536](helix-editor/helix#4536)) - INI ([#4538](helix-editor/helix#4538)) - Bicep ([#4403](helix-editor/helix#4403), [#4751](helix-editor/helix#4751)) - QML ([#4842](helix-editor/helix#4842)) - CommonLisp ([4176769](helix-editor/helix@4176769)) Updated languages and queries: - Zig ([#3621](helix-editor/helix#3621), [#4745](helix-editor/helix#4745)) - Rust ([#3647](helix-editor/helix#3647), [#3729](helix-editor/helix#3729), [#3927](helix-editor/helix#3927), [#4073](helix-editor/helix#4073), [#4510](helix-editor/helix#4510), [#4659](helix-editor/helix#4659), [#4717](helix-editor/helix#4717)) - Solidity ([20ed8c2](helix-editor/helix@20ed8c2)) - Fish ([#3704](helix-editor/helix#3704)) - Elixir ([#3645](helix-editor/helix#3645), [#4333](helix-editor/helix#4333), [#4821](helix-editor/helix#4821)) - Diff ([#3708](helix-editor/helix#3708)) - Nix ([665e27f](helix-editor/helix@665e27f), [1fe3273](helix-editor/helix@1fe3273)) - Markdown ([#3749](helix-editor/helix#3749), [#4078](helix-editor/helix#4078), [#4483](helix-editor/helix#4483), [#4478](helix-editor/helix#4478)) - GDScript ([#3760](helix-editor/helix#3760)) - JSX and TSX ([#3853](helix-editor/helix#3853), [#3973](helix-editor/helix#3973)) - Ruby ([#3976](helix-editor/helix#3976), [#4601](helix-editor/helix#4601)) - R ([#4031](helix-editor/helix#4031)) - WGSL ([#3996](helix-editor/helix#3996), [#4079](helix-editor/helix#4079)) - C# ([#4118](helix-editor/helix#4118), [#4281](helix-editor/helix#4281), [#4213](helix-editor/helix#4213)) - Twig ([#4176](helix-editor/helix#4176)) - Lua ([#3552](helix-editor/helix#3552)) - C/C++ ([#4079](helix-editor/helix#4079), [#4278](helix-editor/helix#4278), [#4282](helix-editor/helix#4282)) - Cairo ([17488f1](helix-editor/helix@17488f1), [431f9c1](helix-editor/helix@431f9c1), [09a6df1](helix-editor/helix@09a6df1)) - 
Rescript ([#4356](helix-editor/helix#4356)) - Zig ([#4409](helix-editor/helix#4409)) - Scala ([#4353](helix-editor/helix#4353), [#4697](helix-editor/helix#4697), [#4701](helix-editor/helix#4701)) - LaTeX ([#4528](helix-editor/helix#4528), [#4922](helix-editor/helix#4922)) - SQL ([#4529](helix-editor/helix#4529)) - Python ([#4560](helix-editor/helix#4560)) - Bash/Zsh ([#4582](helix-editor/helix#4582)) - Nu ([#4583](helix-editor/helix#4583)) - Julia ([#4588](helix-editor/helix#4588)) - Typescript ([#4703](helix-editor/helix#4703)) - Meson ([#4572](helix-editor/helix#4572)) - Haskell ([#4800](helix-editor/helix#4800)) - CMake ([#4809](helix-editor/helix#4809)) - HTML ([#4829](helix-editor/helix#4829), [#4881](helix-editor/helix#4881)) - Java ([#4886](helix-editor/helix#4886)) - Go ([#4906](helix-editor/helix#4906), [#4969](helix-editor/helix#4969), [#5010](helix-editor/helix#5010)) - CSS ([#4882](helix-editor/helix#4882)) - Racket ([#4915](helix-editor/helix#4915)) - SCSS ([#5003](helix-editor/helix#5003)) Packaging: - Filter relevant source files in the Nix flake ([#3657](helix-editor/helix#3657)) - Build a binary for `aarch64-linux` in the release CI ([038a91d](helix-editor/helix@038a91d)) - Build an AppImage for `aarch64-linux` in the release CI ([b738031](helix-editor/helix@b738031)) - Enable CI builds for `riscv64-linux` ([#3685](helix-editor/helix#3685)) - Support preview releases in CI ([0090a2d](helix-editor/helix@0090a2d)) - Strip binaries built in CI ([#3780](helix-editor/helix#3780)) - Fix the development shell for the Nix Flake on `aarch64-darwin` ([#3810](helix-editor/helix#3810)) - Raise the MSRV and create an MSRV policy ([#3896](helix-editor/helix#3896), [#3913](helix-editor/helix#3913), [#3961](helix-editor/helix#3961)) - Fix Fish completions for `--config` and `--log` flags ([#3912](helix-editor/helix#3912)) - Use builtin filenames option in Bash completion ([#4648](helix-editor/helix#4648))
helix-editor#4716) * significantly improve treesitter performance while editing large files * Apply stylistic suggestions from code review Co-authored-by: Michael Davis <[email protected]> * use PartialEq and Hash instead of a freestanding function Co-authored-by: Michael Davis <[email protected]>
While working on a separate PR related to treesitter performance, I noticed that editing text is extremely slow for large files with loads of injections.
The test case I was looking at is the largest file in the Linux kernel: `drivers/gpu/drm/amd/include/asic_reg/dcn/dcn_3_2_0_sh_mask.h`.
From the flamegraph (see below) I noticed that most of the time is actually spent inside helix comparing language layers.
After some digging I found that helix checks if an injection layer is already present (and hence can be incrementally updated) by iterating the entire list of injections. This has a worst-case performance of `O(N^2)`. For example, the Linux kernel header I was investigating contains 27859 injections (comments), all inside the top layer, so about 776 million comparisons are performed (on every edit).
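As a rough illustration of where that number comes from, the sketch below counts comparisons for a front-to-back scan and evaluates the `N * N` upper bound. The function names are hypothetical, and plain `u32` values stand in for language layers.

```rust
/// Counts the comparisons made when each wanted item is located by
/// scanning `existing` from the front, stopping at the first match.
fn count_linear_search_comparisons(existing: &[u32], wanted: &[u32]) -> u64 {
    let mut comparisons = 0;
    for w in wanted {
        for e in existing {
            comparisons += 1;
            if e == w {
                break;
            }
        }
    }
    comparisons
}

/// Upper bound: `n` lookups that each scan all `n` entries.
fn worst_case_comparisons(n: u64) -> u64 {
    n * n
}
```

For `n = 27859` the bound is 776,123,881, which matches the "about 776 million" figure quoted above.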
This PR fixes this problem by replacing the linear search with a hash table that is built at the beginning of the update function.
The hash table allows replacing the linear search with `O(1)` lookups, so the time complexity is decreased to `O(N)`. I am using a `RawTable` from `hashbrown` here because I did not want to change the collection of `layers` (the closest substitute is an `IndexMap`, but that does not allow reusing indices like the `HopSlotMap`). After this change there is still some delay in the huge Linux kernel header while typing, but now typing a single letter takes maybe 0.4s compared to 2s previously (very unscientifically benchmarked).
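The overall shape of the change (build a table once, then do one hashed lookup per injection) can be sketched as follows. This is a simplified illustration rather than the PR's implementation: `Layer` and `match_layers` are hypothetical, and a standard `HashMap` with borrowed keys is used instead of `RawTable`, which only works here because `existing` is not mutated while the map is alive.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a language layer: a language id plus byte ranges.
#[derive(PartialEq, Eq, Hash, Debug)]
struct Layer {
    language: u32,
    ranges: Vec<(usize, usize)>,
}

/// For each wanted layer, returns the index of an identical existing layer
/// (which could then be updated incrementally), or `None` if a new layer
/// would have to be created.
fn match_layers(existing: &[Layer], wanted: &[Layer]) -> Vec<Option<usize>> {
    // Built once per update: every existing layer is hashed exactly once.
    let mut index: HashMap<&Layer, usize> = HashMap::with_capacity(existing.len());
    for (i, layer) in existing.iter().enumerate() {
        index.insert(layer, i);
    }
    // Each lookup is O(1) on average, so the whole pass is O(N).
    wanted.iter().map(|layer| index.get(layer).copied()).collect()
}
```

Compared with scanning `existing` for every wanted layer, the total work grows linearly with the number of layers instead of quadratically.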
Looking at the flamegraph (see below), the bottleneck is now treesitter itself (which could potentially be sped up in the future by setting a byte range on the query cursor, although that will be more challenging).
While the file provided here was an extreme case, this should be a win (or not matter) for all languages/files.