Support a "data"-like field on CompletionList that is also returned to the server in completionItem/resolve to avoid duplication in CompletionItem.data #1802
Comments
Just remember that if we are going to specify a merge operation the client must perform, then the exact merge behaviour and semantics must be fully specified by the protocol. Saying 'what JavaScript does' is not helpful if the client isn't JavaScript. |
I do see the need for something like this for performance reasons. But before we go down this path, can you collect some numbers that show how much this will speed things up (e.g. the time it takes to show the completion in the UI)? It will complicate the protocol and add effort for clients to implement this, and I want to ensure it is worth the effort. |
It's difficult to give concrete numbers because it varies a lot across environments and payloads, and I can't currently measure what difference it would make without implementing it. My main motivation is to reduce completion payload sizes because currently they can be a few megabytes (more on that below) and in Codespaces this can be really slow (like 5-6 seconds). I'm not sure why it's so slow, but I see the data over the websocket is batched into 256kb chunks and the timestamps seem much further apart than I'd expect. There are a number of contributing factors to the payload being so large:
There are many other things making the payload large which I'm working on, but being able to round-trip some context without duplicating it on every item will definitely help. The whole purpose of Another possibility would be to allow the server to keep some state between If you have other ideas that would be better, I'm all ears :-) |
Have you ever tried not sending the data at all but instead doing the following:
I know that this holds onto memory on the server; however, code complete requests are so frequent that you can always free the data held for the last request when a new one arrives. This would benefit not only the communication between servers and the extension host but also VS Code in a remote setup. ItemDefaults are an LSP concept and don't exist in VS Code. The data is basically inflated when VS Code talks from the extension host to the renderer. |
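A minimal sketch of the server-side idea described above, assuming the server keeps state only for the most recent completion request and that each item's data shrinks to an index into that state. The class name `LastCompletionCache` and its methods are hypothetical and not part of any LSP library.

```typescript
// Hypothetical item shape: `data` carries only an index, not the full context.
interface Item {
  label: string;
  data?: number;
}

class LastCompletionCache<C> {
  private contexts: C[] = [];

  // Called for each textDocument/completion request.
  // Replacing the array frees the state held for the previous request.
  complete(entries: { label: string; context: C }[]): Item[] {
    this.contexts = entries.map((entry) => entry.context);
    return entries.map((entry, index) => ({ label: entry.label, data: index }));
  }

  // Called for completionItem/resolve: recover the context by index.
  lookup(item: Item): C | undefined {
    return item.data === undefined ? undefined : this.contexts[item.data];
  }
}
```

One consequence of this design is that an item from an older list is resolved against whatever the newest list stored at the same index, so it only works if clients resolve items from the latest request.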
I've thought about this a few times (I think we actually did this in the past) but I wasn't sure how reasonable it was for a server to only support resolve for the "last completion request". I can imagine some situations where this might have issues:
I don't think this would affect VS Code (I don't think it filters client-side when it's sending another request?) and it's probably likely that the new completion request would complete and a new item resolved anyway, so perhaps that's not an issue. I'll think about this a bit more and maybe try it out and see how it works. |
For me I ran a test with a whole bunch of completions. Something that returned over 900 items for a completion list. With the data not being passed (I was only sending the |
@dbaeumer if we decide to do this, could we add something to the spec about it? Right now the spec doesn't say anything about whether calling Ideally, the spec would say that clients should only call it for the last one, because that would avoid needing to even round-trip some identifier (which would still be a bit of junk to duplicate on every single completion item in a large list). |
@DanTup only allowing a completion item from the last completion request to be resolved makes total sense to me. |
I think per filename would make more sense. You could have a completion running in one editor and switch away to make a change in another editor. Same with signature help. |
@dbaeumer great, I'll open a PR (and post a link back here in case others in this thread have feedback/suggestions).
This wouldn't be possible without But it also means the server would have to keep a lot more state (the last completion for every file). I think if you switch to another editor and back, it's probably reasonable to just re-invoke code completion (VS Code already does not keep the completion widget open if you switch editor). |
…t completion request See microsoft#1802 (comment)
I opened a PR here that says |
I agree with @DanTup here. When switching editors the client should actually cancel the last completion request, since its result might no longer be correct anyway. |
@dbaeumer are you happy with #1834? (there's a "Community PR Approvals" check that seems stuck?). I'd prefer to have that merged before I start making server changes assuming that's good 🙂 (if we merge that, I think we can close this, as keeping the state on the server provides the ability to do everything that this would) |
We had a longer discussion about that problem, and due to the fact that code could talk directly to the server we can't spec that a resolve request can only be sent for items from the last code complete request (although this is the case 99% of the time). The only way I can see to tackle this is to have an explicit release call that the client can send to the server. Something like this:
This will allow servers to keep state for a completion item on the server instead of attaching all state to the completion item itself. This approach however has to go behind a capability flag but it is implementable for VS Code. |
@dbaeumer do you mean releasing each completion item? Adding a unique ID to each completion item feels like it's going to add more to the payload that the goal was to remove. I wonder if we'd be better trying to do the original plan here (a mergeable data) instead? Or, how about a new field ( Something like that seems way simpler - both for LSP/spec, and for servers (no need to keep state, worry about it not being cleaned up, no extra per-item data to track IDs). |
Yes. But I doubt that this will add more data, since a single ID field / property would make the whole data property go away on these completion items. I am pretty sure that in total that will be a smaller payload in the cases where servers add state to the data property. The problem with the merge is that users will ask for more and more complicated merging algorithms. The next thing I already see users asking for is to allow templating paths, since the majority of the paths only differ in small parts. I do understand the need to lower the payload but I am not convinced that merging is the right solution. |
It would definitely be smaller than it is today, but it still feels needlessly verbose. My current goal is to strip everything I don't need from the payload, so trading a large property for a small property across a potentially large number of items is not as good as removing it :-)
To be clear, my last suggestion above involved no merge. I was asking that we have a second field (in addition to
textDocument/completion result:
{
  "context": {
    "foo": "bar"
  },
  "items": [
    {
      "label": "...",
      "data": { "a": "b" }
    }
  ]
}
completionItem/resolve:
{
  "label": "...",
  "data": { "a": "b" },
  "context": {
    "foo": "bar"
  }
}
This seems like a much simpler solution than having to release completion items (something I'm not sure clients would bother to implement), has no complexity of merge, and has no restrictions on the ability to call Edit: For my specific case, even just sending the original completion arguments as |
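As a rough sketch of what the client side of this round-trip could look like, assuming a list-level `context` field as in the JSON above (the field and the helper `toResolveParams` are hypothetical here, not existing LSP API): the client stores the context once per list and attaches it to each resolve request without touching the item's own data.

```typescript
// Local, simplified shapes; `context` is the hypothetical field from the proposal.
interface CompletionItem {
  label: string;
  data?: unknown;
}

interface CompletionListWithContext {
  context?: unknown;
  items: CompletionItem[];
}

interface ResolveParams extends CompletionItem {
  context?: unknown;
}

// Build the completionItem/resolve payload: the item is sent unchanged and
// the list-level context is attached alongside it. No merging is involved.
function toResolveParams(
  list: CompletionListWithContext,
  item: CompletionItem
): ResolveParams {
  return list.context === undefined
    ? { ...item }
    : { ...item, context: list.context };
}
```

The context crosses the wire once in the completion response and once per resolve request, instead of once per item in the list.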
For what it's worth, from a client implementation perspective I'd much rather have an additional property in the |
@dbaeumer any thoughts/opinions on the above? |
I am still not convinced that this will drastically reduce the amount of data servers add to the data field of a completion item. @DanTup could you provide some examples of before and after? @dibarbet and @jdneo do you have any insights / numbers you can share about the payload C# / Java encodes into the data field, and whether such a context on the completion list would help lower the payload significantly? |
@dbaeumer it's difficult to give specifics because it depends very much on the specific context (for example how many completion items there are, how many of them need context to provide auto-imports, etc.) but as an example, I just created a new file in a Flutter project which has no dependencies other than Flutter and invoking completion at the top level has the file path repeated 6220 times: That's 78 * 6220 = 485,160 characters just to provide the server with the filename of where it will be inserting This number will go down if some of the items are already imported (because they won't need this in Of course, this can be reduced with (I'm aware there are other savings to be made in the screenshot above - I've made some that haven't shipped in the SDK I'm using, and I've still some to make :) ) |
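The arithmetic above can be checked with a few lines, assuming 78 is the number of characters each occurrence costs and that a list-level field would carry the value exactly once. The helper name `duplicatedChars` is made up for illustration.

```typescript
// Characters spent repeating one value of a given length across many items.
function duplicatedChars(charsPerItem: number, itemCount: number): number {
  return charsPerItem * itemCount;
}

const repeated = duplicatedChars(78, 6220); // 485160
const sentOnce = duplicatedChars(78, 1); // 78
const saved = repeated - sentOnce; // 485082

console.log({ repeated, sentOnce, saved });
```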
From the Java side, we don't have such a request for now. But just several months ago, we did some refactoring to remove some unnecessary data fields from the completion items, which helped to improve completion performance. One example: we previously stored the document uri in the data field of each completion item (which looks similar to what @DanTup mentioned for Dart), then found that it was not necessary, so we removed it from the data field. After removing that uri string, triggering completion via 'S' in the Spring Petclinic project, the response (textDocument/completion) payload size was reduced from 3.05MB to 2.63MB (measured by directly copying the trace string to a text file), and completion became a little faster. More details: eclipse-jdtls/eclipse.jdt.ls#2614. |
@jdneo do I understand correctly that you're stitching this data back in in the LSP client? (eg., it won't work for other generic LSP clients)? I was also considering something like that for Dart if LSP doesn't support it, but it seems silly not to include it in LSP if clients and servers are going to build custom support for exactly this anyway. If both Dart and Java benefit significantly from extracting this from |
In our case, the uri is used during the resolve stage. The way we removed that uri is: for every completion item, we generate a unique id. We add a cache on the server side that maps id -> the context of the completion item (things like uri, etc.). Then we only put that id in the data field. At the completion resolve stage, the server can recover the context via the id and do further tasks. |
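A small sketch of an id -> context cache along the lines described above. This is not the eclipse.jdt.ls implementation; the class `ContextCache`, its method names, and the cleanup policy (clearing on each new completion request) are assumptions for illustration.

```typescript
class ContextCache<C> {
  // Ids keep increasing across requests, so an id from a cleared request
  // can never collide with a live entry.
  private nextId = 0;
  private readonly byId = new Map<number, C>();

  // Store the context and return the id that goes into the item's data field.
  register(context: C): number {
    const id = this.nextId++;
    this.byId.set(id, context);
    return id;
  }

  // Resolve stage: recover the context, or undefined if it was released.
  resolve(id: number): C | undefined {
    return this.byId.get(id);
  }

  // Assumed cleanup policy: call this when a new completion request starts.
  reset(): void {
    this.byId.clear();
  }
}
```

Because ids are never reused, a resolve for a stale item yields `undefined` rather than some other item's context, which the server can treat as "nothing further to resolve".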
@jdneo what happens for items that are never resolved? When do you clean up the context? There were some suggestions about this above, but it seemed complicated to manage releasing the context, which is why I was advocating for just adding a new field to |
@dbaeumer the issue of payload sizes came up again and I'd like to make some progress on this. Would you accept PRs based on the above? To summarize, my understanding is: Add to
|
can we do this for every feature that has they all have the exact same issue. since people usually don't want to hold onto state that is not clear when to invalidate, all those features, so, this approach could be generally helpful for any feature that support that said, I think another generic approach to reduce the size of any LSP message is providing a way to intern message data. Basically, making any message to have so rather than returning
one could return
that way, when a server returns any message, it can get rid of any duplicated messages in any way that fits it. and |
@dbaeumer do you have any opinions on my proposal above? (#1802 (comment)) I'm happy to open PRs if we have a general agreement on this (otherwise I may do it using custom flags/middleware - but that's not ideal because it won't be supported by clients other than VS Code). |
@DanTup now that I read the issue again I am a little bit puzzled why we need a |
@dbaeumer the issue is that there are two sets of Concretely, to support In my original comment I suggested supporting merging those, but in #1802 (comment) you didn't seem keen on that. |
@DanTup now that I finally understand what the problem is (your original description with a new
Will this solve your issue? |
@dbaeumer yes, it would :) But some comments:
Doesn't there need to be a capability for this anyway (so the server knows the client supports
{
  itemDefaults: { data: { foo: "bar" } },
  items: [
    { label: 'x' }, // gets itemDefaults.data
    { label: 'y', data: null } // does not get itemDefaults.data
  ],
}
Again, I don't think that's as critical, but might be nice to support. Thanks! |
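The distinction in the example above (a missing `data` property versus an explicit `data: null`) can be sketched as a property-presence check. The function `effectiveData` is hypothetical and only illustrates the behaviour being asked for, not what any client actually does.

```typescript
interface ItemDefaults {
  data?: unknown;
}

interface Item {
  label: string;
  data?: unknown;
}

// An item that has no `data` property at all inherits the default;
// an item that explicitly sets `data` (even to null) keeps its own value.
function effectiveData(defaults: ItemDefaults | undefined, item: Item): unknown {
  return "data" in item ? item.data : defaults?.data;
}

const itemDefaults: ItemDefaults = { data: { foo: "bar" } };
console.log(effectiveData(itemDefaults, { label: "x" }));
console.log(effectiveData(itemDefaults, { label: "y", data: null }));
```

Checking for the property rather than comparing against `undefined` matters once the payload has been through JSON, where `null` survives serialization but `undefined` properties are dropped.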
Actually you are correct that we need a client capability to ensure that servers don't request a Are you still up for a PR? |
Yep! I'm still a bit unsure about Otherwise, if you want to use So I think my suggestion would be either,
And some ability for the client to indicate which fields it supports |
Then I would go with
but only one client capability. If it supports applyKind it must support it for data and commitCharacters |
Is it possible in future there might be new fields that might make sense to support merge? If so, wouldn't using a single flag make it more difficult to add that? |
…om `completionList.itemDefaults` and `completion` are combined. Fixes microsoft#1802
@dbaeumer I've started a PR at #2018 for feedback. I use a list as noted above for the reasons I gave, but happy to change if you think we shouldn't do this (I slightly worry that in future there may be new fields we want to support merge for, but the boolean doesn't let us add them or the |
Yes, but the new field must be guarded by a capability. If a client supports the new field and merge it MUST then support the new field in the |
Ok, I'll change this to just a flag then - thanks! |
Oh, one thing to note is that this means when new fields are added, it must be decided in that same version if they will get |
My vote would also go for the previous approach with |
I think @dbaeumer's point is that a client needs to be updated to use any new completion fields, so at the point of adding support for a new field in a client, if that client supports merge (and it is a mergeable field), the client should also add support for merging it. However, I do also think listing the fields may be slightly better, because it's harder to accidentally do it wrong (support a new field but forget to support merge for it). |
I think it is fair to request this from clients. And having separate lists makes it harder for servers. They need to honor them :-) |
…d per-item commitCharacters/data are combined Implements the changes in the LSP spec PR at microsoft/language-server-protocol#2018. (Also see microsoft/language-server-protocol#1802)
…d per-item commitCharacters/data are combined (#1558) * Add support for CompletionList "applyKind" to control how defaults and per-item commitCharacters/data are combined Implements the changes in the LSP spec PR at microsoft/language-server-protocol#2018. (Also see microsoft/language-server-protocol#1802) * Update meta model * Add non-null falsy test * Change ApplyKind to ints * Tweaks + typos --------- Co-authored-by: Dirk Bäumer <[email protected]>
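To make the "applyKind" idea concrete, here is a hedged sketch of how a client might combine `itemDefaults` with per-item `commitCharacters` and `data`. The types are declared locally rather than imported from a library, the numeric values of `ApplyKind` are an assumption, and the merge rules (union for `commitCharacters`, shallow object merge for `data`) are one plausible reading of the discussion, not a quotation of the specification.

```typescript
// Integer kinds, as the commit message suggests; the exact values are assumed.
const ApplyKind = { Replace: 1, Merge: 2 } as const;
type ApplyKind = (typeof ApplyKind)[keyof typeof ApplyKind];

interface Item {
  label: string;
  commitCharacters?: string[];
  data?: unknown;
}

interface List {
  itemDefaults?: { commitCharacters?: string[]; data?: unknown };
  applyKind?: { commitCharacters?: ApplyKind; data?: ApplyKind };
  items: Item[];
}

function isObject(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

function applyDefaults(list: List, item: Item): Item {
  const defaults = list.itemDefaults ?? {};
  const result: Item = { ...item };

  if (list.applyKind?.commitCharacters === ApplyKind.Merge) {
    // Merge: union of default and item characters, duplicates removed.
    const all = [...(defaults.commitCharacters ?? []), ...(item.commitCharacters ?? [])];
    if (all.length > 0) {
      result.commitCharacters = all.filter((c, i) => all.indexOf(c) === i);
    }
  } else if (item.commitCharacters === undefined && defaults.commitCharacters) {
    // Replace: the default is used only when the item has no value.
    result.commitCharacters = defaults.commitCharacters;
  }

  if (
    list.applyKind?.data === ApplyKind.Merge &&
    isObject(defaults.data) &&
    isObject(item.data)
  ) {
    // Merge: shallow merge, item keys win over default keys.
    result.data = { ...defaults.data, ...item.data };
  } else if (item.data === undefined && defaults.data !== undefined) {
    result.data = defaults.data;
  }

  return result;
}
```

With a single client capability, a client that advertises merge support would be expected to handle both fields in this way.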
(Edit 2023-12-04: This request has changed slightly throughout the discussion - see #1802 (comment) below)
(from microsoft/vscode-languageserver-node#1237 (comment))
There's an itemDefaults.data field for completion that allows data to be included once in a completion response rather than duplicated across all items.
For Dart, the data field contains a mix of data that is the same for all items (e.g. the file the completion is being inserted into, so we can compute edits for adding imports where required, since /resolve doesn't get any context) and data that is different (an ID to get back to the element being inserted so we can resolve things like documentation). Since itemDefaults.data replaces the whole of data we can't use it here, so we end up with a large payload with a lot of duplicated info.
It would be very helpful to have an option to merge data from items over the default (for ex. Object.assign(itemDefaults.data, item.data)?).
I'm happy to send PRs for this, but I want to agree an approach first:
data or all fields?
itemDefaults.mergedData?)
Object.assign(itemDefaults.data, item.data) flexible enough? (you can use null to erase something from the defaults for a given item?)
mergedData) in the completionList.itemDefaults set?
@dbaeumer WDYT?
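Two details of `Object.assign` are worth spelling out when reading the proposal above: it writes into its first argument, and a `null` value overwrites a key rather than removing it. The sketch below uses a hypothetical helper `mergeData` that copies into a fresh object so the shared defaults are left untouched.

```typescript
type Data = Record<string, unknown>;

// Non-mutating merge: item keys win, defaults are left as they were.
function mergeData(defaults: Data, itemData: Data): Data {
  return Object.assign({}, defaults, itemData);
}

// Passing the shared defaults as the target mutates them, so a key from
// one item would leak into every item merged afterwards.
const shared: Data = { file: "/lib/main.dart" };
const leaked = Object.assign(shared, { id: 1 });
console.log(leaked === shared); // true: same object, now carrying `id`

// A null value replaces the default; the key is still present afterwards.
const withNull = mergeData({ file: "/lib/main.dart", docs: true }, { docs: null });
console.log("docs" in withNull, withNull.docs); // true null
```

A client written in another language would need these rules stated explicitly, which is the concern raised earlier in the thread about relying on "what JavaScript does".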