-
Notifications
You must be signed in to change notification settings - Fork 826
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Completion filtering and sorting #898
Comments
Also cf. #651 |
Clangd does this, and completion quality is way worse in VSCode compared to other clients once client-side caching/reranking kicks in. Client-side caching is great for latency, and re-ranking is important for relevance. I've sketched a potential solution in #348. In brief:
|
The completion list is, well, a list, so there is an intrinsic order. How could a numerical score convey more information than the relative position of each candidate, given that the editor can't access the server's scoring criteria? I think there's no further information beyond the candidate order that the editor can reliably operate with. See #348 (comment) for my reasoning.
LSP allows the server to attach a basically arbitrary Moreover, the kind of fuzzy matching I want for LaTeX citations just won't work if there is client-side caching. I can expand on this, but maybe the animated GIF here is enough. In any case, this could be a third bit of information the server provides in the completion response: whether or not the client is advised to cache the completion candidates. |
I see many cases where a client-side sorting/filtering is more relevant than a client-side one; especially for "simpler" language servers. But I also see many cases where the server sorting can be smarter, typically when it uses things like type or value inference to sort results (the IDE cannot do that, it's language logic inside the LS). So I definitely agree there is a need for clarification in the spec to decide when client should filter/sort or when it should just keep results "as it". |
That would also work. The alternative is to include a
This doesn't make much sense, since "sortedness" is a property of the completion list, not of an individual completion item. My proposal is to include a |
I need to think about all these cases but I still believe that most of it should be possible using a combination of
Can you provide me with use cases where this still results in missing expressiveness (minus the fact that we need to set the sortText property) |
As discussed above, it would be convenient for the editor to know when it's advised not to cache and reuse completion results. Your proposed interpretation of the If the server wants to trick the editor into not resorting, it's easy to come up with suitable However, I don't see any reliable way to trick the editor into not filtering out candidates (which the server needs to do whenever it uses some fancy fuzzy-matching mechanism which is hard to replicate editor-side). In any case, aren't "filteredness" and "sortedness" pretty fundamental properties of the completion list? I think they are, since I can't imagine another way to make fuzzy completions work. Assuming you agree, I don't see why you would want to avoid conveying this information directly, in favor of something hacky like PS: |
@dbaeumer: that's indeed expressive enough. It does have a couple of downsides:
If the spec was updated to explicitly mention this interaction (and VSCode followed it) that would work fine for us. I'm happy to send a PR with some text if that's useful. |
I just realized that what @dbaeumer described above is basically the semantic
If merging the results from different servers is a supported use case of LSP, then setting a fabricated I don't necessarily think this is important in practice, but it's one example of why abusing the
What's your procedure/workaround to avoid undesired client-side filtering? |
I agree that the naming could have been better and I am willing to improve the documentation around it but it should be implemented in the way it is speced. If set to true the client is not allowed to reused an cached version from the server. Whether the client filters or not will very likely depend how the completion items are generated. If the are replaces of the start of the method client might still filter. If they are generated for the current cursor position a client shouldn't filter. @sam-mccall can you provide me with a concrete example where VS Code is working incorrectly in this regard. |
IMO the following should work:
|
I always thought that the ideal scenario is for completion to transition from Maybe @jrieken could weigh in what the VSCode perspective is on this? |
By that I understand that the completion items would be constructed so that 0 characters before the cursor need to be deleted. This is incompatible with fuzzy-matching, since in this case what the user types is not literally what's supposed to be inserted. Moreover, I think this is incompatible with most completion UIs — typically the completion box is placed so as to convey what the current "completion subject" is. See for instance this screenshot from Jedi's README page (here the completon subject is the |
@astoff The |
This thread is becoming hard to follow. While I generally side with @astoff that there is insufficient support in the spec to properly support fuzzy-style completion (and other non-prefix matching strategies), I give the benefit of the doubt to those that disagree, at least until I fully understand their proposals. @dbaeumer, you seem to be in the latter group. I am the author of an LSP client, but I have no clear idea how to implement your strategy based on existing fields to solve this problem. Here are two concrete questions:
I think everybody that, unlike me, actually has a concrete idea of how this could work should provide a minimal example of client-server communication that illustrates their case. |
There are a couple if competing goals here and I do agree that the spec can be improved to describe what should happen when (especially if someone implements client which has no good support in the spec right now). So I try to explain what the thinking behind this is:
I do see that language want to opt out of this either always or in certain cases. They can do this today. The model we came up with is as follows. Completion item provides an insertText / label without a text edit: in the model the client should filter against what the user has already typed using the word boundary rules of the language (e.g. resolving the word under the cursor position). The reason for this mode is that it makes it extremely easy for a server to implement a basic completion list and get it filtered on the client. Completion Item with text edits: in this mode the server tells the client that it actually knows what it is doing. If you create a completion item with a text edit at the current cursor position no word guessing takes place and no filtering should happen. This mode can be combined with a sort text and filter text to customize two things. If the text edit can is a replace edit then the range denotes the word used for filtering. If the replace changes the text it most likely makes sense to specify a filter text to be used. Here are some (hard coded) examples of these: connection.onCompletion((params, token): CompletionItem[] => {
const result: CompletionItem[] = [];
let item = CompletionItem.create('foo');
result.push(item);
item = CompletionItem.create('foo-text');
item.insertText = 'foo-text';
result.push(item);
item = CompletionItem.create('foo-text-range-insert');
item.textEdit = TextEdit.insert(params.position, 'foo-text-range-insert');
result.push(item);
item = CompletionItem.create('foo-text-range-replace');
item.textEdit = TextEdit.replace(
Range.create(Position.create(params.position.line, params.position.character - 1), params.position),
'foo-text-range-replace'
);
item.filterText = 'b';
result.push(item);
return result;
}); The first two are filter if the typed word doesn't match the text to be inserted. However the second two do since they specify edits. Since the last one is a backwards replacing completion item the filter text is set to the denoted word in the buffer (in the example hard coded to |
I realized now an important difference between VS Code and (most?) other editors. VS Code provides no visual indication of what the "completion prefix" is. There's no way for the user to know in advance that selecting the fist item in the screenshot will delete the "b", and selecting the second won't. This allows each completion item to have a different prefix. Whether that's desirable is of course up to debate; "hippie expand" on Emacs works that way and it can be confusing. Every other editor that I've seen (including Emacs's "normal" completion) require one single completion prefix for all candidates, among other reasons to decide where place the popup (cf. the Jedi-on-Vim screenshot from my previous comment). In any case, apart from this possible incompatibility with most editors, the above interpretation of |
Thanks for the prompt answer, @dbaeumer
It is very interesting that you mention this, and indeed, in my opinion, it is the crux of the difficulties I and @astoff are facing. How are clients and servers to agree on exactly what the "word boundaries" are? I think we already agree that this is an important agreement to have to resolve tooltip placing issues and completion colouring issues. So shouldn't they communicate this between themselves, formally, in some manner? For some languages the "rules" are relatively obvious, for others, it isn't. And even if they are obvious, some servers will want to complete other syntactic constructs, not just "words". This is, I think, the case with @astoff 's server. In other words, completion, in general, is the selection of one of many possible transformations over a region/range of text in the document. Often the text within that range (if there is any at all) is a fragment of what the user sees as a "word", but not always. Furthermore, it must be either the server's job or the client's job to specify what that range is. They can't both decide. Currently, LSP leans that it should be the server taking that decision, which is perfectly OK with me for now. But it is arguable that users, via their client could want to make that decision too (this is for another issue though).
True, but the server could, legitimately, be doing different things to different parts of the buffer for different completion items, can it not? Your example seems to demonstrate it. Therefore, the only way to safely resolve "word boundaries" issue is to go through all completion items and discover the smallest document range that contains every one of them. While technically that might count as a possible solution to what I described in the first half of this post, it's not a very good one. A colleague, @nemethf is working on this approach (see joaotavora/eglot#402). At best it is laborious and at worst it is grossly impractical, including for performance reasons. Isn't it much easier for the server to specify the boundaries in some other way? I see two solutions:
|
👇 those bold, blue highlights are completion prefixes. But as @dbaeumer said this is entirely up to extensions as the |
While that is true to your definition of a "completion prefix", is there a way for the user to know in general, before hand what a completion item will do to his text? That's what @astoff meant, I think. I think your observation does not answer his complaint. In my opinion, the only way to improve the situation in the Editor, visually and summarily, is to mark the overarching region that covers all completions. Also I note:
|
Tend to agree. However what if there are multiple servers offering completions for the current cursor, how do you get all servers on the same page? Should that even be a requirement? If not then you are sort of back to square one on dealing with the fact that not all completions may agree on the 'prefix'. |
Fair point, indeed. In that situation, it's certainly much easier for the client to calculate the minimal range for a (presumably small) number of servers than for a large number of completions within each server. |
You don't need multiple services for this problem. Take the following JavaScript sample class Foo {
size = 4;
soo() { }
["sö sö"]() {}
} The
There will be three suggestions, Not so long ago, I have actually experimented with showing the completion prefix dynamically as you go through the completions list but that was perceived as unhelpful noise. |
The server sends
Why do you need that?
The funny thing here is that do the servers does resolve to the common-sense word boundaries but to the word boundaries as defined by VScode no matter how inconsistent they are. Just to give you an example: in HTML when you have |
@jrieken I don't see any incompatibility between this use-case and the server specifying explicitly a "completion prefix" in its response. The completion prefix is for UI purposes and, if applicable, client-side filtering. I think your use case is a fantastic example of why the |
@jrieken , we might be miscommunicating. I am not proposing that editors give up this functionality. Indeed, my client supports it. In your example, the range being targeted once you input the But you could have a "computed named" (to use your wording) that targets some more text before or after that As I acknowledged, it would not be impossible with the current protocol, but impractical and inefficient. That is because the full set of completions has to be traversed (and that's not a set of 3 items as in our toy examples here). Therefore, because the server already has this knowledge beforehand (it was its decision after all), I merely propose that the server share it with the client, optionally, so that the client might make use of that knowledge to provide a potentially more informing UI. If more servers are in play, and they take independent decisions, then we have to merge those decisions. But that's much easier than merging the individual completion's ranges. |
|
@dbaeumer Thanks for digging into this!
I modified clangd trunk to always set Using the following code, with the cursor starting at the int a_b_c;
[[deprecated]] int abc;
int y = ^ When I type V[18:37:12.986] <<< {
"id": 3,
"jsonrpc": "2.0",
"method": "textDocument/completion",
"params": {
"context": {
"triggerKind": 1
},
"position": {
"character": 9,
"line": 2
},
"textDocument": {
"uri": "file:///home/sammccall/test.cc"
}
}
}
V[18:37:12.991] >>> {
"id": 3,
"jsonrpc": "2.0",
"result": {
"isIncomplete": true,
"items": [
{
"detail": "int",
"filterText": "a_b_c",
"insertText": "a_b_c",
"insertTextFormat": 2,
"kind": 6,
"label": " a_b_c",
"score": 33,
"sortText": "3dfc0000a_b_c",
"textEdit": {
"newText": "a_b_c",
"range": {
"end": {
"character": 9,
"line": 2
},
"start": {
"character": 8,
"line": 2
}
}
}
},
{
"label": " alignof(type)",
/*details elided*/
},
{
"label": " auto",
/*details elided*/
},
{
"deprecated": true,
"detail": "int",
"filterText": "abc",
"insertText": "abc",
"insertTextFormat": 2,
"kind": 6,
"label": " abc",
"score": 3.3000001907348633,
"sortText": "3facccccabc",
"textEdit": {
"newText": "abc",
"range": {
"end": {
"character": 9,
"line": 2
},
"start": {
"character": 8,
"line": 2
}
}
}
}
]
}
} Here the Now I type V[18:37:14.732] <<< {
"id": 7,
"jsonrpc": "2.0",
"method": "textDocument/completion",
"params": {
"context": {
"triggerKind": 3
},
"position": {
"character": 10,
"line": 2
},
"textDocument": {
"uri": "file:///home/sammccall/test.cc"
}
}
}
V[18:37:14.735] >>> {
"id": 7,
"jsonrpc": "2.0",
"result": {
"isIncomplete": true,
"items": [
{
"detail": "int",
"filterText": "a_b_c",
"insertText": "a_b_c",
"insertTextFormat": 2,
"kind": 6,
"label": " a_b_c",
"score": 33,
"sortText": "3e3a0000a_b_c",
"textEdit": {
"newText": "a_b_c",
"range": {
"end": {
"character": 10,
"line": 2
},
"start": {
"character": 8,
"line": 2
}
}
}
},
{
"deprecated": true,
"detail": "int",
"filterText": "abc",
"insertText": "abc",
"insertTextFormat": 2,
"kind": 6,
"label": " abc",
"score": 3.3000001907348633,
"sortText": "3facccccabc",
"textEdit": {
"newText": "abc",
"range": {
"end": {
"character": 10,
"line": 2
},
"start": {
"character": 8,
"line": 2
}
}
}
}
]
}
} Again, clangd indicates So forcing Clangd implementation details, motivating but not directly relevant to VSCode:
|
Setting completion-styles buffer-locally is harder to customize and can break some completion UIs. Emacs bug#48073 * eglot.el: Add a completion-category-defaults entry, if applicable. (eglot--managed-mode): Don't set `completion-styles' (eglot-completion-at-point): Add style metadata to completion table.
Friendly ping, is there any possibility in getting this implemented? Would contributions be welcome? |
I have to admit I lost track of this. Would someone be able to summaries what the current proposal is. Even if we had this to LSP I think it needs to go behind a client capability and not all clients might support turning sorting and filtering off. |
The feature has not gain any traction. I therefore close the issue. I someone is able to summaries the current proposal and is willing to work on this I am happy to reopen it. Happy Coding! |
I have updated the spec with the text from this comment: #898 (comment) |
The subsequent comment by João #898 (comment) pointed out various problems with that idea. IMO the very last line of that comment provides the obvious solution to the problem at hand. |
در پنجشنبه ۲۸ اکتبر ۲۰۲۱ در ۱۸:۴۸ Augusto Stoffel ***@***.***>
نوشت:
I have updated the spec with the text from this comment: #898
<#898>
(comment)
The subsequent comment by João #898 (comment)
<#898 (comment)>
pointed out various problems with that idea. IMO the very last line of that
comment provides the obvious solution to the problem at hand.
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
<#898 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AU3VN7B75EW3PRNMJ635E43UJFZVNANCNFSM4KKF2XSA>
.
--
null
|
Regarding word boundaries: there is still #937 which I think we should try to address since it is necessary for other things as well. Adding yet another range makes IMO only sense if the completion items only use |
Summary: Clangd's approach is to provide lots of completions, and let ranking sort them out. This relies on various important signals (Quality.h), without which the large completion lists are extremely spammy. Even with a completion result exactly at the cursor, vscode looks backwards and tries to match the presumed partial-identifier against filterText, and uses the result to rank, with sortText only used as a tiebreak. By prepending the partial-identifier to the filterText, we can force the match to be perfect and so give sortText full control of the ranking. Full sad story: microsoft/language-server-protocol#898 It's possible to do this on the server side too of course, and switch it on with an initialization option. But it's a little easier in the extension, it will get the fix to users of old clangd versions, and other editors Reviewers: hokein Reviewed By: hokein Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, usaxena95, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D75623
The response to a completion request should specify whether or not the completion list has already been filtered and sorted by the server.
Client-side sorting and filtering should be virtually always unnecessary, but it is also often undesired. If the server sorts completion candidates by some relevance criterion rather than alphabetically and/or is capable of fuzzy matching, then there's no further useful (and plenty of harmful) processing the editor can do.
Even in the case of simple, non-fuzzy matching, client-side filtering may be unreliable. This is because there is no way for the server to inform the client about the completion prefix (cf. #648). It's easy to imagine a scenario where the editor discards all completion candidates because it disagrees with the server about what the prefix is (see e.g. joaotavora/eglot#402).
To provide a specific use-case, the digestif TeX server needs to return bogus
filterText
andsortText
values to get fuzzy completion of citations to work. I am sure other servers out there do the same thing.The text was updated successfully, but these errors were encountered: