Completion filtering and sorting #898

Closed
astoff opened this issue Jan 22, 2020 · 70 comments
Labels
completion feature-request Request for new features or functionality

Comments

@astoff

astoff commented Jan 22, 2020

The response to a completion request should specify whether or not the completion list has already been filtered and sorted by the server.

Client-side sorting and filtering should be virtually always unnecessary, but it is also often undesired. If the server sorts completion candidates by some relevance criterion rather than alphabetically and/or is capable of fuzzy matching, then there's no further useful (and plenty of harmful) processing the editor can do.

Even in the case of simple, non-fuzzy matching, client-side filtering may be unreliable. This is because there is no way for the server to inform the client about the completion prefix (cf. #648). It's easy to imagine a scenario where the editor discards all completion candidates because it disagrees with the server about what the prefix is (see e.g. joaotavora/eglot#402).

To provide a specific use-case, the digestif TeX server needs to return bogus filterText and sortText values to get fuzzy completion of citations to work. I am sure other servers out there do the same thing.

@joaotavora

Also cf. #651

@sam-mccall
Contributor

Clangd does this, and completion quality is way worse in VSCode compared to other clients once client-side caching/reranking kicks in.

Client-side caching is great for latency, and re-ranking is important for relevance. I've sketched a potential solution in #348. In brief:

  • the server would communicate a numeric score for each result that excluded any fuzzy-match considerations
  • when filtering, the client could compute a 0-1 "name-match" score, multiply it with the result score, and use the result for sorting.
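A minimal sketch of the two bullets above, assuming a hypothetical `score` field on each item (it is not part of the LSP specification) and a toy name-match function standing in for whatever matcher a real client would use:

```typescript
// Hypothetical shape: `score` is the server-provided, match-independent
// relevance proposed above; it is not part of the LSP specification.
interface ScoredItem {
  label: string;
  score: number;
}

// Toy name-match score in [0, 1]: 1 for a prefix match, 0.5 for a
// subsequence match, 0 otherwise. A real client would use its own matcher.
function nameMatch(query: string, label: string): number {
  const q = query.toLowerCase();
  const l = label.toLowerCase();
  if (l.indexOf(q) === 0) return 1;
  let i = 0;
  for (let j = 0; j < l.length && i < q.length; j++) {
    if (l[j] === q[i]) i++;
  }
  return i === q.length ? 0.5 : 0;
}

// Drop items that do not match at all, then sort by score * nameMatch,
// highest first.
function rerank(items: ScoredItem[], query: string): ScoredItem[] {
  return items
    .map((item) => {
      const match = nameMatch(query, item.label);
      return { item, match, rank: item.score * match };
    })
    .filter((entry) => entry.match > 0)
    .sort((a, b) => b.rank - a.rank)
    .map((entry) => entry.item);
}

const ranked = rerank(
  [
    { label: "abc", score: 3.3 },
    { label: "a_b_c", score: 33 },
    { label: "xyz", score: 50 },
  ],
  "ab"
);
// "xyz" is filtered out; "a_b_c" (33 * 0.5) outranks "abc" (3.3 * 1).
```

With this scheme the client can re-rank a cached list on each keystroke without losing the server's relevance signal.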

@astoff
Author

astoff commented Jan 24, 2020

the server would communicate a numeric score for each result

The completion list is, well, a list, so there is an intrinsic order. How could a numerical score convey more information than the relative position of each candidate, given that the editor can't access the server's scoring criteria? I think there's no further information beyond the candidate order that the editor can reliably operate with. See #348 (comment) for my reasoning.

Client-side caching is great for latency

LSP allows the server to attach a basically arbitrary textEdit to each completion item. This seems to preclude the possibility of the editor reusing cached information, barring some complicated (and potentially unreliable) manipulations to decide how to adapt an old textEdit to the document's current state.

Moreover, the kind of fuzzy matching I want for LaTeX citations just won't work if there is client-side caching. I can expand on this, but maybe the animated GIF here is enough.

In any case, this could be a third bit of information the server provides in the completion response: whether or not the client is advised to cache the completion candidates.

@mickaelistria

I see many cases where client-side sorting/filtering is more relevant than server-side sorting/filtering, especially for "simpler" language servers.
For instance, imagine a language that allows only one of those words: a1, a2 and b1 from index 0 to index 2. If I want to implement completion easily but efficiently, I wouldn't even have to look at the current word and would always return those 3 items as completion, with TextEdits from index 0 to 2. Then, if I type b, the IDE can take care of filtering and/or sorting to show b1 first (or only) instead of keeping the initial order.
The realistic use-cases for that are numerous: XML/JSON element names, identifier completion and basically everything where we know the list of possible values is relatively small, and where we just want to avoid putting such logic in the LS because we hope the client is able to deal with it.

But I also see many cases where the server sorting can be smarter, typically when it uses things like type or value inference to sort results (the IDE cannot do that, it's language logic inside the LS).
Some use-cases are typically strongly typed languages, with completions that go beyond just an identifier but can contain full expressions.
For example, if I have something like String s = "123"; int sInt=s, a completion engine can decide to return, in this order, Integer.parseInt(s) then s.length(), because some code analysis makes the 1st proposal more plausible. In this case, some client-side filtering/ordering would probably promote s.length() over Integer.parseInt(s) because textually, it seems a better heuristic (same prefix, small Levenshtein distance...). But server-side semantics are superior to textual heuristics.

So I definitely agree there is a need for clarification in the spec to decide when the client should filter/sort or when it should just keep results "as is".
However, I don't think the proposal with a "ranking" number would help, as the structure is already an ordered list and confusion remains. Maybe we need something more explicit about sortText and filterText, like "if filterText is null, the element must always be shown and never filtered by the client" and "if any of the sortText is null, the client mustn't sort the list"

@astoff
Author

astoff commented Feb 13, 2020

"if filterText is null, the element must always be shown and never filtered by the client"

That would also work. The alternative is to include a isFiltered property (boolean, optional) to the CompletionList type.

"if any of the sortText is null, the client mustn't sort the list"

This doesn't make much sense, since "sortedness" is a property of the completion list, not of an individual completion item. My proposal is to include a isSorted property (boolean, optional) to the CompletionList type.
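To make the proposal concrete, here is a rough sketch of how a client might honor the two flags. The types are minimal local stand-ins for the LSP ones, and `isFiltered` and `isSorted` are the hypothetical properties proposed in this thread, not part of the current spec:

```typescript
// Minimal stand-ins for LSP types. `isFiltered` and `isSorted` are the
// properties proposed in this thread; they do not exist in the spec.
interface Item {
  label: string;
  sortText?: string;
  filterText?: string;
}
interface ProposedCompletionList {
  items: Item[];
  isFiltered?: boolean;
  isSorted?: boolean;
}

// Decide what to show for the text typed so far.
function present(list: ProposedCompletionList, typed: string): Item[] {
  let items = list.items.slice();
  if (!list.isFiltered) {
    // Fallback behavior: plain prefix filter on filterText (or label).
    items = items.filter(
      (it) => (it.filterText || it.label).indexOf(typed) === 0
    );
  }
  if (!list.isSorted) {
    // Fallback behavior: order by sortText (or label).
    items.sort((a, b) => {
      const ka = a.sortText || a.label;
      const kb = b.sortText || b.label;
      return ka < kb ? -1 : ka > kb ? 1 : 0;
    });
  }
  return items;
}

// A fuzzy citation search: the typed text matches titles, not labels.
const serverItems: Item[] = [{ label: "knuth:1997" }, { label: "aho:1974" }];
const kept = present(
  { items: serverItems, isFiltered: true, isSorted: true },
  "art prog"
);
const legacy = present({ items: serverItems }, "art prog");
// `kept` preserves the server's two items in order; `legacy` is empty
// because the prefix filter discards both candidates.
```

Both flags are optional, so a server that sets neither gets the fallback behavior, whatever a given client chooses that to be.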

@dbaeumer
Member

I need to think about all these cases, but I still believe that most of it should be possible using a combination of CompletionList.isComplete and sort text (ignoring the extra payload we have). If CompletionList.isComplete is set to false, the client needs to trigger a new completion list request if the user types a character (the client is not allowed to cache the result), allowing the server to return:

  • a different set of completion items
  • with different sort and filter properties

Can you provide me with use cases where this still results in missing expressiveness (minus the fact that we need to set the sortText property)?
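A small sketch of the client behavior described here, under a few assumptions: the field is spelled `isIncomplete` as in the spec, the hypothetical `request` callback stands in for a (normally asynchronous) `textDocument/completion` round trip, and the local filter is a plain prefix match:

```typescript
interface Item {
  label: string;
}
interface CompletionList {
  isIncomplete: boolean;
  items: Item[];
}

// Hypothetical client-side session. `request` is a synchronous stand-in
// for sending textDocument/completion to the server.
class CompletionSession {
  private cached: CompletionList | null = null;
  public requests = 0;

  constructor(private request: (typed: string) => CompletionList) {}

  onType(typed: string): Item[] {
    let list = this.cached;
    if (list === null || list.isIncomplete) {
      // No usable cache: ask the server again on this keystroke.
      list = this.request(typed);
      this.cached = list;
      this.requests++;
    }
    // An incomplete list is shown as received; a complete one is
    // narrowed locally with a simple prefix filter.
    if (list.isIncomplete) return list.items;
    return list.items.filter((it) => it.label.indexOf(typed) === 0);
  }
}

const incompleteSession = new CompletionSession((_typed) => ({
  isIncomplete: true,
  items: [{ label: "a_b_c" }, { label: "abc" }],
}));
incompleteSession.onType("a");
const fromServer = incompleteSession.onType("ab");

const completeSession = new CompletionSession((_typed) => ({
  isIncomplete: false,
  items: [{ label: "a_b_c" }, { label: "abc" }],
}));
completeSession.onType("a");
const fromCache = completeSession.onType("ab");
```

Whether a client should additionally filter or re-sort an incomplete list is exactly the open question in this thread; the sketch leaves such a list untouched.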

@astoff
Author

astoff commented Feb 29, 2020

As discussed above, it would be convenient for the editor to know when it's advised not to cache and reuse completion results. Your proposed interpretation of the isComplete field would solve this, but not the other issues IMO.

If the server wants to trick the editor into not resorting, it's easy to come up with suitable sortText values (say 0001, 0002, 0003 and so on).
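For illustration, the workaround could look roughly like this on the server side. The helper name `pinServerOrder` is made up for this sketch:

```typescript
interface Item {
  label: string;
  sortText?: string;
}

// Overwrite sortText with zero-padded positions ("0001", "0002", ...)
// so that a client sorting by sortText reproduces the server's order.
function pinServerOrder(items: Item[]): Item[] {
  const width = Math.max(4, String(items.length).length);
  return items.map((item, index) => {
    const digits = String(index + 1);
    const padded = new Array(width - digits.length + 1).join("0") + digits;
    return { label: item.label, sortText: padded };
  });
}

const pinned = pinServerOrder([{ label: "zeta" }, { label: "alpha" }]);
// A client that sorts by sortText keeps "zeta" before "alpha".
const clientOrder = pinned
  .slice()
  .sort((a, b) => (a.sortText! < b.sortText! ? -1 : 1))
  .map((it) => it.label);
```

The padding width grows with the list length, so string comparison agrees with numeric order even for lists longer than 9999 items.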

However, I don't see any reliable way to trick the editor into not filtering out candidates (which the server needs to do whenever it uses some fancy fuzzy-matching mechanism which is hard to replicate editor-side).

In any case, aren't "filteredness" and "sortedness" pretty fundamental properties of the completion list? I think they are, since I can't imagine another way to make fuzzy completions work. Assuming you agree, I don't see why you would want to avoid conveying this information directly, in favor of something hacky like sortText = "0001"?

PS: isComplete = false seems to mean that a more "complete" candidate list can be obtained upon request. Isn't this orthogonal to the present discussion? I would say the isComplete property is either important in itself and should mean exactly what it seems to mean, or it's not important enough and could be dropped from the spec.

@sam-mccall
Contributor

@dbaeumer: that's indeed expressive enough. It does have a couple of downsides:

  • it's not (currently) obvious that it's the only correct way for clients to interpret the spec. For instance, my team focuses primarily on the clients VSCode, coc.nvim, and an internal-only client. Last I checked, all of them re-rank based on client-side name-match even if we always set isComplete to false.
  • it completely disables the client-side filtering, which is valuable for responsiveness in common cases. (For example, I believe in vim flickering can't be completely avoided except through synchronous client-side filtering, and updating a list with multiple asynchronous sources is hairy in any editor)
  • semantically it's an abuse of the isComplete field, so it feels like a hack. Without explicit language in the spec that this is a valid use, it's hard to ask for support from clients. (Thus me filing here instead of against vscode)

If the spec was updated to explicitly mention this interaction (and VSCode followed it) that would work fine for us. I'm happy to send a PR with some text if that's useful.

@astoff
Author

astoff commented Mar 1, 2020

semantically it's an abuse of the isComplete field

I just realized that what @dbaeumer described above is basically the semantics isIncomplete always had. From the spec: "This list it not complete. Further typing should result in recomputing this list.". It seems that doNotCache would have been a better name for this.

updating a list with multiple asynchronous sources is hairy

If merging the results from different servers is a supported use case of LSP, then setting a fabricated sortText to force a particular ordering on the client (e.g, sortText = "0001", etc.) is not expressive enough. It's not meaningful to compare fabricated values from different servers, or fabricated values with "honest" values.

I don't necessarily think this is important in practice, but it's one example of why abusing the sortText field is not a great thing to do.

that's indeed expressive enough

What's your procedure/workaround to avoid undesired client-side filtering?

@dbaeumer
Member

dbaeumer commented Mar 2, 2020

I agree that the naming could have been better and I am willing to improve the documentation around it, but it should be implemented in the way it is specified. If set to true, the client is not allowed to reuse a cached version from the server.

Whether the client filters or not will very likely depend on how the completion items are generated. If they are replacements from the start of the method, the client might still filter. If they are generated for the current cursor position, the client shouldn't filter.

@sam-mccall can you provide me with a concrete example where VS Code is working incorrectly in this regard.

@dbaeumer
Member

dbaeumer commented Mar 2, 2020

@astoff

What's your procedure/workaround to avoid undesired client-side filtering?

IMO the following should work:

  • mark the list as isComplete= false
  • create the edits for the current cursor position

@Gama11
Contributor

Gama11 commented Mar 2, 2020

I always thought that the ideal scenario is for completion to transition from isComplete = false to isComplete = true responses as fast as possible for performance reasons. So I was also under the impression that always keeping isComplete at false to prevent client-side filtering entirely would be considered an abuse of the API...

Maybe @jrieken could weigh in what the VSCode perspective is on this?

@astoff
Author

astoff commented Mar 2, 2020

create the edits for the current cursor position

By that I understand that the completion items would be constructed so that 0 characters before the cursor need to be deleted.

This is incompatible with fuzzy-matching, since in this case what the user types is not literally what's supposed to be inserted.

Moreover, I think this is incompatible with most completion UIs: typically the completion box is placed so as to convey what the current "completion subject" is. See for instance this screenshot from Jedi's README page (here the completion subject is the l after the dot; the candidates are ljust, lower and lstrip, and not just, ower and strip):

[image: screenshot of Jedi completion in Vim]

@Gama11
Contributor

Gama11 commented Mar 2, 2020

@astoff The label of completion items is independent of the text that should be inserted. You could have ljust as a label (shown in the UI), but just as an insert text.

@joaotavora

This thread is becoming hard to follow. While I generally side with @astoff that there is insufficient support in the spec to properly support fuzzy-style completion (and other non-prefix matching strategies), I give the benefit of the doubt to those that disagree, at least until I fully understand their proposals.

@dbaeumer, you seem to be in the latter group. I am the author of an LSP client, but I have no clear idea how to implement your strategy based on existing fields to solve this problem. Here are two concrete questions:

  1. What signs should my client look for in a server's response to a :textDocument/completion to determine that the server is "fuzzily" completing something around the current client's cursor position?

  2. Most importantly, how can the client reliably and efficiently discover (if at all possible) what that something of point 1 is? In the past, you or someone alluded to using TextEdit objects for this, is this still your position?

I think everybody that, unlike me, actually has a concrete idea of how this could work should provide a minimal example of client-server communication that illustrates their case.

@dbaeumer
Member

dbaeumer commented Mar 3, 2020

There are a couple of competing goals here and I do agree that the spec can be improved to describe what should happen when (especially if someone implements a client, which has no good support in the spec right now). So I will try to explain the thinking behind this:

  • to achieve consistency across languages and to honor different client filter models, in LSP the client usually filters and sorts. This also has the advantage that clients can experiment with different filter and sorting models.
  • for speed, clients should be able to filter an already received completion list if the user continues typing.

I do see that languages want to opt out of this, either always or in certain cases. They can do this today. The model we came up with is as follows.

Completion item provides an insertText / label without a text edit: in this mode the client should filter against what the user has already typed using the word boundary rules of the language (e.g. resolving the word under the cursor position). The reason for this mode is that it makes it extremely easy for a server to implement a basic completion list and get it filtered on the client.

Completion item with text edits: in this mode the server tells the client that it actually knows what it is doing. If you create a completion item with a text edit at the current cursor position, no word guessing takes place and no filtering should happen. This mode can be combined with a sort text and filter text to customize two things. If the text edit is a replace edit, then the range denotes the word used for filtering. If the replace changes the text, it most likely makes sense to specify a filter text to be used.

Here are some (hard coded) examples of these:

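import { CompletionItem, Position, Range, TextEdit } from 'vscode-languageserver';

// `connection` is assumed to be the server connection created elsewhere.
connection.onCompletion((params, token): CompletionItem[] => {
	const result: CompletionItem[] = [];
	// Label only: the client filters against the word under the cursor.
	let item = CompletionItem.create('foo');
	result.push(item);

	// insertText without a text edit: same filtering as above.
	item = CompletionItem.create('foo-text');
	item.insertText = 'foo-text';
	result.push(item);

	// Insert edit at the cursor position: no word guessing, no filtering.
	item = CompletionItem.create('foo-text-range-insert');
	item.textEdit = TextEdit.insert(params.position, 'foo-text-range-insert');
	result.push(item);

	// Replace edit covering the character before the cursor; filterText is
	// hard coded to the text that range is expected to contain.
	item = CompletionItem.create('foo-text-range-replace');
	item.textEdit = TextEdit.replace(
		Range.create(Position.create(params.position.line, params.position.character - 1), params.position),
		'foo-text-range-replace'
	);
	item.filterText = 'b';
	result.push(item);

	return result;
});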

The first two are filtered out if the typed word doesn't match the text to be inserted. However, the last two are still shown since they specify edits. Since the last one is a backwards-replacing completion item, the filter text is set to the denoted word in the buffer (in the example, hard coded to b).

[image: capture]

@astoff
Author

astoff commented Mar 3, 2020

I realized now an important difference between VS Code and (most?) other editors.

VS Code provides no visual indication of what the "completion prefix" is. There's no way for the user to know in advance that selecting the first item in the screenshot will delete the "b", and selecting the second won't. This allows each completion item to have a different prefix. Whether that's desirable is of course up for debate; "hippie expand" on Emacs works that way and it can be confusing.

Every other editor that I've seen (including Emacs's "normal" completion) requires one single completion prefix for all candidates, among other reasons to decide where to place the popup (cf. the Jedi-on-Vim screenshot from my previous comment).

In any case, apart from this possible incompatibility with most editors, the above interpretation of textEdit covers all use cases I had in mind.

@joaotavora

joaotavora commented Mar 3, 2020

Thanks for the prompt answer, @dbaeumer

Completion item provides an insertText / label without a text edit: in the model the client should filter against what the user has already typed using the word boundary rules of the language (e.g. resolving the word under the cursor position).

It is very interesting that you mention this, and indeed, in my opinion, it is the crux of the difficulties I and @astoff are facing. How are clients and servers to agree on exactly what the "word boundaries" are? I think we already agree that this is an important agreement to have to resolve tooltip placing issues and completion colouring issues.

So shouldn't they communicate this between themselves, formally, in some manner? For some languages the "rules" are relatively obvious; for others, they aren't. And even if they are obvious, some servers will want to complete other syntactic constructs, not just "words". This is, I think, the case with @astoff 's server.

In other words, completion, in general, is the selection of one of many possible transformations over a region/range of text in the document. Often the text within that range (if there is any at all) is a fragment of what the user sees as a "word", but not always.

Furthermore, it must be either the server's job or the client's job to specify what that range is. They can't both decide. Currently, LSP leans that it should be the server taking that decision, which is perfectly OK with me for now. But it is arguable that users, via their client could want to make that decision too (this is for another issue though).

in this [textEdit-based] mode the server tells the client that it actually knows what it is doing.

True, but the server could, legitimately, be doing different things to different parts of the buffer for different completion items, can it not? Your example seems to demonstrate it. Therefore, the only way to safely resolve the "word boundaries" issue is to go through all completion items and discover the smallest document range that contains every one of them.

While technically that might count as a possible solution to what I described in the first half of this post, it's not a very good one. A colleague, @nemethf is working on this approach (see joaotavora/eglot#402). At best it is laborious and at worst it is grossly impractical, including for performance reasons.
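For reference, the traversal in question might be sketched as follows. The types are local stand-ins for the LSP ones and `enclosingRange` is a made-up helper name; the point is that every item has to be visited before the UI can know the overall region:

```typescript
interface Position {
  line: number;
  character: number;
}
interface Range {
  start: Position;
  end: Position;
}
interface Item {
  label: string;
  textEdit?: { range: Range; newText: string };
}

// Negative if a comes before b, positive if after, 0 if equal.
function comparePositions(a: Position, b: Position): number {
  return a.line - b.line || a.character - b.character;
}

// Smallest range containing the textEdit range of every item, or null
// if no item carries a textEdit. Cost is linear in the number of items.
function enclosingRange(items: Item[]): Range | null {
  let result: Range | null = null;
  for (let i = 0; i < items.length; i++) {
    const edit = items[i].textEdit;
    if (!edit) continue;
    if (result === null) {
      result = { start: edit.range.start, end: edit.range.end };
      continue;
    }
    if (comparePositions(edit.range.start, result.start) < 0) {
      result = { start: edit.range.start, end: result.end };
    }
    if (comparePositions(edit.range.end, result.end) > 0) {
      result = { start: result.start, end: edit.range.end };
    }
  }
  return result;
}

const overall = enclosingRange([
  {
    label: "abc",
    textEdit: {
      range: { start: { line: 2, character: 8 }, end: { line: 2, character: 10 } },
      newText: "abc",
    },
  },
  {
    label: "wide",
    textEdit: {
      range: { start: { line: 2, character: 5 }, end: { line: 2, character: 10 } },
      newText: "wide",
    },
  },
]);
// overall spans line 2, characters 5 to 10.
```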

Isn't it much easier for the server to specify the boundaries in some other way? I see two solutions:

  1. A low-effort, low-impact solution could amount to rephrasing the spec to indicate that only the first completion item's textEdit region should be considered as the overarching region. Servers could then strategically bubble that item to the top, but only if they need to (if all of them have identical regions, there's no need).

  2. However, it is much simpler, in my view, to just add another optional item-independent range designation.

@jrieken
Member

jrieken commented Mar 3, 2020

VS Code provides no visual indication of what the "completion prefix" is. There's no way for the user to know in advance that selecting the first item in the screenshot will delete the "b", and selecting the second won't.

👇 those bold, blue highlights are completion prefixes. But as @dbaeumer said this is entirely up to extensions as the CompletionItem#range defines what the "completion prefix" for each completion is.

[image: Screenshot 2020-03-03 at 16 59 53]

@joaotavora

those bold, blue highlights are completion prefixes.

While that is true to your definition of a "completion prefix", is there a way for the user to know in general, beforehand, what a completion item will do to his text? That's what @astoff meant, I think. I think your observation does not answer his complaint.

In my opinion, the only way to improve the situation in the Editor, visually and summarily, is to mark the overarching region that covers all completions.

Also I note:

  • that the completion tooltip seems oddly placed. In this simple case it should be placed four character lengths to the left;

  • that it seems challenging to think how one would approach "fuzzy matching". What if the cursor was after the k in mrkdown instead of markdown? Or what if the server wanted to complete markdown.uris to markdown.urinals, for example? How would Visual Code negotiate this information with the server and what visual feedback would it provide to the user?

@kdvolder

kdvolder commented Mar 3, 2020

In my opinion, the only way to improve the situation in the Editor, visually and summarily, is to mark the overarching region that covers all completions.

Tend to agree. However what if there are multiple servers offering completions for the current cursor, how do you get all servers on the same page? Should that even be a requirement? If not then you are sort of back to square one on dealing with the fact that not all completions may agree on the 'prefix'.

@joaotavora

joaotavora commented Mar 3, 2020

However what if there are multiple servers offering completions for the current cursor, how do you get all servers on the same page?

Fair point, indeed. In that situation, it's certainly much easier for the client to calculate the minimal range for a (presumably small) number of servers than for a large number of completions within each server.

@jrieken
Member

jrieken commented Mar 3, 2020

Tend to agree. However what if there are multiple servers offering completions for the current cursor, how do you get all servers on the same page?

You don't need multiple services for this problem. Take the following JavaScript sample

class Foo {

  size = 4;

  soo() { }

  ["sö sö"]() {}
}

The Foo type has "normal-named" members and a member with a "computed name". Now, trigger IntelliSense for member completion (| is the cursor):

new Foo().|

There will be three suggestions: size, soo, and sö sö. When accepting the first two you can simply perform an insert, but when accepting the last you need to perform a replace of the dot, e.g. it must become new Foo()['sö sö']|. Cases like this are the reason that each suggestion can have its own "completion prefix", and I don't really know how other editors are doing it.
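As a rough illustration of that case, a server could build the three items along these lines. The helper names are made up, the identifier check is a simplified ASCII-only approximation, and `JSON.stringify` is used for quoting, so the computed name comes out double-quoted:

```typescript
interface Position {
  line: number;
  character: number;
}
interface Range {
  start: Position;
  end: Position;
}
interface Item {
  label: string;
  textEdit: { range: Range; newText: string };
}

// Build member completions for a cursor placed right after a dot.
// Plain identifiers are inserted after the dot; other names replace the
// dot with a bracketed, quoted property access.
function memberCompletions(names: string[], dot: Position): Item[] {
  const afterDot = { line: dot.line, character: dot.character + 1 };
  return names.map((name): Item => {
    if (/^[A-Za-z_$][A-Za-z0-9_$]*$/.test(name)) {
      return {
        label: name,
        textEdit: { range: { start: afterDot, end: afterDot }, newText: name },
      };
    }
    return {
      label: name,
      textEdit: {
        range: { start: dot, end: afterDot },
        newText: "[" + JSON.stringify(name) + "]",
      },
    };
  });
}

// Apply a single-line edit to a line of text (enough for this example).
function applyEdit(lineText: string, edit: Item["textEdit"]): string {
  return (
    lineText.slice(0, edit.range.start.character) +
    edit.newText +
    lineText.slice(edit.range.end.character)
  );
}

const lineText = "new Foo().";
const members = memberCompletions(
  ["size", "soo", "sö sö"],
  { line: 0, character: 9 } // position of the dot
);
const applied = members.map((m) => applyEdit(lineText, m.textEdit));
```

The first two items have an empty range at the cursor, while the third has a range that reaches back over the dot, so the three candidates do not share one "completion prefix".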

[image: Mar-03-2020 18-11-16]

Not so long ago, I have actually experimented with showing the completion prefix dynamically as you go through the completions list but that was perceived as unhelpful noise.

@yyoncho

yyoncho commented Mar 3, 2020

that it seems challenging to think how one would approach "fuzzy matching". What if the cursor was after the k in mrkdown instead of markdown? Or what if the server wanted to complete markdown.uris to markdown.urinals, for example? How would Visual Code negotiate this information with the server and what visual feedback would it provide to the user?

The server sends textEdit range which answers all the questions - for each item you should use the range textEdit.range.start to the current point and perform fuzzy/partial/etc filtering against filterText.
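A minimal sketch of that per-item procedure, assuming single-line ranges and using a simple subsequence test in place of a real fuzzy matcher (the helper names are made up for illustration):

```typescript
interface Position {
  line: number;
  character: number;
}
interface Range {
  start: Position;
  end: Position;
}
interface Item {
  label: string;
  filterText?: string;
  textEdit?: { range: Range; newText: string };
}

// True if every character of `query` appears in `text`, in order.
function isSubsequence(query: string, text: string): boolean {
  let i = 0;
  for (let j = 0; j < text.length && i < query.length; j++) {
    if (text[j] === query[i]) i++;
  }
  return i === query.length;
}

// For each item, take the text between textEdit.range.start and the
// cursor as that item's prefix and match it against filterText (or the
// label). Items without a textEdit are kept as they are in this sketch.
function filterItems(lineText: string, cursor: Position, items: Item[]): Item[] {
  return items.filter((item) => {
    if (!item.textEdit) return true;
    const start = item.textEdit.range.start.character;
    const typed = lineText.slice(start, cursor.character);
    return isSubsequence(typed, item.filterText || item.label);
  });
}

const range: Range = {
  start: { line: 2, character: 8 },
  end: { line: 2, character: 10 },
};
const visible = filterItems("int y = ab", { line: 2, character: 10 }, [
  { label: "a_b_c", filterText: "a_b_c", textEdit: { range, newText: "a_b_c" } },
  { label: "auto", filterText: "auto", textEdit: { range, newText: "auto" } },
  { label: "abc", filterText: "abc", textEdit: { range, newText: "abc" } },
]).map((it) => it.label);
// visible is ["a_b_c", "abc"]: "auto" does not contain "ab" as a subsequence.
```

Note that this only filters; it keeps the order in which the server sent the items.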

Fair point, indeed. In that situation, it's certainly much easier for the client to calculate the minimal range for a (presumably small) number of servers than for a large number of completions within each server.

Why do you need that?

@dbaeumer

typed using the word boundary rules of the language (e.g. resolving the word under the cursor position).

The funny thing here is that the servers do not resolve to the common-sense word boundaries but to the word boundaries as defined by VS Code, no matter how inconsistent they are. Just to give you an example: in HTML, when you have <tag the word boundary is tag, but in XML, when you have <tag the word boundary is <tag (or it was the opposite). So in the past, in order to have that working, we on the Emacs side were forced to replicate that bizarre behaviour to get proper filtering. The good news is that most of the servers now provide textEdit. We now have that working on the Emacs side, not using what you have described but the algorithm from lsp4e. IMO it would be much better if you deprecated insertText and forced all servers to use only the textEdit property.

@astoff
Author

astoff commented Mar 3, 2020

@jrieken I don't see any incompatibility between this use-case and the server specifying explicitly a "completion prefix" in its response. The completion prefix is for UI purposes and, if applicable, client-side filtering.

I think your use case is a fantastic example of why the textEdit should be only and exclusively used for text edits to be applied to the document upon the selection of a candidate, and not to glean any additional information about the completion context.

@joaotavora

joaotavora commented Mar 3, 2020

Cases like this are the reason that each suggestion can have its own "completion prefix" and I don't really know how other editors are doing it.

@jrieken , we might be miscommunicating. I am not proposing that editors give up this functionality. Indeed, my client supports it. In your example, the range being targeted once you input the s is the range that contains just the s. VSCode is guessing correctly and agrees with the server. There is no problem.

But you could have a "computed name" (to use your wording) that targets some more text before or after that s. VSCode could guess, but it would probably make a mistake. To avoid making that mistake, it would need to reliably determine the range of text potentially affected by any completion.

As I acknowledged, it would not be impossible with the current protocol, but impractical and inefficient. That is because the full set of completions has to be traversed (and that's not a set of 3 items as in our toy examples here).

Therefore, because the server already has this knowledge beforehand (it was its decision after all), I merely propose that the server share it with the client, optionally, so that the client might make use of that knowledge to provide a potentially more informing UI.

If more servers are in play, and they take independent decisions, then we have to merge those decisions. But that's much easier than merging the individual completion's ranges.

@joaotavora

Why do you need that?

@yyoncho #898 (comment)

@sam-mccall
Contributor

sam-mccall commented Mar 3, 2020

@dbaeumer Thanks for digging into this!

can you provide me with a concrete example where VS Code is working incorrectly in this regard.

I modified clangd trunk to always set isIncomplete to true to reproduce this.
TL;DR: VSCode issues a request per keystroke, but reranks the results itself.


Using the following code, with the cursor starting at the ^.
(The deprecated attribute is a simple way to get clangd to rank the item lower, it's not the motivating case. EDIT: changing the type of abc to char* has the same effect, and is a better example)

int a_b_c;
[[deprecated]] int abc;
int y = ^

When I type a, I get the following completion request/response:

V[18:37:12.986] <<< {
  "id": 3,
  "jsonrpc": "2.0",
  "method": "textDocument/completion",
  "params": {
    "context": {
      "triggerKind": 1
    },
    "position": {
      "character": 9,
      "line": 2
    },
    "textDocument": {
      "uri": "file:///home/sammccall/test.cc"
    }
  }
}

V[18:37:12.991] >>> {
  "id": 3,
  "jsonrpc": "2.0",
  "result": {
    "isIncomplete": true,
    "items": [
      {
        "detail": "int",
        "filterText": "a_b_c",
        "insertText": "a_b_c",
        "insertTextFormat": 2,
        "kind": 6,
        "label": " a_b_c",
        "score": 33,
        "sortText": "3dfc0000a_b_c",
        "textEdit": {
          "newText": "a_b_c",
          "range": {
            "end": {
              "character": 9,
              "line": 2
            },
            "start": {
              "character": 8,
              "line": 2
            }
          }
        }
      },
      {
        "label": " alignof(type)",
        /*details elided*/
      },
      {
        "label": " auto",
        /*details elided*/
      },
      {
        "deprecated": true,
        "detail": "int",
        "filterText": "abc",
        "insertText": "abc",
        "insertTextFormat": 2,
        "kind": 6,
        "label": " abc",
        "score": 3.3000001907348633,
        "sortText": "3facccccabc",
        "textEdit": {
          "newText": "abc",
          "range": {
            "end": {
              "character": 9,
              "line": 2
            },
            "start": {
              "character": 8,
              "line": 2
            }
          }
        }
      }
    ]
  }
}

Here the sortText says that a_b_c should be listed above abc, and indeed it is in VSCode. So far so good!

Now I type b (so the line is int y = ab^ with cursor at ^). I get the following request/response:

V[18:37:14.732] <<< {
  "id": 7,
  "jsonrpc": "2.0",
  "method": "textDocument/completion",
  "params": {
    "context": {
      "triggerKind": 3
    },
    "position": {
      "character": 10,
      "line": 2
    },
    "textDocument": {
      "uri": "file:///home/sammccall/test.cc"
    }
  }
}

V[18:37:14.735] >>> {
  "id": 7,
  "jsonrpc": "2.0",
  "result": {
    "isIncomplete": true,
    "items": [
      {
        "detail": "int",
        "filterText": "a_b_c",
        "insertText": "a_b_c",
        "insertTextFormat": 2,
        "kind": 6,
        "label": " a_b_c",
        "score": 33,
        "sortText": "3e3a0000a_b_c",
        "textEdit": {
          "newText": "a_b_c",
          "range": {
            "end": {
              "character": 10,
              "line": 2
            },
            "start": {
              "character": 8,
              "line": 2
            }
          }
        }
      },
      {
        "deprecated": true,
        "detail": "int",
        "filterText": "abc",
        "insertText": "abc",
        "insertTextFormat": 2,
        "kind": 6,
        "label": " abc",
        "score": 3.3000001907348633,
        "sortText": "3facccccabc",
        "textEdit": {
          "newText": "abc",
          "range": {
            "end": {
              "character": 10,
              "line": 2
            },
            "start": {
              "character": 8,
              "line": 2
            }
          }
        }
      }
    ]
  }
}

Again, clangd indicates a_b_c should be ranked above abc.
However, VSCode shows abc first and a_b_c below it. My interpretation is that VSCode ranks these using its own fuzzy matching, with sortText used only to break ties.

So forcing isIncomplete does convince VSCode to issue a request on each keystroke, but VSCode still does not use the resulting ranking, which was the real goal.
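To make that interpretation concrete, here is a minimal sketch of a client that ranks by its own match score first and falls back to sortText only on ties. The scorer `toyFuzzyScore` is a made-up stand-in (an assumption for illustration, not VSCode's actual matcher), and the `Item` shape and function names are hypothetical; only the filterText and sortText values come from the second response above.

```typescript
interface Item {
  filterText: string;
  sortText: string;
}

// Toy subsequence scorer (NOT VSCode's real algorithm): every matched character
// earns 1 point, plus 1 bonus point when it directly follows the previous match
// (or sits at the very start of the candidate).
function toyFuzzyScore(query: string, candidate: string): number {
  let score = 0;
  let last = -1;
  for (let i = 0; i < query.length; i++) {
    const at = candidate.indexOf(query[i], last + 1);
    if (at < 0) return -1; // not a subsequence: the item would be filtered out
    score += at === last + 1 ? 2 : 1;
    last = at;
  }
  return score;
}

function compareText(a: string, b: string): number {
  return a < b ? -1 : a > b ? 1 : 0;
}

// Client-side ranking: match score first, sortText only as a tiebreak.
function rankClientSide(query: string, items: Item[]): Item[] {
  return items
    .map(item => ({ item, score: toyFuzzyScore(query, item.filterText) }))
    .filter(entry => entry.score >= 0)
    .sort((a, b) => b.score - a.score || compareText(a.item.sortText, b.item.sortText))
    .map(entry => entry.item);
}

// Server-intended ranking: sortText alone.
function rankBySortText(items: Item[]): Item[] {
  return items.slice().sort((a, b) => compareText(a.sortText, b.sortText));
}

const secondResponse: Item[] = [
  { filterText: "a_b_c", sortText: "3e3a0000a_b_c" },
  { filterText: "abc", sortText: "3facccccabc" },
];

console.log(rankBySortText(secondResponse).map(i => i.filterText)); // a_b_c before abc
console.log(rankClientSide("ab", secondResponse).map(i => i.filterText)); // abc before a_b_c
```

Under this toy model, "ab" matches abc contiguously and a_b_c only with a gap, so the client's score never ties and sortText is never consulted.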


Clangd implementation details, motivating but not directly relevant to VSCode:

  • Internally, clangd does penalize a_b_c slightly for no longer being an exact prefix match, as you can see from the changed sortText, but overall this is outweighed by the deprecation signal, which clangd weights heavily.
  • The score field visible in the output is a clangd extension which reflects the "intrinsic" score without any name-fuzzy-matching applied. It's not used in VSCode.
  • The sortText is an encoding of (-score * nameScore, filterText) that preserves order.
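One way such an order-preserving encoding can work is sketched below: map the negated 32-bit float onto unsigned integers so that integer order matches float order, hex-encode it at fixed width, and append the name. This is an illustrative reconstruction, not clangd's source; the function names are hypothetical, and the nameScore values (0.75 for a_b_c, 1.0 for abc) are assumptions inferred by decoding the sortText strings in the log above.

```typescript
// Map a 32-bit float to a uint32 whose unsigned order matches the float order.
function encodeFloat(f: number): number {
  const view = new DataView(new ArrayBuffer(4));
  view.setFloat32(0, f);
  const bits = view.getUint32(0);
  const TOP_BIT = 0x80000000;
  if (bits >= TOP_BIT) {
    // Negative float: map onto the low half of the integers, order reversed.
    return (0 - bits) >>> 0;
  }
  // Non-negative float: map onto the high half of the integers.
  return (bits + TOP_BIT) >>> 0;
}

// Encode (-score, name) so that plain string comparison puts higher scores
// first and breaks ties by name.
function sortText(score: number, name: string): string {
  const hex = encodeFloat(-score).toString(16);
  return ("00000000" + hex).slice(-8) + name;
}

// Assumed inputs: intrinsic score 33 with nameScore 0.75, and the logged
// score 3.3000001907348633 with nameScore 1.0.
console.log(sortText(33 * 0.75, "a_b_c")); // "3e3a0000a_b_c"
console.log(sortText(3.3000001907348633, "abc")); // "3facccccabc"
```

Because the hex prefix has a fixed width of eight digits, comparing two sortText strings compares the scores first and falls through to the name only when the scores are equal.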

@dbaeumer dbaeumer added feature-request Request for new features or functionality completion labels Nov 11, 2020
@dbaeumer dbaeumer added this to the Backlog milestone Nov 11, 2020
astoff referenced this issue in joaotavora/eglot May 1, 2021
Setting completion-styles buffer-locally is harder to customize and
can break some completion UIs.

Emacs bug#48073

* eglot.el: Add a completion-category-defaults entry, if applicable.
(eglot--managed-mode): Don't set `completion-styles'
(eglot-completion-at-point): Add style metadata to
completion table.
@DavidGoldman

Friendly ping: is there any possibility of getting this implemented? Would contributions be welcome?

@dbaeumer
Member

I have to admit I lost track of this. Would someone be able to summarize what the current proposal is? Even if we added this to LSP, I think it needs to go behind a client capability, since not all clients might support turning sorting and filtering off.

@dbaeumer
Member

The feature has not gained any traction, so I am closing the issue. If someone is able to summarize the current proposal and is willing to work on this, I am happy to reopen it.

Happy Coding!

@dbaeumer
Member

I have updated the spec with the text from this comment: #898 (comment)

@astoff
Author

astoff commented Oct 28, 2021

I have updated the spec with the text from this comment: #898 (comment)

The subsequent comment by João #898 (comment) pointed out various problems with that idea. IMO the very last line of that comment provides the obvious solution to the problem at hand.

@Mehdish1

Mehdish1 commented Oct 28, 2021 via email

@dbaeumer
Copy link
Member

Regarding word boundaries: there is still #937 which I think we should try to address since it is necessary for other things as well.

IMO, adding yet another range only makes sense if the completion items use only insertText and not textEdits. And that should only be supported on CompletionList and not on the item itself.

@dbaeumer dbaeumer removed this from the Backlog milestone Nov 2, 2021
mem-frob pushed a commit to draperlaboratory/hope-llvm-project that referenced this issue Oct 7, 2022
Summary:
Clangd's approach is to provide lots of completions, and let ranking sort them
out. This relies on various important signals (Quality.h), without which the
large completion lists are extremely spammy.

Even with a completion result exactly at the cursor, vscode looks backwards and
tries to match the presumed partial-identifier against filterText, and uses
the result to rank, with sortText only used as a tiebreak.
By prepending the partial-identifier to the filterText, we can force the match
to be perfect and so give sortText full control of the ranking.

Full sad story: microsoft/language-server-protocol#898

It's possible to do this on the server side too of course, and switch it on
with an initialization option. But it's a little easier in the extension, it
will get the fix to users of old clangd versions, and other editors

Reviewers: hokein

Reviewed By: hokein

Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, usaxena95, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D75623
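The workaround described in the commit message above can be sketched roughly as follows: before handing items to the editor, prepend the text the user has already typed to each item's filterText, so the editor's own matcher sees a perfect prefix match for every item and sortText alone decides the order. The `CompletionItemLike` shape and the `forceServerRanking` name are hypothetical, and this standalone version leaves out the editor API entirely; treat it as a sketch of the idea, not the extension's actual code.

```typescript
interface CompletionItemLike {
  label: string;
  filterText?: string;
  sortText?: string;
}

// Return copies of the items whose filterText starts with the typed prefix,
// so every item matches the prefix equally well on the client side.
function forceServerRanking(
  typedPrefix: string,
  items: CompletionItemLike[],
): CompletionItemLike[] {
  if (typedPrefix === "") return items.slice(); // nothing typed, nothing to force
  return items.map(item => {
    const base = item.filterText !== undefined ? item.filterText : item.label;
    return { ...item, filterText: typedPrefix + base };
  });
}

const fromServer: CompletionItemLike[] = [
  { label: " a_b_c", filterText: "a_b_c", sortText: "3e3a0000a_b_c" },
  { label: " abc", filterText: "abc", sortText: "3facccccabc" },
];

const adjusted = forceServerRanking("ab", fromServer);
console.log(adjusted.map(i => i.filterText)); // both now begin with "ab"
```

The trade-off is that the client can no longer filter meaningfully on its own, which is why this approach pairs with isIncomplete so that the server is asked again on each keystroke.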