Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve heuristic related to showing completion labels #2189

Closed
rchl opened this issue Feb 7, 2023 · 21 comments
Closed

Improve heuristic related to showing completion labels #2189

rchl opened this issue Feb 7, 2023 · 21 comments

Comments

@rchl
Copy link
Member

rchl commented Feb 7, 2023

Describe the bug

typescript-language-server will soon introduce filterText property which will make ST presentation of completions worse. See typescript-language-server/typescript-language-server#678 (review)

The change will be spec compliant and might improve things for other editors but it's not gonna work great for us so we need to improve our heuristic.

This issue also already manifests with Volar, for example. See:

Screenshot 2023-02-07 at 13 57 11

@rchl
Copy link
Member Author

rchl commented Feb 7, 2023

Example payloads:

{
  "commitCharacters": [
    ".",
    ",",
    ";",
    "("
  ],
  "filterText": ". $attrs",
  "insertText": ". $attrs",
  "insertTextFormat": 1,
  "kind": 5,
  "label": "$attrs",
  "sortText": "11",
  "textEdit": {
    "newText": " $attrs",
    "range": {
      "end": {
        "character": 19,
        "line": 7
      },
      "start": {
        "character": 18,
        "line": 7
      }
    }
  }
}
{
  "commitCharacters": [
    ".",
    ",",
    "("
  ],
  "detail": "typescript",
  "filterText": "import { readConfigFile$1 } from 'typescript';",
  "insertText": "import { readConfigFile$1 } from 'typescript';",
  "insertTextFormat": 2,
  "kind": 3,
  "label": "readConfigFile",
  "sortText": "11",
  "textEdit": {
    "insert": {
      "end": {
        "character": 18,
        "line": 7
      },
      "start": {
        "character": 0,
        "line": 7
      }
    },
    "newText": "import { readConfigFile$1 } from 'typescript';",
    "replace": {
      "end": {
        "character": 18,
        "line": 7
      },
      "start": {
        "character": 0,
        "line": 7
      }
    }
  }
}

@jwortmann
Copy link
Member

An initial idea from my side would be if (" " + label) in filterText prefer label. Because ST doesn't match/filter over multiple words anyway.

@rchl
Copy link
Member Author

rchl commented Feb 7, 2023

Could we dumbify it to just check for a space in filterText?

I'm thinking that in case of that specific typescript example, in theory the filterText could include text without spaces within the braces and then your suggestion wouldn't work. Like: "filterText": "import {readConfigFile$1} from 'typescript';"

@jwortmann
Copy link
Member

jwortmann commented Feb 10, 2023

I'm thinking about what would be the reason why a server would use a filterText which is different from label, like in the examples above, in the way that filterText is not a substring of label. That would mean that a server intends to keep the completion open/available if the user types something from the filterText, which is not shown in the label, right? For example if there is another "word completion" from the buffer with only readConfigFile, then you could type imreadconf or readconfts to ensure that the "import" completion is inserted, and not just the single word. Or if there were for example another completion with a filterText of import { readConfigFile$1 } from 'javascript';, then you could type readconfjs to get that one with "javascript" (I don't know if this specific example makes sense, but I'm just trying to understand the intention in general).

That's why we have the lsp_label.startswith(lsp_filter_text) right now, to make sure that no information are lost when matching/filtering the completions, and I think it makes sense. If we ignore the filterText just because it contains a space, then typing those examples like readconfts wouldn't work anymore, and you need to hope that you picked the correct completion item when there are several items with the same label. For example I use "inhibit_word_completions": false, so there is likely a word completion from ST with the same label. Or another example where I found the completion behavior annoying, are the "auto-import" completions from Pyright, which also only show a bare single-word label, often with multiple other completion item with that same label (I turned off the auto-imports via "python.analysis.autoImportCompletions": false for that reason).

For you first example with the screenshot and the leading . , I don't really see a reason why a server would do that, and I think it's stupid to provide a completion in that way.

VSCode doesn't have this problem, because it can still filter the items even if you type something which is not contained in the label. But ST can't do this, and actually what I wrote in #2189 (comment) is wrong and ST does match over multiple words separated by spaces (in the comment above I accidentally thought about what happens when you write a space, which is of course not really relevant):
completion

So I'm doubtful that even in my suggestion from above the "more beautiful" label would outweight the drawbacks in behavior.

@rchl
Copy link
Member Author

rchl commented Feb 10, 2023

For you first example with the screenshot and the leading . , I don't really see a reason why a server would do that, and I think it's stupid to provide a completion in that way.

There is more context in microsoft/vscode#63100

As far as I understand, VSCode scores completion higher when filterText matches (starts with?) the text that user typed. So if user typed . to trigger completions then filterText with leading . will be scored higher. So they are forcing filterText to have leading . in some cases.

(In case of Volar the extra space after . seems like a unnecessary side-effect of Volar doing some code transformations when triggering completions from the <template> tag but otherwise the idea is the same.)

So based on that info, it looks like the purpose of . $attrs is not to improve filtering but sorting. So it would be fine if we didn't show the filterText in this specific case.

As for filterText like import { readConfigFile$1 } from 'typescript';, that can make a difference when used for filtering instead of label...

@jwortmann
Copy link
Member

As far as I understand, VSCode scores completion higher when filterText matches (starts with?) the text that user typed. So if user typed . to trigger completions then filterText with leading . will be scored higher. So they are forcing filterText to have leading . in some cases.

But the server already knows from the content of the file and the position of the completion request, what the completion prefix is, i.e. in the linked VSCode issue that it is a dot. So to prepend a . in the filterText is useless, because it would just apply to all completions. And to (ab)use the filterText for sorting by prepending the . only to some of the completions also seems like a bad misuse to me. They should use sortText instead, because sortText is for sorting and filterText is for filtering. Also in the screenshot from your opening post, all of the completion items seem to have this dot (?)... and the dot is not even present in the file 🤷

Under "filtering" I understand that if you type a letter which is not present in the filterText and in the right order, the corresponding item gets hidden (for example if you typed rn, it could show two completions range and random, and if you then add a g, it will only keep range because rng doesn't match random anymore) That is what ST does with the "trigger". And if we make the "trigger" to be only a smaller part of the filterText, then some items that actually would match, could accidentally be hidden. The current startswith() logic ensures that this won't happen and we are on the safe side.

(In case of Volar the extra space after . seems like a unnecessary side-effect of Volar doing some code transformations when triggering completions from the <template> tag but otherwise the idea is the same.)

So based on that info, it looks like the purpose of . $attrs is not to improve filtering but sorting. So it would be fine if we didn't show the filterText in this specific case.

But then I would say that, if a server like Volar decides to ignore the purpose of the different CompletionItem fields, the "fix" for that should be in the LSP-Volar package, not in the LSP package.

@rchl
Copy link
Member Author

rchl commented Feb 11, 2023

I've been discussing this more in typescript-language-server/typescript-language-server#631 but the bottom line is that I'm fairly convinced now that what vscode is doing with filterText is not right. At least for other clients.

@rwols
Copy link
Member

rwols commented Feb 11, 2023

This issue has appeared before with gopls. It used the same string for all filterText fields to influence VSCode’s sorting behavior from what I can recall.

@rchl
Copy link
Member Author

rchl commented Feb 11, 2023

Are you able to dig out some more info (issues)? More data points could help to make the right decision here.

EDIT: Only able to find #1002

@rchl
Copy link
Member Author

rchl commented Feb 11, 2023

And to be fair, leading dot does also influence ST behavior as long as we use filterText as a trigger. Below are two completions: x without filterText and the other one with. The . is matched in the other one and make it rank higher.

Screenshot 2023-02-11 at 23 34 05

@predragnikolic
Copy link
Member

Are you able to dig out some more info (issues)?

Although not a question pointed to me,
but #976 seems related

And this comment here is useful

Here's an explanation from the maintainer of the protocol, on how it should work:
microsoft/language-server-protocol#898 (comment)

@rchl
Copy link
Member Author

rchl commented Feb 11, 2023

but #976 seems related

I'm not sure about that. It seems to be related to labels that span word boundaries and I don't think filterText would have any bearing on that.

And this comment here is useful

Kinda useful on understanding how filterText should work but kinda surprised that he said that completion item with textEdit and without filterText should never be filtered out. In his example this completion:

	item = CompletionItem.create('foo-text-range-insert');
	item.textEdit = TextEdit.insert(params.position, 'foo-text-range-insert');
	result.push(item);

shows up when typing b in the document...

@rchl
Copy link
Member Author

rchl commented Feb 12, 2023

The conclusion is that it's a job of the server to not set weird filterText values like that.

Also that ST should have a way to filter on a separate (invisible) text.

@rchl rchl closed this as not planned Won't fix, can't repro, duplicate, stale Feb 12, 2023
@jwortmann
Copy link
Member

In his example this completion [...] shows up when typing b in the document...

Yes, that seems weird, but I think it's not really relevant for us anyway, because the only information in the completion item and what we can therefore use as "trigger" in this case is the foo-text-range-insert. And even if we provide that as a completion to ST, it won't be shown because it's already filtered out by ST due to the prefix.

As far as I understand the linked comment, filterText seems only to be relevant if the completion changes what was typed as the prefix: "If the replace changes the text it most likely makes sense to specify a filter text to be used."

Like in the given example if you have typed .x-y and the dot gets removed and converted to ["x-y"]. But all the necessary information for that is already contained in the text edit range, and the server already knows what the typed prefix is, so I still don't see a reason why you would want to prepend the dot in the filterText in that particular case.

One other idea what we maybe could do to improve the "trigger", would be if label is contained in filterText and also matches the typed prefix, then we could strap away the part of the filterText which is to the left of the label. In this case we are still on the safe side, because the part to the left of what was already typed won't affect further filtering anyway. For example with {"filterText": ". $attrs", "label": "$attrs"} we could use $attrs for the trigger. And for the other example with

{"filterText": "import { readConfigFile$1 } from 'typescript';", "label": "readConfigFile"}

we would show readConfigFile$1 } from 'typescript';. Then you can still type readconfts without it being filtered out. So this could improve the presentation at least in some cases.
(typescript-language-server should at least remove the tab triggers $1 from the filterText by itself though).

@jwortmann
Copy link
Member

Any opinions on the idea from my last comment?

Proof of concept implementation (untested and might have an impact on performance):

Click to reveal
diff --git a/plugin/completion.py b/plugin/completion.py
index c7b9485..75e97ee 100644
--- a/plugin/completion.py
+++ b/plugin/completion.py
@@ -40,12 +40,22 @@ ResolvedCompletions = Tuple[Union[CompletionResponse, Error], 'weakref.ref[Sessi
 CompletionsStore = Tuple[List[CompletionItem], CompletionItemDefaults]
 
 
+def fuzzy_match(text: str, prefix: str) -> bool:
+    for letter in prefix:
+        idx = text.find(letter)
+        if idx == -1:
+            return False
+        text = text[idx+1:]
+    return True
+
+
 def format_completion(
     item: CompletionItem,
     index: int,
     can_resolve_completion_items: bool,
     session_name: str,
     item_defaults: CompletionItemDefaults,
+    prefix: str,
     view_id: int
 ) -> sublime.CompletionItem:
     # This is a hot function. Don't do heavy computations or IO in this function.
@@ -74,6 +84,13 @@ def format_completion(
             details.append(html.escape(lsp_label + lsp_label_detail))
         if lsp_label_description:
             details.append(html.escape(lsp_label_description))
+    elif " " in lsp_filter_text and lsp_label in lsp_filter_text and fuzzy_match(lsp_label, prefix):
+        trigger = lsp_filter_text[lsp_filter_text.find(lsp_label):]
+        annotation = lsp_detail
+        if lsp_label_detail:
+            details.append(html.escape(lsp_label + lsp_label_detail))
+        if lsp_label_description:
+            details.append(html.escape(lsp_label_description))
     else:
         trigger = lsp_filter_text
         annotation = lsp_detail
@@ -166,11 +183,13 @@ class QueryCompletionsTask:
     def __init__(
         self,
         view: sublime.View,
+        prefix: str,
         location: int,
         triggered_manually: bool,
         on_done_async: Callable[[List[sublime.CompletionItem], int], None]
     ) -> None:
         self._view = view
+        self._prefix = prefix
         self._location = location
         self._triggered_manually = triggered_manually
         self._on_done_async = on_done_async
@@ -232,7 +251,14 @@ class QueryCompletionsTask:
             config_name = session.config.name
             items.extend(
                 format_completion(
-                    response_item, index, can_resolve_completion_items, config_name, item_defaults, self._view.id())
+                    response_item,
+                    index,
+                    can_resolve_completion_items,
+                    config_name,
+                    item_defaults,
+                    self._prefix,
+                    self._view.id()
+                )
                 for index, response_item in enumerate(response_items)
                 if include_snippets or response_item.get("kind") != CompletionItemKind.Snippet)
         if items:
diff --git a/plugin/documents.py b/plugin/documents.py
index df80930..982e8ba 100644
--- a/plugin/documents.py
+++ b/plugin/documents.py
@@ -487,18 +487,18 @@ class DocumentSyncListener(sublime_plugin.ViewEventListener, AbstractViewListene
         triggered_manually = self._auto_complete_triggered_manually
         self._auto_complete_triggered_manually = False  # reset state for next completion popup
         sublime.set_timeout_async(
-            lambda: self._on_query_completions_async(completion_list, locations[0], triggered_manually))
+            lambda: self._on_query_completions_async(completion_list, prefix, locations[0], triggered_manually))
         return completion_list
 
     # --- textDocument/complete ----------------------------------------------------------------------------------------
 
     def _on_query_completions_async(
-        self, clist: sublime.CompletionList, location: int, triggered_manually: bool
+        self, clist: sublime.CompletionList, prefix: str, location: int, triggered_manually: bool
     ) -> None:
         if self._completions_task:
             self._completions_task.cancel_async()
         on_done = partial(self._on_query_completions_resolved_async, clist)
-        self._completions_task = QueryCompletionsTask(self.view, location, triggered_manually, on_done)
+        self._completions_task = QueryCompletionsTask(self.view, prefix, location, triggered_manually, on_done)
         sessions = list(self.sessions_async('completionProvider'))
         if not sessions or not self.view.is_valid():
             self._completions_task.cancel_async()

@rchl
Copy link
Member Author

rchl commented Feb 13, 2023

Not at the computer now but i have a feeling that ST does not report a . prefix. I believe prefix is empty when a dot is typed. Would that be a problem for your approach?

@rchl
Copy link
Member Author

rchl commented Feb 17, 2023

@jwortmann some related info that I wasn't quite aware of (maybe you were): typescript-language-server/typescript-language-server#631 (comment) (read till the end).

I've confirmed that behavior with VSCode.

@rchl
Copy link
Member Author

rchl commented Feb 17, 2023

It seems like a proper thing to do would be to in the first place get the text from the document that matches the textEdit replace range. With this information we would be equipped with missing bit that would allow us to pick better candidate for display. It seems fairly expensive to compute that for every completion that contains textEdit though.

@jwortmann
Copy link
Member

I was not exactly aware of that, but I've commented in that issue.
So as a conclusion, it is correct that the filterText should include these weird prefixes with . or other charcters, because it must start at the CompletionItem.textEdit.range.start, which determines the prefix (otherwise the completion item would be filtered out).

It seems like a proper thing to do would be to in the first place get the text from the document that matches the textEdit replace range. With this information we would be equipped with missing bit that would allow us to pick better candidate for display.

But I wonder what could we exactly do with that information?
It might still be useless to include the leading part (like .) in the trigger, because we have no control about what ST uses as prefix (and it indeed doesn't seem to include dots or other special leading symbols).

@rchl
Copy link
Member Author

rchl commented Feb 18, 2023

But I wonder what could we exactly do with that information?

Hmm, I haven't thought too much of that and now I'm thinking that maybe I was too optimistic. I was thinking that in a simple case where the range includes the period (made up example below) it would somehow help to pick the label as the one to show (since the dot doesn't contribute to filtering in ST anyway). But I don't see how we could do that now...

{
  "filterText": ".charAt",
  "insertText": ".charAt",
  "insertTextFormat": 1,
  "kind": 5,
  "label": "charAt",
  "sortText": "11",
  "textEdit": {
    "newText": ".charAt",
    "range": {
      "end": {
        "character": 1,
        "line": 0
      },
      "start": {
        "character": 2,
        "line": 0
      }
    }
  }
}

@jwortmann
Copy link
Member

I also have another criterion for the completion trigger:
If the filterText is very long, prefer label.

(Assuming that label is shorter, and where the exact amount of characters for what "very long" means still has to be determined/tested, but maybe around 40 characters or so). The reason is that ST will add lots of "..." into the trigger by itself, and even uses that as the internal text for matching.

For example with LSP-TexLab, and when the completion popup for a citation label appears, the filterText will include everything contained in the corresponding bibliography entry. For testing use some .bib file, for example

@book{SNPO08,
    author    = {Eduardo de Souza Neto and Djordje Perić and David Owen},
    title     = {Computational Methods for Plasticity},
    subtitle  = {Theory and Applications},
    publisher = {John Wiley \& Sons, Ltd},
    date      = {2008},
    isbn      = {978-0-470-69452-7}
}

and write in a LaTeX document

\documentclass{article}

\usepackage{biblatex}
\addbibresource{bibliography.bib}

\begin{document}

\cite{

Completion response:

{
    "isIncomplete": false,
    "items": [
        {
            "kind": 22,
            "preselect": false,
            "data": {
                "citation": {
                    "uri": "file:///D:/code/LaTeX/LspTest/bibliography.bib",
                    "key": "SNPO08"
                }
            },
            "textEdit": {
                "newText": "SNPO08",
                "range": {
                    "end": {
                        "character": 6,
                        "line": 7
                    },
                    "start": {
                        "character": 6,
                        "line": 7
                    }
                }
            },
            "label": "SNPO08",
            "filterText": "SNPO08 @book SNPO08 author Eduardo de Souza Neto and Djordje Perić and David Owen title Computational Methods for Plasticity subtitle Theory and Applications publisher John Wiley \\& Sons Ltd date 2008 isbn 978-0-470-69452-7",
            "sortText": "00 SNPO08 @book SNPO08 author Eduardo de Souza Neto and Djordje Perić and David Owen title Computational Methods for Plasticity subtitle Theory and Applications publisher John Wiley \\& Sons Ltd date 2008 isbn 978-0-470-69452-7"
        }
    ]
}

texlab

I guess the idea is that you don't need to remember the cryptic bib-entry key (SNPO08 in this case), but instead can type the book title or the author names to filter the suggestions. But due to Sublime's strange narrowing technique for the long completion trigger, it doesn't actually work in this editor. So I guess here it would be better to directly show the label ( = entry key) instead.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants