Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sort code autocompletion options by similarity based on input #65655

Closed

Conversation

Mickeon
Copy link
Contributor

@Mickeon Mickeon commented Sep 11, 2022

This PR no longer adds. Maybe another time.

To see how the original algorithm sort-of-behaved, click here:

Added in the String class, the algorithm assigns a likeness value between 0.0 and 1.0 to the String calling it, relying on the base String parameter.

If either Strings are empty it returns 0.0.
If base is not a subsequence of the String it returns 0.0.

The algorithm aims to give higher scores to shorter Strings that contain the base's characters closer to the beginning of the String, and closer between each other. For example:

String base = "pos"

String("pos").assign_score(base);                        // 1.0
String("post").assign_score(base);                       // 0.938
String("position").assign_score(base);                   // 0.844
String("current_animation_position").assign_score(base); // 0.260
String("potato").assign_score(base);                     // 0.0

Showcase

BEFORE AFTER
image image
BeforeModulate AfterModulate
BeforeMocl AfterMocl
A short example of the options moving as values closer to the base are prioritised: And a mock-up of how this works, made in GDScript.
Vector.floor GDScript Mockup

To make it really brief, this PR uses a combination String.similiary, the category system introduced in a previous PR, and some filtering to yield more predictable results, instead of scattering every completion option at seemingly random.

It also gives much higher priority to strings that contain the base in full, closer to the beginning or are perfect matches,

I may add more mock-ups at a later time. But I'm only one person, not writing on Godot 4 full-time yet. As such...

I encourage people to please try this out by pulling this branch!

Final notes:

  • This PR moves the autocompletion sorting all the way after all options have been filtered out, instead of before. I feel like there's no point sorting values if some are going to be discarded. Please do tell if there is something I'm missing.
  • Known issues:
    • This PR no longer highlights the characters that are part of the search. image I do not know how to accomplish that right now.
  • I do need proper names for the internal function, do suggest, please!
  • If necessary I can open a proposal, but this is not stuff most users are particularly "savvy" about.
  • Special thanks to @fire and @KoBeWi for helping me using the custom iterator.
  • See Sort code autocompletion options by similarity based on input #65655 (comment)

Bugsquad edit: This closes #63706.

@Mickeon Mickeon requested review from a team as code owners September 11, 2022 13:38
@Mickeon Mickeon force-pushed the editor-autocompletion-sort-by-score branch 2 times, most recently from be5e503 to d79fda7 Compare September 11, 2022 13:56
@Mickeon
Copy link
Contributor Author

Mickeon commented Sep 11, 2022

This compiles locally but it doesn't in the checks...

@RedMser
Copy link
Contributor

RedMser commented Sep 11, 2022

This compiles locally but it doesn't in the checks...

I assume because you're including an editor header file from outside of the editor folder. So when making a tools=no build, the editor folder is not included in the compile and the build fails because of that.

@Mickeon
Copy link
Contributor Author

Mickeon commented Sep 11, 2022

I assume because you're including an editor header file from outside of the editor folder. So when making a tools=no build, the editor folder is not included in the compile and the build fails because of that.

Ah, oh boy. This one is going to be... interesting to solve. Oh no...

@Mickeon Mickeon force-pushed the editor-autocompletion-sort-by-score branch from d79fda7 to 8bcebc6 Compare September 11, 2022 14:35
@Mickeon
Copy link
Contributor Author

Mickeon commented Sep 11, 2022

Hastily attempting to address it so that this can be tested as soon as possible.

@RedMser
Copy link
Contributor

RedMser commented Sep 11, 2022

I'm very glad you're tackling this long standing issue. But this current solution still does not seem ideal and should probably combine multiple methods of assigning score.

For example, in a script extending Node, autocompletion for tree looks like this:

image

It's great that it finds matches that use the same letters in the same order. But the algorithm should definitely prioritize a full substring match (tree is found as-is in get_tree), as well as further prioritizing begins_with matches (so the tree_* signals should be first).

@Zireael07
Copy link
Contributor

@RedMser how autocompletion should work has been bikeshedded to death or close to it. Some favor 'full matches' as you do, some favor what @Mickeon did here and personally I think it needs a follow-up PR and an option switch

@Mickeon
Copy link
Contributor Author

Mickeon commented Sep 11, 2022

It's great that it finds matches that use the same letters in the same order. But the algorithm should definitely prioritize a full substring match (tree is found as-is in get_tree), as well as further prioritizing begins_with matches (so the tree_* signals should be first).

It would somewhat do as you say, if the implementation of #59633 were to be completely replaced with this method. But instead, I attempted combining the two.

Currently, and before, the order is kind of like this.
[Node2D, Node properties], then [Node2D, Node methods], then [Node2D, Node constants] ...
So each option is sorted by score only in between the same kind.

What I could experiment with, is something along the lines of... "If the score is high enough, it ignores the kind of option and brings it all the way up the list", But honestly, right now I'm very fearful because some checks just do not compile! Honestly the screenshot you showed me is personally odd already, I think I broke something while attempting to make this compile.

@RedMser
Copy link
Contributor

RedMser commented Sep 11, 2022

and personally I think it needs a follow-up PR and an option switch

So each option is sorted by score only in between the same kind and the same class

This indeed is probably the kind of thing that should introduce an editor setting or two (group by inheritance level, group by field/method/constant, etc.). Tbh I had no idea how it currently worked, so maybe my suggestion does not work well and would need a greater rewrite which is a lot to ask for. 😅
Your changes are already a large improvement since the list shows more plausible matches.

right now I'm very fearful because some checks just do not compile

@Mickeon I'm not really well-versed with C++ but you could try removing the _FORCE_INLINE_ for the comparator function and/or putting the function body directly into the header file. Seems like every comparator in the code base is implemented a bit differently in that regard.

scene/gui/code_edit.cpp Outdated Show resolved Hide resolved
@Mickeon Mickeon marked this pull request as draft September 11, 2022 19:21
core/string/ustring.cpp Outdated Show resolved Hide resolved
@Mickeon Mickeon force-pushed the editor-autocompletion-sort-by-score branch from 8bcebc6 to d308483 Compare September 12, 2022 06:55
@Mickeon
Copy link
Contributor Author

Mickeon commented Sep 12, 2022

@RedMser I have updated the calculation.
There's a large penalty the further away the base's characters are from each other:
GetTree
The score actually goes below 0.0, which shouldn't be the case (I don't want it to be the case), but it does improve the order of results considerably.
At some point later I will experiment with outright discarding any option whose score is below a certain threshold (basically humanly impossible to discern how the base is inside).

@Mickeon Mickeon force-pushed the editor-autocompletion-sort-by-score branch from d308483 to a3c87ab Compare September 12, 2022 08:07
@Mickeon Mickeon requested review from bruvzg and KoBeWi and removed request for bruvzg and KoBeWi September 12, 2022 08:12
@EricEzaM
Copy link
Contributor

EricEzaM commented Jan 9, 2023

Ok I have revisited this again. Found more stuff that I may have overlooked in my initial PR, or has been changed since which messed some things up. diff on this PR: https://github.com/Mickeon/godot/compare/editor-autocompletion-sort-by-score...EricEzaM:65655?expand=1

If you want to test, try the build artefact from here when it finishes: https://github.com/EricEzaM/godot/actions/runs/3874969199
See below for my testing results using peoples responses in this thread. It looks ok to me but there may be some edge cases.

Test results for every example in this thread: [toggle - long image]

image

Also fixes #71059
godot windows editor dev x86_64_64kbk3G1k2

P.S. 'Container Sizing' should not be there in 2nd capture. That is a property group. Issue for another PR though.

@ajreckof
Copy link
Member

ajreckof commented Jan 16, 2023

212613507-11c63726-5d3f-4c5e-a8c0-8389a1bb25fe

in this screenshot, fourth results seems inappropriate. I don't know why this one was ranked so high. Is it because it ends with te? if so i don't think ending with a match should be ranked as high as starting with a match

@EricEzaM
Copy link
Contributor

@ajreckof it's because it is a local function on the class.

It's true that it might be better for things which are local but are bad matches to be pushed lower.

@ajreckof
Copy link
Member

yeah it feels conter productive for them to show when they are not that much pertinent. I think results should first be sorted for match and on same level of match being sorted by scope.

@Kakiroi
Copy link

Kakiroi commented Jan 16, 2023

image
Is there a way to remedy this? I don't think it's that useful for the first match to come up for such common abbreviation. IF I wanted the first match, I would search 'Visual' first. Maybe give more points for entry that has matched letters from the start? (or early?)

Funny because, v3 actually gives Vector3 as first entry. Feels like shortening words give more accurate result.

@EricEzaM
Copy link
Contributor

EricEzaM commented Jan 16, 2023

That is what this work attempts to resolve.

Oh I see, that is with this PR? I'll take a look.

@ajreckof
Copy link
Member

ajreckof commented Jan 16, 2023

maybe there should be a penalty by how many times it is separated. The more parts there is to find the word the farther it should be ? this would be a more general way to differentiate than just it is in one part it is in multiple parts (don't know how hard it would be to implement sorry)

@EricEzaM
Copy link
Contributor

EricEzaM commented Jan 18, 2023

I have tried to resolve the feedback above. I ended up adding a levenshtein distance implementation that I pinched from here (wikibooks - cc0). I more or less understand how it works but there is a nice implementation already documented so we're standing on the shoulders of giants etc etc

I also got rid of the sorting based on 'kind' (constant, function, signal, etc) it probably added additional complexity without measurable benefit.

I kind of hate how there are some 'magic numbers' in the implementation though....

Test results for every example in this thread: [toggle - long image]

y0BzDm72Qg

Note one annoying thing is that the text that is highlighted is not actually the sections of text that were 'matched' by the sorting algo. The thing that the highlights uses is just a subsequence search so it sometimes highlights the wrong thing (e.g. get_tree above - the search algo actually sorted it based on tree exactly matching.

download (once action is done running)
diff with the current state of this PR
main sorting algo location in code

ps @Kakiroi that long thing was being ranked higher since it had vec3 as an exact match where nothing else did. So I added some stuff to disregard exact match if it is very far away from the start.

@Kakiroi
Copy link

Kakiroi commented Jan 19, 2023

Phenomenal. Already feels so much better. Also good call on removing sorting by type.
image

I'm assuming class is ranked lower? Or does it ignores capitalization? Or maybe this is intended?
Also, it is making this error whenever I type "ti".
image

Good luck, chief.

Mickeon and others added 4 commits January 29, 2023 22:00
To make it really brief, uses a combination `String.similiary`, the category system introduced in a previous PR, and some filtering to yield more predictable results, instead of scattering every completion option at seemingly random.

It also gives much higher priority to strings that contain the base in full, closer to the beginning or are perfect matches.

Also moves CodeCompletionOptionCompare to code_edit.cpp
… to improve as many different use cases as possible
@YuriSizov YuriSizov modified the milestones: 4.0, 4.1 Feb 10, 2023
@Mickeon
Copy link
Contributor Author

Mickeon commented Feb 12, 2023

@EricEzaM Could you guide me to bring this PR to the way it is done in your branch? A git patch, perhaps?

@EricEzaM
Copy link
Contributor

I think my branch is based on this branch so if you pull it down it should work. I have done that before with other people branches (non-pr).

Anyway, @ajreckof expressed interest in investing a solution further since while I was making progress on it, I was still not satisfied and perhaps others would have better ideas. Their solution is looking pretty good, but not polished up yet. @ajreckof Maybe this is a decent time to share your work?

@Mickeon Mickeon force-pushed the editor-autocompletion-sort-by-score branch from c8ea22d to 30cc0bb Compare February 12, 2023 12:01
@ajreckof
Copy link
Member

ajreckof commented Feb 12, 2023

So here is my work https://github.com/ajreckof/godot/tree/editor-autocompletion-sort-by-rules. It is based on your branch too so if you feel like adding it you can always just pull from it.

What it does is mostly make the match shown the correct ones. Which subsequence is kept is based on the ordering so that if later on we modify the way matches are ranked the right match will still show the best one.
Then it improves and simplifies the ordering by relying only on the shown match.

@akien-mga
Copy link
Member

Superseded by #75746. Thanks for the contribution nevertheless!

@akien-mga akien-mga closed this Jun 8, 2023
@Mickeon Mickeon deleted the editor-autocompletion-sort-by-score branch December 30, 2023 11:50
@AThousandShips AThousandShips removed this from the 4.1 milestone Dec 30, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Autocomplete matching should be case insensitive, otherwise case mismatch leads to confusing results