-
Notifications
You must be signed in to change notification settings - Fork 28
Fix scoring to favor exact substring matches #15
Conversation
c91837a
to
21e1861
Compare
FYI I force pushed a better version just now. My old way didn't work consistently, now I add a value to the finalized score for substring matches (like we do for basename matches), which yields more accurate results. |
@@ -112,6 +112,9 @@ describe "filtering", -> | |||
expect(bestMatch(['z_a_b', 'a_b'], 'ab')).toBe 'a_b' | |||
expect(bestMatch(['a_b_c', 'c_a_b'], 'ab')).toBe 'a_b_c' | |||
|
|||
it "weights matches that are substring matches higher", -> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
weighs
🎨
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I dunno, I think this actually makes more sense with weights
Also what's the etiquette on corrections? I just added a second commit to be safe, as if the person comments on a commit and not the PR diff, comments are lost when you force push, but you also don't want to merge in a PR full of unnecessary commits. Just curious what you guys prefer? Or do you just squash & merge manually? |
@brandonwamboldt In most cases, the Atom team doesn't care too much about how many commits are in a PR when merging. A few good examples of this are the recent pane-resize PR (24 commits) and the gutter API PR (53 commits). Your choice in the end though :). @mnquintana Generally it seems like |
Roger that, thanks. Also, to explain why I added 1 to the weight, which is 1 or lower at this point in the code, is that I think substring matches should always be ranked higher than non substring matches, and adding 1 is the only way to accomplish that. Now, substring matches are ranged from 1.0 to 2.0, and non substring matches are ranged from 0.0 to 1.0. However, I'm also adding a value if:
|
From the Apple dictionary:
I've definitely seen weight used in this way to refer to search rankings before. |
It is definitely correct to use weight (and hence weights) in this context. |
/cc @atom/feedback |
@@ -112,6 +112,9 @@ describe "filtering", -> | |||
expect(bestMatch(['z_a_b', 'a_b'], 'ab')).toBe 'a_b' | |||
expect(bestMatch(['a_b_c', 'c_a_b'], 'ab')).toBe 'a_b_c' | |||
|
|||
it "weighs matches that are substring matches higher", -> | |||
expect(bestMatch(['/a/b/c/install.txt', 'inst-all.txt'])).toBe '/a/b/c/install.txt' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All the other examples pass in an array of candidates and then a search string. All you're passing in here are candidates?
Could you add a test case for your real-world example that you mention in the initial comment? With "Settings View: Install Packages And Themes" being at place ten when searching for "install". Preferably with all intermediate lines in it as well. That's a good way to know that the patch addresses the right problem. |
So I made pretty radical changes to scoring and don't expect this to be accepted without tweaks, but I figured it would be a good way to get momentum on the problem.
We have an issue with exact substring matches being scored lower than non substring matches. My go to example is:
That's the sorted order when searching for "Install", clearly not what you'd expect. This changes the scoring system to heavily weight substring matches, so if you type a word exactly as it appears in the string, it will be weighted highly (nearly as a high as an exact match).
Should address atom/atom#6461 and atom/command-palette#28.