Skip to content
This repository has been archived by the owner on Dec 15, 2022. It is now read-only.

Fix scoring to favor exact substring matches #15

Closed
wants to merge 2 commits into from

Conversation

brandonwamboldt
Copy link

So I made pretty radical changes to scoring and don't expect this to be accepted without tweaks, but I figured it would be a good way to get momentum on the problem.

We have an issue with exact substring matches being scored lower than non substring matches. My go to example is:

Find And Replace: Select All
Settings View: Install Packages And Themes

That's the sorted order when searching for "Install", clearly not what you'd expect. This changes the scoring system to heavily weight substring matches, so if you type a word exactly as it appears in the string, it will be weighted highly (nearly as a high as an exact match).

Should address atom/atom#6461 and atom/command-palette#28.

@brandonwamboldt
Copy link
Author

FYI I force pushed a better version just now. My old way didn't work consistently, now I add a value to the finalized score for substring matches (like we do for basename matches), which yields more accurate results.

@@ -112,6 +112,9 @@ describe "filtering", ->
expect(bestMatch(['z_a_b', 'a_b'], 'ab')).toBe 'a_b'
expect(bestMatch(['a_b_c', 'c_a_b'], 'ab')).toBe 'a_b_c'

it "weights matches that are substring matches higher", ->

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

weighs 🎨

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dunno, I think this actually makes more sense with weights

@brandonwamboldt
Copy link
Author

Also what's the etiquette on corrections? I just added a second commit to be safe, as if the person comments on a commit and not the PR diff, comments are lost when you force push, but you also don't want to merge in a PR full of unnecessary commits.

Just curious what you guys prefer? Or do you just squash & merge manually?

@winstliu
Copy link

@brandonwamboldt In most cases, the Atom team doesn't care too much about how many commits are in a PR when merging. A few good examples of this are the recent pane-resize PR (24 commits) and the gutter API PR (53 commits). Your choice in the end though :).

@mnquintana Generally it seems like weighs is to measure something while weights is to hold something down, so I'd lean towards weighs here.

@brandonwamboldt
Copy link
Author

Roger that, thanks.

Also, to explain why I added 1 to the weight, which is 1 or lower at this point in the code, is that I think substring matches should always be ranked higher than non substring matches, and adding 1 is the only way to accomplish that. Now, substring matches are ranged from 1.0 to 2.0, and non substring matches are ranged from 0.0 to 1.0.

However, I'm also adding a value if:

  • The substring match is at the beginning of the string (highest bonus)
  • The substring match is right after a path separator (second highest)

@mnquintana
Copy link
Contributor

@50Wliu: Generally it seems like weighs is to measure something while weights is to hold something down, so I'd lean towards weighs here.

From the Apple dictionary:

weight - verb [ with obj. ]
2) attach importance or value to

I've definitely seen weight used in this way to refer to search rankings before.

@thomasjo
Copy link

It is definitely correct to use weight (and hence weights) in this context.

@mnquintana
Copy link
Contributor

/cc @atom/feedback

@@ -112,6 +112,9 @@ describe "filtering", ->
expect(bestMatch(['z_a_b', 'a_b'], 'ab')).toBe 'a_b'
expect(bestMatch(['a_b_c', 'c_a_b'], 'ab')).toBe 'a_b_c'

it "weighs matches that are substring matches higher", ->
expect(bestMatch(['/a/b/c/install.txt', 'inst-all.txt'])).toBe '/a/b/c/install.txt'
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the other examples pass in an array of candidates and then a search string. All you're passing in here are candidates?

@walles
Copy link

walles commented Jul 6, 2015

Could you add a test case for your real-world example that you mention in the initial comment? With "Settings View: Install Packages And Themes" being at place ten when searching for "install". Preferably with all intermediate lines in it as well.

That's a good way to know that the patch addresses the right problem.

@walles
Copy link

walles commented Jul 9, 2015

Given that PR #22 has test cases for some nice things that this PR does not, I'd prefer #22 being merged over this one.

Or that this PR is updated with the tests for git push and installfrom #22.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants