Should we keep the new boundary-aware scoring algorithm? #80
Comments
@rschmitt That's usually called "smart case", and it's pretty common in interactive search systems these days. E.g. built-in interactive search for both Emacs and Vim has had this for ages.
@jwhitley That rings a bell. I've gone ahead and implemented it in Heatseeker (rschmitt/heatseeker@7a3aa4b). So far it seems like an awesome improvement for languages that conventionally use CamelCase filenames--Java, Scala, Haskell, some C++, etc.
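For readers unfamiliar with the term, here is a minimal Python sketch of the usual smart-case rule: matching is case-insensitive unless the query contains an uppercase letter. The function name `smart_case_match` is hypothetical, and this is not taken from the Heatseeker commit above; it only illustrates the general idea on a plain subsequence match.

```python
def smart_case_match(query: str, choice: str) -> bool:
    """Subsequence match that is case-sensitive only if the query has an uppercase letter.

    Hypothetical helper for illustration; not Selecta's or Heatseeker's code.
    """
    if not any(ch.isupper() for ch in query):
        # All-lowercase query: ignore case in the choice.
        choice = choice.lower()
    remaining = iter(choice)
    # `ch in remaining` consumes the iterator, so characters must appear in order.
    return all(ch in remaining for ch in query)
```

With this rule, a lowercase query such as `fb` still matches `FooBar.java`, while `FB` matches `FooBar.java` but not `foobar.java`.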
Have to agree with @rschmitt here. So far the only issue I ran into.
I prefer the new scoring. I noticed that a query of Most of the time I imagine selecta is used to match file paths. File paths aren't uniformly weighted; the tail of the path is more specific, in a way, than the head (big-endian?). Therefore I was wondering about matching from the more specific to the less specific, i.e. from right to left. Clearly you can't just reverse the query and the choices and pass those to the scoring algorithm. I can't quite tell at the moment how to change the algorithm, and of course benchmarking might well rule it out. But I thought I'd mention it.
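To make the right-to-left idea concrete, here is a rough Python sketch that finds one candidate match by scanning the choice from its tail. The name `rightmost_match` is hypothetical, the approach is a plain greedy search, and it is not a proposal for how Selecta's scoring should change; it only locates a span and assigns no score.

```python
def rightmost_match(query, choice):
    """Greedy right-to-left subsequence match.

    Returns (start, end) indices of the matched span, or None if there is
    no match. Hypothetical helper for illustration only.
    """
    if not query:
        return None
    pos = len(choice)
    start = end = None
    for ch in reversed(query):
        # Find the last occurrence of ch strictly before the previous match.
        pos = choice.rfind(ch, 0, pos)
        if pos == -1:
            return None
        if end is None:
            end = pos
        start = pos
    return (start, end)
```

Note the limitation of being greedy: for the query `user` against `app/models/user.rb`, the final `r` is taken from `.rb` rather than from `user`, so the span is wider than the best possible one. A real implementation would need to consider more than one candidate.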
I see that the algorithm favors directory matches over file matches in certain conditions. Here's an example from a Chef project I'm working on: Note that PS: There are no more files in this example; all of them show up in this screenshot.
I think it's a general improvement. I'm still getting acquainted with the new behaviour, learning new "first hits", etc. The boundary-aware matching hasn't worked as I expected in a few cases:
I would expect
I would expect
A variant of the above, but I would definitely expect
I think this is similar to the case @airblade mentioned. I'm expecting If a primary use case of selecta is selecting files, then I think that matches "further" into the strings should have more weight, as the "deeper" you go the more specific the match is to that string. It might help if I give an example of where this approach definitely works. Imagine you've got a Rails project-like structure:
This splits on boundaries into something like:
When I search for something like
Currently we get:
If I refine the search to
Currently we get:
I hope that's in some way useful.
The UI now prints paths with the correct case; that was a silly little bug. I think that smart case seems like a good idea, but it sounds hairy and I want to put it off for a bit since it should be independent of these recent algorithm changes. Comments on left-vs-right in a moment.
I see two possible adjustments for left vs. right matching:
I think that (1) should definitely be done, but (2) may not be worth it. Comments on specific matching examples in yet another moment...
In @airblade's example of querying "banjo/app/models/user.rb" for "amuser", the score is 3 because the first character isn't considered for purposes of the boundary and sequential character bonuses. It definitely should be, but I didn't see an obvious way to implement it that way, so I cowardly punted on it. For @gshutler's examples, in order:
I think that we should:
I've made the scoring algorithm smarter about sequential matching characters and word boundaries (to improve results when querying for acronyms). It's merged to master, along with some other changes, in d874c99. The README contains a summary (search it for "algorithm").
I'd love to hear feedback from actual Selecta users, especially after you've used it on actual projects.
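As a rough illustration of what "boundary-aware" scoring with a sequential-character bonus can look like, here is a small Python sketch. It is not Selecta's implementation, and the weights are assumptions made up for this example: lower scores are better, the first matched character costs 1, a character directly following the previous match is free, a character on a word boundary costs 1, and any other character costs its distance from the previous match. The names `is_boundary`, `score_from`, and `best_score` are hypothetical. The README remains the reference for the real algorithm.

```python
def is_boundary(choice, i):
    """True if choice[i] starts a word: index 0 or preceded by a non-alphanumeric."""
    return i == 0 or not choice[i - 1].isalnum()


def score_from(query, choice, start):
    """Score a greedy left-to-right match that begins at `start`.

    Lower is better; returns None if the query does not match from here.
    Weights are assumptions for illustration, not Selecta's.
    """
    if choice[start] != query[0]:
        return None
    score, prev = 1, start
    for ch in query[1:]:
        pos = choice.find(ch, prev + 1)
        if pos == -1:
            return None
        if pos == prev + 1:
            pass                 # sequential character: no extra cost
        elif is_boundary(choice, pos):
            score += 1           # word-boundary character: flat cost
        else:
            score += pos - prev  # otherwise pay for the gap
        prev = pos
    return score


def best_score(query, choice):
    """Best (lowest) score over all starting positions, or None if no match."""
    query, choice = query.lower(), choice.lower()
    if not query:
        return 0
    scores = [
        s for i in range(len(choice))
        if (s := score_from(query, choice, i)) is not None
    ]
    return min(scores) if scores else None
```

Under these assumed weights, an acronym-style query such as `amu` scores better against `app/models/user.rb` (each letter lands on a boundary) than against a string like `xamxxxu`, where the last letter is reached only across a gap.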