Better search results #1067
Comments
From dilettant on 2013-01-02 12:25:45+00:00 Hi Hernan, thanks a lot for transforming the mail thread to improve the local search facility into real code suggestions. Just one early comment w.r.t. the above linked patch 01-* (the indexer) at hunk:
I would expect not the logic present, but either: A) B) Or am I totally wrong here? All the best, |
From Hernan Grecco on 2013-01-02 13:04:45+00:00 As the code is testing for class (instead of using
Thanks for the feedback. |
From Hernan Grecco on 2013-01-03 03:49:07+00:00 Patches to change the scoring of the search results. |
From Hernan Grecco on 2013-01-03 03:58:11+00:00 I have implemented the last patch, adding support for pluggable scoring mechanism. A simple scorer for the Python Docs could look like this. The output is current, proposed, comparison. There is a lot of room for tweaking the values but I think that the results are promising. |
From Georg Brandl on 2013-01-03 07:45:54+00:00 Thanks for this effort, Hernan. I will have a look shortly; in the meantime, could you resubmit in the form of a pull request? There it is possible to comment on the patches inline, and easier to update the changes for revisioning. |
From Hernan Grecco on 2013-01-03 10:55:26+00:00 Hi Georg, I am happy to help. I have just seen that you have merged. That was really fast! I was preparing a reorganization of the commits into more logical parts (a few things become clearer once you finish them). In case you are still interested, they are in my fork of the repo. It is still the same code. The only difference is that the scorer I built for the Python Docs is now the default. |
From Georg Brandl on 2013-01-03 20:45:13+00:00 Hi Hernan, I've merged the first two patches and fixed a few issues while doing that. I'm currently merging patch 3, so I'll have a look at your repo for the new scorer. |
From Hernan Grecco on 2013-01-03 22:39:40+00:00 I think that the Scorer in my branch should be the default one, as it performs much better than the previous one. In addition, using this scorer means that a commit in the cpython tree will not be necessary. Be aware that any issues you have found (like the one fixed by c1e2c90) will also be present there. Let me know if I can help. |
This proposal is motivated by the desire to have better search results in the Python Docs. See mail thread
While a Google search will always yield better results, I think that there is room for improvement without increasing the complexity of the Sphinx codebase. However, what constitutes a good result might depend on the project using Sphinx. Therefore, the proposed solution has a JavaScript snippet that can be inserted in searchtool.js in a similar way to the language-related code (stemmer, stop words).
This modification increases the size of the index by between 2% and 10%, depending on the project docs (I tested Python, Sphinx and Flask). The time to generate the index does not change significantly, at least when the docs are generated from scratch. See patch 01.
2.- Modify the search tool to create a single result set (instead of the current 4: regular, important, unimportantResults, objectResults). Each result has an associated score. Sort by score before presenting the results. This modification does not seem to change the search time significantly, but I would be happy if somebody could provide better stats. See patch 02.
3.- Create a pluggable JavaScript scoring mechanism that can be easily changed by the projects using Sphinx (e.g. in the theme or in conf.py) (ToDo)
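To make points 2 and 3 more concrete, here is a minimal sketch of what a single scored result set with a swappable scorer could look like. It is not the code from the patches: the names `ExampleScorer` and `rankResults`, the hit format, the weights, and the choice to sum the weights per document are all assumptions made for illustration.

```javascript
// Hypothetical scorer object. A project could replace it with its own.
// All weights are illustrative assumptions, not the values from the patches.
var ExampleScorer = {
  objectMatch: 10, // query matches the name of a documented object
  titleMatch: 15,  // query term appears in a document title
  termMatch: 5,    // query term appears in the document body
  // Optional hook: lets a project adjust the final score of a result.
  adjust: function (result) {
    return result.score;
  }
};

// Merge hits of every kind into one result set and sort it by score.
// Each hit is {docname, title, kind}, with kind "object", "title" or "term".
// Assumption: a document's score is the sum of the weights of its hits.
function rankResults(hits, scorer) {
  var byDoc = Object.create(null);
  hits.forEach(function (hit) {
    var weight = scorer[hit.kind + "Match"] || 0;
    if (!byDoc[hit.docname]) {
      byDoc[hit.docname] = {docname: hit.docname, title: hit.title, score: 0};
    }
    byDoc[hit.docname].score += weight;
  });

  var results = Object.keys(byDoc).map(function (docname) {
    var result = byDoc[docname];
    if (typeof scorer.adjust === "function") {
      result.score = scorer.adjust(result);
    }
    return result;
  });

  // Highest score first; ties are broken alphabetically by title.
  results.sort(function (a, b) {
    if (b.score !== a.score) {
      return b.score - a.score;
    }
    return a.title < b.title ? -1 : (a.title > b.title ? 1 : 0);
  });
  return results;
}
```

With this shape, a project would only need to ship its own scorer object (different weights or a different `adjust` hook) and the ranking code itself would stay untouched.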
This plan seems promising, but I would appreciate some feedback before moving on.
Hernan