@s-pace Oh ok. Thanks! I created facebook/docusaurus#765 to hopefully solve the sitemap issue that has been going on. |
Thanks @s-pace docsearch-configs/configs/docusaurus.json Lines 27 to 32 in d12e5f9
facebook/docusaurus#765 should help. It can avoid duplicates for other crawlers as well (e.g. Google). |
Yes indeed @endiliey, the EDIT: It also prevents us from crawling web pages not ending with docsearch-configs/configs/docusaurus.json Line 32 in d12e5f9
|
Are we good now, @s-pace? I'm currently away, but if you still need redirections on /docs/ I can try to implement them next week. |
👋 @endiliey, thank you for being so available. I had to change the regex to its "opposite" since We are fine for the main pages, and if you only want to search through the English pages so far. In order to have proper filtering for versions and languages, we should use meta tags as described in facebook/docusaurus#744 (comment), and also a sitemap that exposes the URLs for every language. Do you think it is doable to have the meta tags and a sitemap with every link for the different languages? That way every page will have its full context embedded and will be clearly referenced, with no need to rely only on URLs. Let me know. cc @JoelMarcey |
Naturally, we want to be able to search the correct language and version of the Docusaurus pages, depending on the version and language the user is currently on.
I am okay with relying on URLs because it seems to work so far for many Docusaurus users (see the examples above). Docusaurus is used by many websites, so if we had to use meta tags then all users of Docusaurus might need to change their docsearch config. I'm trying to avoid having too many changes. But if the change is really necessary then we can try to work on it. The next thing I want to talk about is that our sitemap actually exposes URLs for other languages through Refer to You can use Chrome developer tools because the Chrome XML viewer is wrong. What do you think, @s-pace? |
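For illustration only, here is a minimal sketch of how a scraper could group sitemap URLs by language. It assumes the sitemap annotates each `<url>` entry with `xhtml:link rel="alternate"` elements carrying an `hreflang` attribute; the sample XML, the `example.com` URLs, and the `urls_by_language` helper are hypothetical and are not taken from the Docusaurus sitemap or the DocSearch scraper.

```python
from collections import defaultdict
import xml.etree.ElementTree as ET

# Namespace URIs used by the sitemap protocol and by hreflang annotations.
NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "xhtml": "http://www.w3.org/1999/xhtml",
}

# Hypothetical sample (assumption): a real sitemap may be shaped differently.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <url>
    <loc>https://example.com/docs/en/installation</loc>
    <xhtml:link rel="alternate" hreflang="en"
                href="https://example.com/docs/en/installation"/>
    <xhtml:link rel="alternate" hreflang="fr"
                href="https://example.com/docs/fr/installation"/>
  </url>
</urlset>"""


def urls_by_language(sitemap_xml):
    """Return {language: [urls]} built from the alternate links of each <url>."""
    grouped = defaultdict(list)
    root = ET.fromstring(sitemap_xml)
    for url in root.findall("sm:url", NS):
        for link in url.findall("xhtml:link", NS):
            lang = link.get("hreflang")
            href = link.get("href")
            # Skip links that are not language alternates or are incomplete.
            if link.get("rel") == "alternate" and lang and href:
                # Avoid listing the same URL twice for one language.
                if href not in grouped[lang]:
                    grouped[lang].append(href)
    return dict(grouped)


if __name__ == "__main__":
    for lang, urls in urls_by_language(SAMPLE_SITEMAP).items():
        print(lang, urls)
```

With a mapping like this, a crawler could pick its start URLs per language instead of inferring the language from the URL path.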
The current behaviour works on the sites you have mentioned because we only scrape one version at a time. You can have a look at the reason config, for example. Since
Editing the config is most likely to happen anyway, since changing the outcome requires changing the customised part. Regarding the sitemap, handling this extra feature might indeed be a good way forward. We need to investigate such a feature for the scraper. I will keep you posted about this one. |
@s-pace thanks for explaining. It seems that the problem is mostly on sites with many versions and many languages, like Docusaurus itself. What do you think, @JoelMarcey? If we go ahead with the metadata tag, I think we should agree on how it should be formatted. @s-pace, would something like this be sufficient?
Another example
|
Given the subset of Docusaurus sites that use languages or versioning, and the even smaller number that use both, I am ok adding the This wouldn't be a breaking change, but we could announce, when this is implemented, that to get full-fidelity search results, users should update their docsearch config appropriately. |
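As a rough sketch of the meta tag idea discussed above, the snippet below reads a page's language and version from its `<meta>` tags rather than from its URL. The `docsearch:` name prefix, the tag names, the sample page, and the `page_context` helper are all assumptions made for illustration; the thread does not fix a format here.

```python
from html.parser import HTMLParser

# Hypothetical page (assumption): tag names and values are illustrative only.
SAMPLE_PAGE = """<html><head>
<meta name="docsearch:language" content="en">
<meta name="docsearch:version" content="1.0.0">
<meta name="viewport" content="width=device-width">
</head><body><h1>Installation</h1></body></html>"""


class DocsearchMetaParser(HTMLParser):
    """Collects <meta name="docsearch:*" content="..."> pairs from a page."""

    PREFIX = "docsearch:"  # assumed prefix, not confirmed by the thread

    def __init__(self):
        super().__init__()
        self.context = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name") or ""
        content = attributes.get("content")
        # Keep only the tags that carry the assumed prefix and a value.
        if name.startswith(self.PREFIX) and content is not None:
            self.context[name[len(self.PREFIX):]] = content


def page_context(html):
    """Return e.g. {"language": ..., "version": ...} for one HTML page."""
    parser = DocsearchMetaParser()
    parser.feed(html)
    return parser.context


if __name__ == "__main__":
    print(page_context(SAMPLE_PAGE))
```

Each indexed record could then carry this context as attributes, so filtering would not depend on the URL structure.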
Definitely a 💯 Feel free to point me to where you need help. |
You will have to update the search UI in order to restrict the scope of the search to the right facetFilters. |
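To make the facetFilters point concrete, here is a minimal sketch of building the search parameters for the language and version the user is currently on. `facetFilters` is an Algolia search parameter that takes `attribute:value` strings, combined with AND at the top level; the attribute names `language` and `version` and the `build_search_params` helper are hypothetical and would have to match whatever the scraper actually indexes.

```python
def build_search_params(language, version):
    """Build search parameters scoped to one language and one version.

    The attribute names "language" and "version" are assumptions; they must
    match the attributes declared as facets in the index.
    """
    return {
        # Top-level entries are ANDed: both conditions must hold for a hit.
        "facetFilters": [
            f"language:{language}",
            f"version:{version}",
        ]
    }


if __name__ == "__main__":
    print(build_search_params("fr", "1.0.0"))
```

The search UI would pass a dictionary like this along with each query, so results stay within the page's own language and version.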
Reverts #453
@JoelMarcey @endiliey
We already avoid duplicates. The redirection is not needed thanks to your sitemap (we scrape it in order to find an available link).
ref facebook/docusaurus#744