Pages with noindex meta still showing in sitemap #6247
Comments
I don't like this idea a lot 😅 Although we might also need to read HTML files for the RSS feed content with MDX, see #5664 (comment) (some code could be shared, need to take care of HTML output file patterns, etc.). Another possibility is to do something similar to the broken link checker: add some extra logic to the This could be generic, good enough, and transparent for the user, no extra API needed?
That sounds good as well! 👍
Hello, I'm still encountering this problem, unfortunately. Is it currently somehow possible to exclude certain pages from the sitemap? I would like to not have to edit them manually every time after the build process.
Have you read the Contributing Guidelines on issues?
Prerequisites
npm run clear or yarn clear command.
rm -rf node_modules yarn.lock package-lock.json and re-installing packages.
Description
It is bad practice to add noindex pages to sitemap:
Steps to reproduce
Add <meta name="robots" content="noindex"> to the head tag of a page.
Expected behavior
The sitemap doesn't include that page.
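As a rough illustration of that expectation, here is a minimal sketch of filtering routes before they are written to a sitemap. It assumes the built HTML of each route is already available as a string keyed by route path. The names hasNoIndexMeta and filterSitemapRoutes are hypothetical and are not part of the Docusaurus API, and the regex-based tag matching is a simplification that a real implementation would likely replace with an HTML parser.

```typescript
// Hypothetical helpers for illustration only; not Docusaurus APIs.

/**
 * Returns true if the HTML contains a <meta name="robots"> tag whose
 * content attribute lists the "noindex" directive.
 */
function hasNoIndexMeta(html: string): boolean {
  const metaTags: string[] = html.match(/<meta\b[^>]*>/gi) ?? [];
  return metaTags.some((tag) => {
    const nameMatch = /\bname\s*=\s*["']?([^"'\s>]+)/i.exec(tag);
    const contentMatch = /\bcontent\s*=\s*["']([^"']*)["']/i.exec(tag);
    if (!nameMatch || !contentMatch) {
      return false;
    }
    if (nameMatch[1].toLowerCase() !== "robots") {
      return false;
    }
    // The content attribute may hold several comma-separated directives.
    return contentMatch[1]
      .toLowerCase()
      .split(",")
      .some((directive) => directive.trim() === "noindex");
  });
}

/**
 * Keeps only the routes that should appear in the sitemap: those that are
 * not in the user-provided ignore list and whose HTML is not marked noindex.
 */
function filterSitemapRoutes(
  pages: Record<string, string>,
  ignoredRoutes: string[] = [],
): string[] {
  return Object.keys(pages).filter(
    (route) =>
      ignoredRoutes.indexOf(route) === -1 && !hasNoIndexMeta(pages[route]),
  );
}
```

The sketch covers both ideas raised in this issue: detecting noindex from the generated HTML, and an explicit list of routes to ignore.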
Actual behavior
It does: we don't filter routes by noindex.
We should definitely conform to the noIndex config option and not output a sitemap in that case. For individual pages, we should probably read the HTML files and filter those with noindex? Or should we ask the user to provide a list of routes to ignore? (I think we should have both.)
Your environment
No response
Reproducible demo
No response
Self-service