Create a sitemap.xml file #318

Closed
tunetheweb opened this issue Nov 4, 2019 · 3 comments · Fixed by #345

@tunetheweb
Member

tunetheweb commented Nov 4, 2019

We should create a sitemap.xml file per supported language (e.g. https://almanac.httparchive.org/sitemap-en.xml for English) as part of the SEO recommendations discussed in #286.

Each sitemap should then be listed in the robots.txt being generated as part of #293, like so (multiple sitemaps are allowed):

User-agent: *
Allow: /
Sitemap: https://almanac.httparchive.org/sitemap-en.xml
Sitemap: https://almanac.httparchive.org/sitemap-jp.xml
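
For illustration, the robots.txt above could be produced from the list of supported languages. This is a minimal sketch only: the function name `build_robots_txt` and the hard-coded language list are assumptions made for the example, not part of the Almanac codebase.

```python
# Hypothetical helper: emits one Sitemap line per language code.
BASE_URL = "https://almanac.httparchive.org"


def build_robots_txt(languages):
    """Return robots.txt content that allows all crawlers and lists each sitemap."""
    lines = ["User-agent: *", "Allow: /"]
    for lang in languages:
        # File naming follows the sitemap-<lang>.xml pattern proposed in this issue.
        lines.append(f"Sitemap: {BASE_URL}/sitemap-{lang}.xml")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_robots_txt(["en", "jp"]), end="")
```

Adding a language would then only require extending the list passed in.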

Each sitemap should use the following format, with one <url> entry per page for that language:

<urlset xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">
<url>
  <loc>https://almanac.httparchive.org/en/2019/</loc>
  <lastmod>2019-11-04</lastmod>
</url>
<url>
  <loc>https://almanac.httparchive.org/en/2019/table-of-contents</loc>
  <lastmod>2019-11-04</lastmod>
</url>
<!-- ...etc. for the rest of the non-chapter pages -->
<url>
  <loc>https://almanac.httparchive.org/en/2019/markup</loc>
  <lastmod>2019-11-04</lastmod>
</url>
<!-- ...etc. for the rest of the chapter pages -->
</urlset>

The lastmod date can be derived from the date discussed in #317, so the sitemap could either be generated dynamically on request or built as part of npm run generate.
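
A rough sketch of how a per-language sitemap in the format above might be generated. It assumes the page slugs and their last-modified dates are already available as a dictionary; `build_sitemap` and the sample `pages` mapping are hypothetical names for illustration, and where the dates actually come from (#317) is not modelled here.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
BASE_URL = "https://almanac.httparchive.org"


def build_sitemap(lang, year, pages):
    """Build a <urlset> document for one language and year.

    pages maps a page slug ("" for the home page) to its lastmod date
    as a YYYY-MM-DD string. Both are assumed inputs for this sketch.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for slug, lastmod in pages.items():
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = f"{BASE_URL}/{lang}/{year}/{slug}"
        ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    pages = {
        "": "2019-11-04",
        "table-of-contents": "2019-11-04",
        "markup": "2019-11-04",
    }
    print(build_sitemap("en", 2019, pages))
```

The same function could be called at request time or from a build step, depending on which of the two options is chosen.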

@rviscomi rviscomi added development Building the Almanac tech stack enhancement New feature or request SEO SEO related labels Nov 4, 2019
@rviscomi rviscomi added the translation world wide web label Nov 4, 2019
@rviscomi rviscomi added this to the SHIP IT! milestone Nov 4, 2019
@AymenLoukil
Contributor

AymenLoukil commented Nov 4, 2019

@bazzadp I would prefer to have a sitemap index that lists all the sitemaps.
So in the robots.txt we put just the sitemap index URL.
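
As a sketch of that alternative, a sitemap index is a `<sitemapindex>` document with one `<sitemap>` entry per child sitemap, as defined by the sitemaps.org protocol. The helper name `build_sitemap_index` and the per-language file names are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
BASE_URL = "https://almanac.httparchive.org"


def build_sitemap_index(languages):
    """Return a sitemap index listing one per-language sitemap file."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for lang in languages:
        sitemap = ET.SubElement(index, "sitemap")
        # Assumes the sitemap-<lang>.xml naming proposed earlier in this issue.
        ET.SubElement(sitemap, "loc").text = f"{BASE_URL}/sitemap-{lang}.xml"
    return ET.tostring(index, encoding="unicode")


if __name__ == "__main__":
    print(build_sitemap_index(["en", "jp"]))
```

With this approach robots.txt would carry a single Sitemap line pointing at the index file.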

I wouldn't add the <changefreq></changefreq> element (it is not used by Google and not useful in our case for the other engines).

So in sitemap-en.xml, will we loop through the pages for all years?

@tunetheweb
Member Author

> @bazzadp I would prefer to have a sitemap index that lists all the sitemaps.
> So in the robots.txt we put just the sitemap index URL.

Either works for me. Whichever is easiest to implement.

> I wouldn't add the <changefreq></changefreq> element (it is not used by Google and not useful in our case for the other engines).

Fair enough - removed.

> So in sitemap-en.xml, will we loop through the pages for all years?

Good question and not sure. Could have sitemap-en-2019.xml as well if easier.

@tunetheweb tunetheweb self-assigned this Nov 6, 2019
@tunetheweb tunetheweb changed the title Create a sitemap.xml file per supported language Create a sitemap.xml file Nov 6, 2019
@tunetheweb
Member Author

I've done this as one sitemap.xml file and it handles all years and languages.
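
To illustrate the single-file approach, one way to cover every language and year in a single sitemap is to iterate over their combinations. This is a sketch under stated assumptions, not the implementation merged in #345: `build_combined_sitemap`, the shared `lastmod` value, and the assumption that every page exists in every language and year are all simplifications made for the example.

```python
import xml.etree.ElementTree as ET
from itertools import product

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
BASE_URL = "https://almanac.httparchive.org"


def build_combined_sitemap(languages, years, slugs, lastmod):
    """Return one <urlset> with an entry for each language/year/page combination."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for lang, year, slug in product(languages, years, slugs):
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = f"{BASE_URL}/{lang}/{year}/{slug}"
        # Simplification: a single date for all pages; real dates would vary per page.
        ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    print(build_combined_sitemap(["en", "jp"], [2019], ["", "markup"], "2019-11-04"))
```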
