
Generate Sitemap and upload to S3 using the aws-sdk gem

Joe Masilotti edited this page Dec 28, 2021 · 3 revisions

Using AWS SDK

Add aws-sdk to Gemfile

gem 'aws-sdk'

The SitemapGenerator::AwsSdkAdapter uses aws-sdk. AwsSdkAdapter takes the bucket name as its first argument:

:bucket_name

followed by these options:

:aws_access_key_id,
:aws_secret_access_key,
:aws_region

Initialize AwsSdkAdapter in the config/sitemap.rb configuration file:

SitemapGenerator::Sitemap.adapter = SitemapGenerator::AwsSdkAdapter.new(<bucket-name>,
                                     aws_access_key_id: <your-access-key-id>,
                                     aws_secret_access_key: <your-access-key>,
                                     aws_region: <your-aws-region e.g. us-west-2>)
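
To keep credentials out of version control, the options can be read from environment variables instead of being written into config/sitemap.rb. The following is a minimal sketch, not part of the gem: the helper name aws_adapter_options is hypothetical, and the variable names AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY and AWS_REGION are an assumption (they follow common AWS naming, but the adapter does not require them).

```ruby
# Hypothetical helper: builds the keyword options for AwsSdkAdapter from
# environment variables. The variable names are an assumption; rename as needed.
def aws_adapter_options(env = ENV)
  {
    # fetch raises KeyError if a required variable is missing
    aws_access_key_id: env.fetch("AWS_ACCESS_KEY_ID"),
    aws_secret_access_key: env.fetch("AWS_SECRET_ACCESS_KEY"),
    # the fallback region is only an example value
    aws_region: env.fetch("AWS_REGION", "us-west-2")
  }
end

# In config/sitemap.rb (requires the sitemap_generator gem):
#   SitemapGenerator::Sitemap.adapter =
#     SitemapGenerator::AwsSdkAdapter.new("example-bucket-name", **aws_adapter_options)
```

Using fetch without a default for the two keys means a missing variable fails immediately with a KeyError rather than later during the upload.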

The complete config/sitemap.rb should look like this:

require 'rubygems'
require 'sitemap_generator'
# Set the host name for URL creation
SitemapGenerator::Sitemap.default_host = "http://www.example.com"
SitemapGenerator::Sitemap.sitemaps_host = "http://s3.ap-south-1.amazonaws.com/example-bucket-name/"
SitemapGenerator::Sitemap.public_path = 'tmp/'
SitemapGenerator::Sitemap.sitemaps_path = 'sitemaps/'

SitemapGenerator::Sitemap.adapter = SitemapGenerator::AwsSdkAdapter.new("example-bucket-name", 
                                   aws_access_key_id: 'xxxxxxxxxxxxxxxxxx',
                                   aws_secret_access_key: 'xxxxxxxxxxxxxxxxxxxxxxx',
                                   aws_region: 'ap-south-1')

SitemapGenerator::Sitemap.create do
  # Put links creation logic here.
  #
  # The root path '/' and sitemap index file are added automatically for you.
  # Links are added to the Sitemap in the order they are specified.
  #
  # Usage: add(path, options={})
  #        (default options are used if you don't specify)
  #
  # Defaults: :priority => 0.5, :changefreq => 'weekly',
  #           :lastmod => Time.now, :host => default_host
  #
  # Examples:
  #
  # Add '/articles'
  #
  #   add articles_path, :priority => 0.7, :changefreq => 'daily'
  #
  # Add all articles:
  #
  #   Article.find_each do |article|
  #     add article_path(article), :lastmod => article.updated_at
  #   end

  add root_path, :changefreq => 'daily', :priority => 0.9
end

Run this command from your project folder to create the sitemap:

rake sitemap:create                              # Generate sitemaps but don't ping search engines

Run this command from your project folder to update the sitemap:

rake sitemap:refresh                             # Generate sitemaps and ping search engines

Output:

The gzipped sitemap file will be uploaded to AWS S3 and will be available at:

http://s3.ap-south-1.amazonaws.com/example-bucket-name/sitemaps/sitemap.xml.gz
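
The URL above is the sitemaps_host, the sitemaps_path and the sitemap file name joined together. As a rough illustration of how the settings in config/sitemap.rb map to the final location, here is a small sketch; expected_sitemap_url is a hypothetical helper written for this page, not a method of the gem, and the default file name sitemap.xml.gz is taken from the example output.

```ruby
# Hypothetical helper: joins sitemaps_host, sitemaps_path and the file name
# into the public URL, normalising stray slashes at the joins.
def expected_sitemap_url(sitemaps_host, sitemaps_path, filename = "sitemap.xml.gz")
  parts = [
    sitemaps_host.chomp("/"),                  # drop one trailing slash from the host
    sitemaps_path.gsub(%r{\A/+|/+\z}, ""),     # trim leading/trailing slashes
    filename
  ]
  parts.reject(&:empty?).join("/")
end

puts expected_sitemap_url(
  "http://s3.ap-south-1.amazonaws.com/example-bucket-name/",
  "sitemaps/"
)
```

With the values from the configuration on this page, the helper prints the same URL shown in the output above.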


Bucket configuration

Ensure the credentials you use have the following permissions on your S3 bucket; otherwise uploading the sitemap might fail with Aws::S3::Errors::AccessDenied.

  • s3:PutObject
  • s3:GetObject
  • s3:DeleteObject
  • s3:PutObjectAcl

The last one, s3:PutObjectAcl, is required to change the object's permissions to public read.
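
One way to grant these permissions is an IAM policy attached to the user or role whose keys the adapter uses. The sketch below generates such a policy document as JSON. It is an illustration under assumptions: sitemap_bucket_policy is a hypothetical helper, the bucket name is the example value used on this page, and the policy applies to every object in the bucket. Adjust the resource if your sitemaps live under a narrower prefix.

```ruby
require "json"

# Hypothetical helper: builds an IAM policy document that allows the four
# actions listed above on every object in the given bucket.
def sitemap_bucket_policy(bucket_name)
  {
    "Version" => "2012-10-17",
    "Statement" => [
      {
        "Effect" => "Allow",
        "Action" => %w[s3:PutObject s3:GetObject s3:DeleteObject s3:PutObjectAcl],
        # Object-level actions apply to objects, hence the trailing /*
        "Resource" => "arn:aws:s3:::#{bucket_name}/*"
      }
    ]
  }
end

# "example-bucket-name" is the example bucket used on this page
puts JSON.pretty_generate(sitemap_bucket_policy("example-bucket-name"))
```

The printed JSON can be pasted into the IAM console as an inline policy.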