Uploading the sitemap to S3 with paperclip, aws s3 and aws sdk

Anthony edited this page Mar 6, 2017 · 4 revisions

I needed to upload the sitemap files to S3 and notify the search engines of their location. I was already using paperclip, aws-s3 and aws-sdk, so installing carrierwave was not an attractive option, and I didn't want to alter the behavior of the gem. I ended up writing a rake task file that handles all of that. If you run into the same situation, hopefully you can pick up my code and perhaps improve it.

You can check the code here: https://gist.github.com/gists/1693860 (link is dead)

require 'aws'

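# Reopen Rake::Task so an already-defined task can be redefined.
class Rake::Task
  # Drops the task's existing actions and prerequisites, then installs
  # the given block as its only action.
  def replace &block
    @actions.clear
    prerequisites.clear
    enhance &block
  end
end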

namespace 'sitemap' do
  desc 'Upload the sitemap files to S3 (using your configuration in config/s3.yml)'
  task :upload_to_s3 => :environment do
    if File.exist?(File.join(Rails.root, "config", "s3.yml"))

      # Load credentials
      s3_options = YAML.load_file(File.join(Rails.root, "config", "s3.yml"))[Rails.env].symbolize_keys
      bucket_name = s3_options[:bucket]
      s3_options.delete(:bucket)

      # Establish the S3 connection and fetch (or create) the bucket once,
      # rather than once per file
      AWS.config(s3_options)
      bucket = AWS::S3.new.buckets.create(bucket_name)

      sitemaps_dir = File.join(Rails.root, "public", "system", "sitemaps")
      Dir.entries(sitemaps_dir).each do |file_name|
        next if ['.', '..'].include? file_name
        path = "sitemaps/#{file_name}"
        file = File.join(sitemaps_dir, file_name)

        # Upload the file; any S3 error propagates and aborts the task.
        # Note: objects are private by default, so the bucket policy (or an
        # :acl option) must make them readable for search engines.
        bucket.objects[path].write(:file => file)
        puts "Saved #{file_name} to S3"
      end
    end
  end
end

Rake::Task["sitemap:create"].enhance do
  Rake::Task["sitemap:upload_to_s3"].invoke
end

Rake::Task[:'sitemap:refresh'].replace do
  if File.exist?(File.join(Rails.root, "config", "s3.yml"))
    s3_options = YAML.load_file(File.join(Rails.root, "config", "s3.yml"))[Rails.env].symbolize_keys
    bucket_name = s3_options[:bucket]
    SitemapGenerator::Sitemap.ping_search_engines(:sitemap_index_url => "https://#{bucket_name}.s3.amazonaws.com/sitemaps/sitemap_index.xml.gz")
  else
    SitemapGenerator::Sitemap.ping_search_engines
  end
end
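
The tasks above read `config/s3.yml` but the file's layout is not shown. Here is a minimal sketch of how the configuration is split and how the pinged URL is built. The YAML key names (`access_key_id`, `secret_access_key`) and the bucket name `my-sitemaps` are assumptions for illustration, and the helpers `split_s3_config` and `sitemap_index_url` are hypothetical names that are not part of the original task. Plain Ruby stands in for Rails' `symbolize_keys` so the snippet runs without Rails.

```ruby
require 'yaml'

# Hypothetical contents of config/s3.yml. The key names are assumptions;
# everything except :bucket is handed to AWS.config as-is.
EXAMPLE_S3_YML = <<~YAML
  production:
    access_key_id: EXAMPLE_KEY_ID
    secret_access_key: EXAMPLE_SECRET
    bucket: my-sitemaps
YAML

# Returns [bucket_name, remaining_options] for the given environment,
# mirroring what the upload task does before calling AWS.config.
def split_s3_config(yaml_text, env)
  options = YAML.load(yaml_text).fetch(env)
  # Plain-Ruby equivalent of ActiveSupport's symbolize_keys
  options = options.each_with_object({}) { |(k, v), h| h[k.to_sym] = v }
  bucket_name = options.delete(:bucket)
  [bucket_name, options]
end

# Builds the URL that the replaced sitemap:refresh task pings.
def sitemap_index_url(bucket_name)
  "https://#{bucket_name}.s3.amazonaws.com/sitemaps/sitemap_index.xml.gz"
end

bucket_name, aws_options = split_s3_config(EXAMPLE_S3_YML, "production")
puts bucket_name
puts aws_options.keys.inspect
puts sitemap_index_url(bucket_name)
```

Keeping the bucket name out of the options hash matters because the remaining keys are passed to `AWS.config` unchanged.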

A continuation of this: http://status203.me/2015/04/11/rails-sitemap-heroku-aws/