Commit

🎁 Download Cloud Files later
This commit will bring in changes from `5.3.1-british_library` to move
the download of cloud files to a background job.
kirkkwang committed Feb 28, 2024
1 parent 7922cde commit ffb6f60
Showing 2 changed files with 17 additions and 4 deletions.
19 changes: 16 additions & 3 deletions app/jobs/bulkrax/download_cloud_file_job.rb
@@ -1,18 +1,31 @@
# frozen_string_literal: true

module Bulkrax
  class DownloadCloudFileJob < ApplicationJob
    queue_as Bulkrax.config.ingest_queue_name

    include ActionView::Helpers::NumberHelper

    # Retrieve cloud file and write to the imports directory
    # Note: if using the file system, the mounted directory in
    # browse_everything MUST be shared by web and worker servers
    def perform(file, target_file)
      retriever = BrowseEverything::Retriever.new
      last_logged_time = Time.zone.now
      log_interval = 3.seconds

      retriever.download(file, target_file) do |filename, retrieved, total|
        # The block is still useful for showing progress, but the
        # first argument is the filename instead of a chunk of data.
        percentage = (retrieved.to_f / total.to_f) * 100
        current_time = Time.zone.now

        if (current_time - last_logged_time) >= log_interval
          # Use number_to_human_size for formatting
          readable_retrieved = number_to_human_size(retrieved)
          readable_total = number_to_human_size(total)
          Rails.logger.info "Downloaded #{readable_retrieved} of #{readable_total}, #{filename}: #{percentage.round}% complete"
          last_logged_time = current_time
        end
      end
      Rails.logger.info "Download complete: #{file['url']} to #{target_file}"
    end
  end
end
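
The progress logging in this job boils down to a time-based throttle: emit a message only if at least `log_interval` has passed since the previous one. The sketch below isolates that idea in plain Ruby so it can be run without Rails. `ThrottledProgress`, its `clock:` parameter, and the message format are hypothetical names chosen for illustration and are not part of Bulkrax or BrowseEverything. It also assumes the reported total can be zero or unknown and skips the percentage in that case, because in Ruby rounding a NaN or infinite Float raises `FloatDomainError`.

```ruby
# Hypothetical helper, not part of Bulkrax: reports progress at most once
# per `interval` seconds and tolerates an unknown (nil or zero) total.
class ThrottledProgress
  def initialize(interval: 3,
                 clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) },
                 &sink)
    @interval = interval
    @clock = clock          # injectable so the throttle can be tested
    @sink = sink            # optional block that receives each message
    @last_reported_at = @clock.call
  end

  # Returns the emitted message, or nil when the call was throttled.
  def update(filename, retrieved, total)
    now = @clock.call
    return nil if (now - @last_reported_at) < @interval

    @last_reported_at = now
    message = "#{filename}: #{retrieved} of #{total.to_i} bytes" \
              "#{percent_suffix(retrieved, total)}"
    @sink&.call(message)
    message
  end

  private

  # Skips the percentage when the total is unknown, which avoids
  # calling #round on NaN or Infinity.
  def percent_suffix(retrieved, total)
    return '' if total.nil? || total.zero?

    ", #{(retrieved.to_f / total * 100).round}% complete"
  end
end
```

Inside a download block, one possible use would be to build the helper once with a sink such as `{ |msg| Rails.logger.info(msg) }` and call `update(filename, retrieved, total)` on every progress callback. A monotonic clock is used here instead of wall-clock time so that system clock adjustments do not affect the interval; that is a design choice of this sketch, not something the commit does.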
2 changes: 1 addition & 1 deletion app/parsers/bulkrax/csv_parser.rb
@@ -205,7 +205,7 @@ def retrieve_cloud_files(files, importer)
         target_files << target_file
         # Now because we want the files in place before the importer runs
         # Problematic for a large upload
-        Bulkrax::DownloadCloudFileJob.perform_now(file, target_file)
+        Bulkrax::DownloadCloudFileJob.perform_later(file, target_file)
       end
       importer[:parser_fields]['original_file_paths'] = target_files
       return nil