Add cursor_column method to order the cursor
`job-iteration` supports configuring which columns are used to order
the cursor. This commit provides a way to expose this ability in
the `maintenance-tasks` gem.
bbraschi committed Feb 8, 2024
1 parent ed84901 commit 9ff94ee
Showing 4 changed files with 82 additions and 1 deletion.
31 changes: 31 additions & 0 deletions README.md
@@ -413,6 +413,37 @@ to run. Since arguments are specified in the user interface via text area
inputs, it’s important to check that they conform to the format your Task
expects, and to sanitize any inputs if necessary.

### Custom cursor columns to improve performance

The [job-iteration gem](https://www.rubydoc.info/gems/job-iteration), on which this gem depends, adds an `order by`
clause to the relation returned by the `collection` method, in order to
iterate through records. By default, it orders by the `id` column.

It also supports configuring which columns are used to order the cursor,
as documented in [JobIteration::EnumeratorBuilder.build_active_record_enumerator_on_records](https://www.rubydoc.info/gems/job-iteration/JobIteration/EnumeratorBuilder#build_active_record_enumerator_on_records-instance_method).

The `maintenance-tasks` gem exposes this ability through the `cursor_columns` method of the `MaintenanceTasks::Task` class.
If `cursor_columns` returns `nil`, the query is ordered by the primary key.
If the values of the cursor columns change during an iteration, records may be skipped or yielded multiple times.

```ruby
module Maintenance
  class UpdatePostsTask < MaintenanceTasks::Task
    def cursor_columns
      [:created_at, :id]
    end

    def collection
      Post.where(created_at: 2.days.ago...1.hour.ago)
    end

    def process(post)
      post.update!(content: "updated content")
    end
  end
end
```
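
To make the ordering and the mutation warning above concrete, here is a minimal in-memory sketch of cursor-based iteration over `[:created_at, :id]`. It is an illustration only, not how `job-iteration` is implemented: `Row` and `rows_after` are hypothetical names, and integer timestamps stand in for real `created_at` values.

```ruby
# Hypothetical stand-in for an ActiveRecord row; not part of any gem.
Row = Struct.new(:id, :created_at, keyword_init: true)

# Returns rows sorted by `columns`, keeping only those strictly after `cursor`.
# `cursor` holds the column values of the last processed row, e.g. [10, 2].
def rows_after(rows, columns:, cursor: nil)
  key = ->(row) { columns.map { |column| row[column] } }
  sorted = rows.sort_by(&key)
  return sorted unless cursor

  # Arrays compare element by element, like a composite ORDER BY.
  sorted.select { |row| (key.call(row) <=> cursor) > 0 }
end

rows = [
  Row.new(id: 1, created_at: 10),
  Row.new(id: 2, created_at: 10),
  Row.new(id: 3, created_at: 20),
]
columns = [:created_at, :id]

# Process the first two rows, then remember where we stopped.
first_batch = rows_after(rows, columns: columns).first(2)
cursor = columns.map { |column| first_batch.last[column] }

# Resuming from the cursor picks up the unprocessed row.
remaining = rows_after(rows, columns: columns, cursor: cursor)

# Hazard: a cursor column changes while the job is interrupted.
# Row 3 now sorts before the cursor, so it is never yielded.
rows[2].created_at = 5
after_change = rows_after(rows, columns: columns, cursor: cursor)
```

In this sketch `remaining` contains only row 3, while `after_change` is empty because the update moved row 3 behind the cursor. The opposite can also happen: an already processed row whose `created_at` moves past the cursor would be yielded a second time. Including `:id` as the last column keeps the ordering unambiguous when several rows share the same `created_at`.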

### Using Task Callbacks

The Task provides callbacks that hook into its life cycle.
15 changes: 14 additions & 1 deletion app/models/maintenance_tasks/task.rb
@@ -230,6 +230,18 @@ def collection
      self.class.collection_builder_strategy.collection(self)
    end

    # The columns used to build the `ORDER BY` clause of the query for iteration.
    #
    # If cursor_columns returns nil, the query is ordered by the primary key.
    # If cursor columns values change during an iteration, records may be skipped or yielded multiple times.
    # More details in the documentation of JobIteration::EnumeratorBuilder.build_active_record_enumerator_on_records:
    # https://www.rubydoc.info/gems/job-iteration/JobIteration/EnumeratorBuilder#build_active_record_enumerator_on_records-instance_method
    #
    # @return the cursor_columns.
    def cursor_columns
      nil
    end

    # Placeholder method to raise in case a subclass fails to implement the
    # expected instance method.
    #
@@ -264,7 +276,7 @@ def enumerator_builder(cursor:)
      when :no_collection
        job_iteration_builder.build_once_enumerator(cursor: nil)
      when ActiveRecord::Relation
-       job_iteration_builder.active_record_on_records(collection, cursor: cursor)
+       job_iteration_builder.active_record_on_records(collection, cursor: cursor, columns: cursor_columns)
      when ActiveRecord::Batches::BatchEnumerator
        if collection.start || collection.finish
          raise ArgumentError, <<~MSG.squish
@@ -279,6 +291,7 @@ def enumerator_builder(cursor:)
          collection.relation,
          cursor: cursor,
          batch_size: collection.batch_size,
          columns: cursor_columns,
        )
      when Array
        job_iteration_builder.build_array_enumerator(collection, cursor: cursor&.to_i)
32 changes: 32 additions & 0 deletions test/jobs/maintenance_tasks/task_job_test.rb
@@ -609,6 +609,38 @@ class << self
      assert_equal 2, run.reload.tick_total
    end

    test "MaintenanceTasks::TaskJobConcern#build_enumerator provides cursor_columns as the column argument to active_record_on_records" do
      cursor_columns = [:created_at, :id]

      Maintenance::UpdatePostsTask.any_instance.stubs(cursor_columns: cursor_columns)

      run = Run.create!(task_name: "Maintenance::UpdatePostsTask")

      JobIteration::EnumeratorBuilder
        .any_instance
        .expects(:active_record_on_records)
        .with(anything, has_entry(columns: [:created_at, :id]))
        .returns(NullCollectionBuilder.new)

      TaskJob.perform_now(run)
    end

    test "MaintenanceTasks::TaskJobConcern#build_enumerator provides cursor_columns as the column argument to active_record_on_batch_relations" do
      cursor_columns = [:created_at, :id]

      Maintenance::UpdatePostsInBatchesTask.any_instance.stubs(cursor_columns: cursor_columns)

      run = Run.new(task_name: "Maintenance::UpdatePostsInBatchesTask")

      JobIteration::EnumeratorBuilder
        .any_instance
        .expects(:active_record_on_batch_relations)
        .with(anything, has_entry(columns: [:created_at, :id]))
        .returns(NullCollectionBuilder.new)

      TaskJob.perform_now(run)
    end

    test "array-based tasks have their count calculated implicitly" do
      Maintenance::TestTask.any_instance.expects(:process).once.with do
        @run.cancelling!
5 changes: 5 additions & 0 deletions test/models/maintenance_tasks/task_test.rb
@@ -109,5 +109,10 @@ class TaskTest < ActiveSupport::TestCase
    ensure
      Maintenance::TestTask.throttle_conditions = []
    end

    test ".cursor_columns returns nil" do
      task = Task.new
      assert_nil task.cursor_columns
    end
  end
end
