Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Locks not removed after worker finishes #758

Closed
slhck opened this issue Feb 24, 2023 · 6 comments
Closed

Locks not removed after worker finishes #758

slhck opened this issue Feb 24, 2023 · 6 comments

Comments

@slhck
Copy link
Contributor

slhck commented Feb 24, 2023

Describe the bug

Digests (locks) remain in redis even though workers have finished.

Expected behavior

I expect the locks to be gone so I can trigger a worker again.

Current behavior

I run some tests where the attribute finished_at should be calculated on a model. In the test this will be done once, initially, when the record is created, and later, when the record is updated and a method is called to re-calculate the attribute.

The first time, the worker runs and completes fine. However, after completing, the digests remain and I cannot call the re-calculation method again.

Worker class

class FinishedAtWorker
  include Sidekiq::Worker

  sidekiq_options lock: :until_and_while_executing,
                  lock_args_method: ->(args) { [args.first] },
                  on_conflict: :log
  sidekiq_options queue: :default, retry: 5

  def perform(id, force = false)
    logger.info("#{self.class} Starting for Measurement #{id}, force? #{force}")
    if force
      Measurement.find(id).calculate_finished_at!
    else
      Measurement.find(id).calculate_finished_at
    end
    logger.info("#{self.class} finished for Measurement #{id}")
  end
end

Config

SidekiqUniqueJobs.configure do |config|
  if Rails.env.test?
    config.logger = Rails.logger
    config.lock_info = true
  end
end

The Test

# the record is new
expect(Measurement.first.finished_at).to be(nil)

# the worker is scheduled
expect(FinishedAtWorker.jobs.size).to eq(1)

# let it run
FinishedAtWorker.drain

# 2023-02-24T14:14:51.838Z pid=6006 tid=13m INFO: FinishedAtWorker Starting for Measurement 1, force? false
# 2023-02-24T14:14:51.855Z pid=6006 tid=13m INFO: FinishedAtWorker finished for Measurement 1

# the worker did its job
expect(Measurement.first.finished_at).not_to be(nil)

# do something else to trigger the recalculation, e.g. manually, with the force parameter set to true
FinishedAtWorker.perform_in(1.minute, Measurement.first.id, true)

# Skipping job with id (fcf59d4b6a17d401c048ea84) because lock_digest: (uniquejobs:f887d9eb8648a8d9a8459667fc6858ce) already exists

expect(FinishedAtWorker.jobs.size).to eq(1)
# => this is 0, because the worker is not started

I see that in Redis, there are the locks:

uniquejobs:f887d9eb8648a8d9a8459667fc6858ce
uniquejobs:f887d9eb8648a8d9a8459667fc6858ce:INFO
uniquejobs:f887d9eb8648a8d9a8459667fc6858ce:LOCKED

And the info shows the correct args:

127.0.0.1:6379> GET uniquejobs:f887d9eb8648a8d9a8459667fc6858ce:INFO
""{\"worker\":\"FinishedAtWorker\",\"queue\":\"default\",\"limit\":null,\"timeout\":0,\"ttl\":null,\"type\":\"until_and_while_executing\",\"lock_args\":[1],\"time\":1677248404.6796517}"

But: When I manually run SidekiqUniqueJobs::Orphans::Reaper.call after the first .drain method of the worker, the test passes as expected.

I wonder: shouldn't the lock be removed when the job finishes?

Context

Ruby 3.1.4p223 with Gems:

rails (7.0.4.3)
sidekiq (7.0.8)
sidekiq-unique-jobs (8.0.2)
@asgeo1
Copy link

asgeo1 commented Mar 17, 2023

I had this exact same issue!

sidekiq 6.5.8
sidekiq-unique-jobs 7.1.29

I noticed this old issue that reported the same thing: #677

The reply was "I messed up client and server midlewares".

So I checked my own configuration, and low and behold, I had the exact same issue. The configuration for Sidekiq.configure_client and Sidekiq.configure_server I had backwards! i.e. what was supposed to go in one, I put in the other, and vice-versa.

Having it backwards like that, caused sidekiq-unique-jobs to behave exactly like you describe. I could start a job, but when it was finished the lock was left over, and nothing would remove it, preventing other instances of the job from starting.

Suggest you check your own configuration, since you haven't specified it above.

@slhck
Copy link
Contributor Author

slhck commented Mar 17, 2023

Thanks for the comment! This is my setup:

# frozen_string_literal: true

require 'sidekiq/api'

Sidekiq.configure_server do |config|
  config.redis = { url: ENV['REDIS_URL'] }
  config.logger = Logger.new(STDOUT)

  config.client_middleware do |chain|
    chain.add SidekiqUniqueJobs::Middleware::Client
  end

  config.server_middleware do |chain|
    chain.add SidekiqUniqueJobs::Middleware::Server
  end
  SidekiqUniqueJobs::Server.configure(config)
end

Sidekiq.configure_client do |config|
  config.redis = { url: ENV['REDIS_URL'] }

  config.client_middleware do |chain|
    chain.add SidekiqUniqueJobs::Middleware::Client
  end
end

# prevent dying jobs from littering uniqueness keys
Sidekiq.configure_server do |config|
  config.death_handlers << ->(job, _ex) do
    digest = job['lock_digest']
    SidekiqUniqueJobs::Digests.new.delete_by_digest(digest) if digest
  end
end

I don't think that this is backwards — at least I have the client middleware where it says client …

@slhck
Copy link
Contributor Author

slhck commented Apr 13, 2023

Still observing this, and it is causing major headaches. My worker is drained but I still see the digests:

(byebug) MeasurementIndexWorker.drain
nil
(byebug) SidekiqUniqueJobs::Digests.new.entries
{"uniquejobs:f887d9eb8648a8d9a8459667fc6858ce"=>"1681369908.863178", "uniquejobs:8a93ed8d8dbf79eb3d27170cd590551d"=>"1681369908.8655057"}
(byebug) Measurement.find(1).last_indexed_at
Thu, 13 Apr 2023 07:11:49.051181000 UTC +00:00 # => the worker ran before and finished
(byebug) MeasurementIndexWorker.perform_async(1)
nil

@mhenrixon
Copy link
Owner

Hi @slhck,

If you want the jobs to clear as they are performed, you'd have to add the following:

Sidekiq::Testing.server_middleware do |chain|
  chain.add SidekiqUniqueJobs::Middleware::Server
end

Reference: https://github.com/sidekiq/sidekiq/wiki/Testing#testing-server-middleware.

It is flimsy at best to test uniqueness and should be tested in the gem rather than in your code. In my projects, I usually turn off sidekiq unique jobs for the rails test environment.

@slhck
Copy link
Contributor Author

slhck commented Apr 13, 2023

Thanks! Adding this do sidekiq.rb made all the tests pass again 🎉

I am aware of the hesitancy to add explicit testing, and I've read the README section on it. Admittedly, I didn't fully grasp it. The background issue is that I need some business logic tested. It somewhat depends on jobs being unique, and a specific amount of workers being instantiated. For instance, I want to ensure that when data comes in with a particular sequence, the workers are launched correctly (or prevented from being launched again if not needed). It has been hard to find the right lock and conflict strategies for some workers without trial and error, and there have been issues in production — so we'd like to bake this into the test suite.

@slhck slhck closed this as completed Apr 13, 2023
@mhenrixon
Copy link
Owner

mhenrixon commented Apr 13, 2023

I am aware of the hesitancy to add explicit testing, and I've read the README section on it. Admittedly, I didn't fully grasp it. The background issue is that I need some business logic tested. It somewhat depends on jobs being unique, and a specific amount of workers being instantiated. For instance, I want to ensure that when data comes in with a particular sequence, the workers are launched correctly (or prevented from being launched again if not needed). It has been hard to find the right lock and conflict strategies for some workers without trial and error, and there have been issues in production — so we'd like to bake this into the test suite.

If it is essential to business, you should test that everything works correctly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants