Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jobs scheduled in the future are never run #93

Closed
amiel opened this issue Jul 23, 2015 · 15 comments
Closed

Jobs scheduled in the future are never run #93

amiel opened this issue Jul 23, 2015 · 15 comments

Comments

@amiel
Copy link

amiel commented Jul 23, 2015

Here's what I think is happening:

  1. When the job is scheduled (Worker.perform_in 1.minute, args), it runs through the clients client middleware, setting the unique lock.
  2. Once the scheduled time is up, it runs through the servers client middleware and considered not unique.

The servers client middleware was introduced in ea06539. Could it be safely removed? Or am I misunderstanding the issue?

@amiel amiel changed the title Jobs scheduled in the future is never run Jobs scheduled in the future are never run Jul 23, 2015
@amiel
Copy link
Author

amiel commented Jul 23, 2015

BTW, I'm happy to work on a pull-request for this, but wanted to start a discussion first to make sure I understand the context.

@mhenrixon
Copy link
Owner

Think @pik is onto something in regards to this problem. Keep an eye out for #96

@amiel
Copy link
Author

amiel commented Jul 27, 2015

@mhenrixon thanks for the update.

@mhenrixon
Copy link
Owner

Could you post a failing test for this @amiel ?

@amiel
Copy link
Author

amiel commented Jul 31, 2015

@mhenrixon I'll give it a shot. Do you have a suggestion as to where this test should live?

@mhenrixon
Copy link
Owner

Just add it to the issue description and tell me when its done. Oh and please base it on current master!

@amiel
Copy link
Author

amiel commented Jul 31, 2015

@mhenrixon ok, will do.

@amiel
Copy link
Author

amiel commented Jul 31, 2015

@mhenrixon I'm having a hard time replicating this issue in tests.

However, I can replicate it with the current master.

Here's what I've done to troubleshoot:

  1. I have a worker with the following options:

    sidekiq_options queue: :low, retry: 2, unique: true, log_duplicate_payload: true,
    unique_args: :unique_args, unique_job_expiration: 1.day, unique_unlock_order: :never

    Note: my unique_args just uses the first argument (it ignores the second argument).

  2. In addition to enabling log_duplicate_payload, I have added a debugging statement in MyWorker#perform.

  3. Everything seems to work as expected using perform_async:

    MyWorker.perform_async 1
    # => jid
    # Server log shows `MyWorker#perform` was called
    MyWorker.perform_async 1
    # => nil
    # Client log includes "payload not unique"
    MyWorker.perform_async 2
    # => jid
    # Server log shows `MyWorker#perform` was called
  4. However, using perform_in:

    MyWorker.perform_in 1, 3
    # => jid
    # A short while later, server log includes "payload not unique"
    MyWorker.perform_in 1, 3
    # => nil
    # Client log includes "payload not unique"
    MyWorker.perform_in 1, 4
    # => jid
    # A short while later, server log includes "payload not unique"
  5. If I comment out

    config.client_middleware do |chain|
    require 'sidekiq_unique_jobs/client/middleware'
    chain.add SidekiqUniqueJobs::Client::Middleware
    end
    , so it looks like this:

    require 'sidekiq'
    
    Sidekiq.configure_server do |config|
     config.server_middleware do |chain|
       require 'sidekiq_unique_jobs/server/middleware'
       chain.add SidekiqUniqueJobs::Server::Middleware
     end
     # config.client_middleware do |chain|
     #   require 'sidekiq_unique_jobs/client/middleware'
     #   chain.add SidekiqUniqueJobs::Client::Middleware
     # end
    end
    
    Sidekiq.configure_client do |config|
     config.client_middleware do |chain|
       require 'sidekiq_unique_jobs/client/middleware'
       chain.add SidekiqUniqueJobs::Client::Middleware
     end
    end

    Then, it works as I expect:

    MyWorker.perform_in 1, 3
    # => jid
    # A short while later, server log shows that `MyWorker#perform` was called
    MyWorker.perform_in 1, 3
    # => nil
    # Client log includes "payload not unique"
    MyWorker.perform_in 1, 4
    # => jid
    # A short while later, server log shows that `MyWorker#perform` was called

Here is the test I wrote, but I could not get it to fail:

require 'spec_helper'
require 'sidekiq/api'
require 'sidekiq/worker'
require 'sidekiq-unique-jobs'
require 'sidekiq/scheduled'

describe 'Unique scheduled jobs' do
  class TestWorker
    include Sidekiq::Worker

    sidekiq_options unique: true, queue: 'default',
      unique_unlock_order: :never, unique_unlock_expiration: 10

    def perform
    end
  end

  describe 'with real redis' do
    let(:scheduled) { Sidekiq::ScheduledSet.new }
    let(:enqueuer) { Sidekiq::Scheduled::Enq.new }
    let(:past) { (Time.now - 60).to_f }

    before do
      Sidekiq.redis = REDIS
      Sidekiq.redis(&:flushdb)
    end

    it 'enqueues a scheduled job' do
      scheduled.schedule past, 'class' => TestWorker.name, 'args' => []
      enqueuer.enqueue_jobs
      default_queue = Sidekiq::Queue.new('default')

      expect(default_queue.size).to eq 1
    end
  end

end

I did notice this spec. My assumption is that the tests aren't able to replicate the issue as they only use the client middleware stack and not the server middleware stack, but I don't understand what else could be contributing to this issue.

@mhenrixon Would it help if I could create an example application?

Thanks for taking the time to look in to this :)

@rloomba
Copy link

rloomba commented Jul 31, 2015

I can confirm that I observe this same behavior in our own application. Any workers that use perform_in do not work while workers that use perform_async do work.

@mhenrixon
Copy link
Owner

Example app works as good as a failing test @amiel

@amiel
Copy link
Author

amiel commented Aug 1, 2015

@mhenrixon Try this out: https://github.com/amiel/sidekiq-unique-scheduled-job-example

Let me know if you need any more details. Thanks again for looking in to it 😀

@frsantos
Copy link

frsantos commented Aug 4, 2015

This is related to #97. The problem is that in Sidekiq::Scheduled::Enq.enqueue_jobs the jobs are not unlocked before pushing them to the queue, and the client middleware sees them as duplicated.

@pik
Copy link
Contributor

pik commented Aug 9, 2015

@mhenrixon This is because middleware is invoked twice for a scheduled job right? Once to schedule and once to push it into it's queue once the at key has been deleted. https://github.com/mperham/sidekiq/blob/master/lib%2Fsidekiq%2Fclient.rb#L162

If the intent is to actually have one mutex for scheduled and queued it would be simple to check if the current_job is the owner of the mutex (compare JID) and let it run if it is (instead of using setnx). Although it may be quite sub-optimal to use the same mutex for scheduled and queued jobs to begin with. Thoughts?

@mhenrixon
Copy link
Owner

Check against master and report back if this is still an issue. The recent commits should have fixed this issue.

@amiel
Copy link
Author

amiel commented Sep 6, 2015

@mhenrixon Yay! I'm not currently developing any projects that use sidekiq-unique-jobs right now, but I'll report back when I do :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants