Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jobs queued during existing job inherit lock digest #766

Closed
JeremiahChurch opened this issue Mar 23, 2023 · 1 comment · Fixed by #774
Closed

Jobs queued during existing job inherit lock digest #766

JeremiahChurch opened this issue Mar 23, 2023 · 1 comment · Fixed by #774

Comments

@JeremiahChurch
Copy link
Contributor

JeremiahChurch commented Mar 23, 2023

Describe the bug
JobA calls JobB.perform_async
JobB gets a lock digest in the job hash - because the lock digest exists the middleware doesn't create a new lock digest,etc and things don't go well.

Expected behavior
unique lock digests per job/args combo

Current behavior
Jobs chain the same lock digest. I don't think this happens 100% of the time but it's frequent.

sidekiq-unique-jobs 8.0.2
sidekiq 7.0.7

couple of ways to show this off, below is with another middleware, monkey patching the middleware with a log before the lock digest check also shows the same

class LogClient
  include Sidekiq::ClientMiddleware

  def call(_worker, job, _queue, _redis_pool)
    Rails.logger.info "LogClient: #{job['lock_digest']} || #{job['jid']}" if job['lock_digest'].present?
    yield
  end
end

Sidekiq.configure_server do |config|
  config.queues = %w[list]
  config.concurrency = 20

  config.capsule('limited') do |cap|
    cap.concurrency = 8
    cap.queues = %w[special]
  end

  config.client_middleware do |chain|
    chain.add LogClient
    chain.add SidekiqUniqueJobs::Middleware::Client
  end

  config.server_middleware do |chain|
    chain.add SidekiqUniqueJobs::Middleware::Server
  end

  SidekiqUniqueJobs::Server.configure(config)
end

Sidekiq.configure_client do |config|
  config.client_middleware do |chain|
    chain.add LogClient
    chain.add SidekiqUniqueJobs::Middleware::Client
  end
end

class ItemCreateJob < ApplicationWorker
  sidekiq_options lock: :until_and_while_executing
  def perform
    UserItemCreateJob.perform_async 1
  end
end

class UserItemCreateJob< ApplicationWorker
  sidekiq_options lock: :until_and_while_executing
  def perform(user_id)
    logger.error user_id
  end
end

irb

ItemCreateJob.perform_async
=> "381d4388288ab468ba5a8e2a "

logs (issue doesn't occur for job that comes out of irb/outside of sidekiq - just on the sidekiq server side

2023-03-23T03:28:09.751Z pid=1608050 tid=10bee class=ItemCreateJob jid=381d4388288ab468ba5a8e2a INFO: start
2023-03-23T03:28:09.752Z pid=1608050 tid=10bee class=ItemCreateJob jid=381d4388288ab468ba5a8e2a INFO: [UNIQUE] before prepare item prod_uniq:22616be5af91c8e73845991af942d8f8

2023-03-23T03:29:43.843Z pid=1608050 tid=10bbq INFO: LogClient: prod_uniq:22616be5af91c8e73845991af942d8f8 || 0b258bc2c01990c7dc9ebec9

in prod we also see lock digests coming off of jobs to other jobs that don't have themselves have lock methods themselves (JobA has no lock but JobB does, I haven't nailed down where they are coming from but it's possible JobC calls JobA and JobC has a lock

raw prod debug logs with just the job names changed - note jid & job class change but lock_digest stays the same. WorkerA has no log args, WorkerB is a basic lock: :until_and_while_executing

Mar 22 20:02:37 ip-172-31-11-45 sidekiq:   2023-03-22T20:02:37.636Z pid=28526 tid=1p6y class=WorkerA   jid=b2a6c29d05fb39da586c4b6f uniquejobs=client   until_and_while_executing=prod_uniq:3a2c7382851c8df6787efd946aabda18 INFO:   [UNIQUE] soft Deleting lock for prod_uniq:3a2c7382851c8df6787efd946aabda18   1032cd10e20bcf5e814d0749
Mar 22 20:02:37 ip-172-31-11-45 sidekiq:   2023-03-22T20:02:37.708Z pid=28526 tid=1p6y class=WorkerA   jid=b2a6c29d05fb39da586c4b6f uniquejobs=client   until_and_while_executing=prod_uniq:3a2c7382851c8df6787efd946aabda18 INFO:   [UNIQUE] lock! prod_uniq:3a2c7382851c8df6787efd946aabda18   1032cd10e20bcf5e814d0749
Mar 22 20:02:37 ip-172-31-11-45 sidekiq:   2023-03-22T20:02:37.708Z pid=28526 tid=1p6y class=WorkerA   jid=b2a6c29d05fb39da586c4b6f uniquejobs=client   until_and_while_executing=prod_uniq:3a2c7382851c8df6787efd946aabda18 INFO:   [UNIQUE] locked? prod_uniq:3a2c7382851c8df6787efd946aabda18   1032cd10e20bcf5e814d0749
Mar 22 20:02:37 ip-172-31-11-45 sidekiq:   2023-03-22T20:02:37.708Z pid=28526 tid=1p6y class=WorkerA   jid=b2a6c29d05fb39da586c4b6f uniquejobs=client   until_and_while_executing=prod_uniq:3a2c7382851c8df6787efd946aabda18 INFO:   [UNIQUE] taken? prod_uniq:3a2c7382851c8df6787efd946aabda18   1032cd10e20bcf5e814d0749
Mar 22 20:02:38 ip-172-31-11-45 sidekiq:   2023-03-22T20:02:37.742Z pid=28526 tid=1p6y class=WorkerA   jid=b2a6c29d05fb39da586c4b6f uniquejobs=client   until_and_while_executing=prod_uniq:3a2c7382851c8df6787efd946aabda18 INFO:   [UNIQUE] enqueue prod_uniq:3a2c7382851c8df6787efd946aabda18   1032cd10e20bcf5e814d0749
Mar 22 20:02:38 ip-172-31-11-45 sidekiq:   2023-03-22T20:02:38.068Z pid=28526 tid=1p6y class=WorkerA   jid=b2a6c29d05fb39da586c4b6f uniquejobs=client   until_and_while_executing=prod_uniq:3a2c7382851c8df6787efd946aabda18 INFO:   [UNIQUE] write_lock_info prod_uniq:3a2c7382851c8df6787efd946aabda18   1032cd10e20bcf5e814d0749
Mar 22 20:02:38 ip-172-31-11-45 sidekiq:   2023-03-22T20:02:38.245Z pid=28526 tid=1p6y class=WorkerA   jid=b2a6c29d05fb39da586c4b6f uniquejobs=client   until_and_while_executing=prod_uniq:3a2c7382851c8df6787efd946aabda18 INFO:   [UNIQUE] primed_sync prod_uniq:3a2c7382851c8df6787efd946aabda18   1032cd10e20bcf5e814d0749
Mar 22 20:02:38 ip-172-31-11-45 sidekiq:   2023-03-22T20:02:38.266Z pid=28526 tid=1p6y class=WorkerA   jid=b2a6c29d05fb39da586c4b6f uniquejobs=client   until_and_while_executing=prod_uniq:3a2c7382851c8df6787efd946aabda18 INFO:   [UNIQUE] pop_queued prod_uniq:3a2c7382851c8df6787efd946aabda18   1032cd10e20bcf5e814d0749
Mar 22 20:02:38 ip-172-31-11-45 sidekiq:   2023-03-22T20:02:38.271Z pid=28526 tid=1p6y class=WorkerA   jid=b2a6c29d05fb39da586c4b6f uniquejobs=client   until_and_while_executing=prod_uniq:3a2c7382851c8df6787efd946aabda18 INFO:   [UNIQUE] rpoplpush prod_uniq:3a2c7382851c8df6787efd946aabda18   1032cd10e20bcf5e814d0749
Mar 22 20:02:39 ip-172-31-11-45 sidekiq:   2023-03-22T20:02:38.276Z pid=28526 tid=1p6y class=WorkerA   jid=b2a6c29d05fb39da586c4b6f uniquejobs=client   until_and_while_executing=prod_uniq:3a2c7382851c8df6787efd946aabda18 INFO:   [UNIQUE] handle_primed prod_uniq:3a2c7382851c8df6787efd946aabda18   1032cd10e20bcf5e814d0749
Mar 22 20:02:39 ip-172-31-11-45 sidekiq:   2023-03-22T20:02:39.068Z pid=28526 tid=1p6y class=WorkerA   jid=b2a6c29d05fb39da586c4b6f uniquejobs=client   until_and_while_executing=prod_uniq:3a2c7382851c8df6787efd946aabda18 INFO:   UNIQUE_LOG_WATCH locked for WorkerB args [1493] lock   prod_uniq:3a2c7382851c8df6787efd946aabda18 jid 1032cd10e20bcf5e814d0749
Mar 22 20:09:07 ip-172-31-4-102 sidekiq:   2023-03-22T20:09:07.949Z pid=12932 tid=1z60 class=WorkerB   jid=1032cd10e20bcf5e814d0749 INFO: start
Mar 22 20:09:07 ip-172-31-4-102 sidekiq:   2023-03-22T20:09:07.966Z pid=12932 tid=1z60 class=WorkerB   jid=1032cd10e20bcf5e814d0749 uniquejobs=server   until_and_while_executing=prod_uniq:3a2c7382851c8df6787efd946aabda18 INFO:   [UNIQUE] unlock prod_uniq:3a2c7382851c8df6787efd946aabda18   1032cd10e20bcf5e814d0749
Mar 22 20:09:07 ip-172-31-4-102 sidekiq:   2023-03-22T20:09:07.966Z pid=12932 tid=1z60 class=WorkerB   jid=1032cd10e20bcf5e814d0749 uniquejobs=server   until_and_while_executing=prod_uniq:3a2c7382851c8df6787efd946aabda18 INFO:   [UNIQUE] locked? prod_uniq:3a2c7382851c8df6787efd946aabda18   1032cd10e20bcf5e814d0749
Mar 22 20:09:07 ip-172-31-4-102 sidekiq:   2023-03-22T20:09:07.966Z pid=12932 tid=1z60 class=WorkerB   jid=1032cd10e20bcf5e814d0749 uniquejobs=server   until_and_while_executing=prod_uniq:3a2c7382851c8df6787efd946aabda18 INFO:   [UNIQUE] taken? prod_uniq:3a2c7382851c8df6787efd946aabda18   1032cd10e20bcf5e814d0749
Mar 22 20:09:07 ip-172-31-4-102 sidekiq:   2023-03-22T20:09:07.971Z pid=12932 tid=1z60 class=WorkerB   jid=1032cd10e20bcf5e814d0749 uniquejobs=server   until_and_while_executing=prod_uniq:3a2c7382851c8df6787efd946aabda18 INFO:   UNIQUE_LOG_WATCH unlock_failed for WorkerB args [1493] lock   prod_uniq:3a2c7382851c8df6787efd946aabda18 jid 1032cd10e20bcf5e814d0749
Mar 22 20:09:07 ip-172-31-4-102 sidekiq:   2023-03-22T20:09:07.973Z pid=12932 tid=1z60 class=WorkerB   jid=1032cd10e20bcf5e814d0749 elapsed=0.025 INFO: done

any ideas or pointers on where to look would be much appreciated

@mhenrixon
Copy link
Owner

mhenrixon commented Apr 13, 2023

Check #774 if that helps you at all @JeremiahChurch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants