
[Q&A] Performance & Dead Locks #652

Closed
OskarEichler opened this issue Nov 11, 2021 · 4 comments


OskarEichler commented Nov 11, 2021

Hey guys,

First up, great work on this gem! We are very interested in deploying it on our production setup, but we have two concerns/questions we would like to clarify first:

  1. Performance

We are processing around 10 million jobs per day and use push_bulk heavily to keep the load on Redis as low as possible. Do you have any performance benchmarks for how much extra time and how many extra resources the gem needs to process each job?

We have queues with over 3 million scheduled items at a time and need to make sure everything runs smoothly. Also, how much additional storage does the gem use on average? We are mainly concerned about Redis performance.

  2. Dead Locks

As far as we understand, 'until_executed' requires jobs to actually be performed in order for the lock to be lifted. What happens when queues are manually cleared out or deleted? Will those locks be stuck forever, or is there a way to synchronize them with the enqueued jobs on a regular basis? Or, even better, to remove the lock the moment a job is deleted from the queue?

I guess the only other workaround would be to set an additional lock_timeout, but setting this to 7 days would still result in quite a bit of daily data loss.
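
For reference, this is roughly the worker setup we have in mind (MyJob is a placeholder, and if I read the docs right the expiration option is lock_ttl in v7, previously lock_expiration):

class MyJob
  include Sidekiq::Worker

  sidekiq_options queue: :default,
                  lock: :until_executed,
                  # Fallback expiration so a lock orphaned by a manual
                  # queue clear is eventually released; 7 days here,
                  # which is exactly the trade-off mentioned above.
                  lock_ttl: 7 * 24 * 60 * 60

  def perform(record_id)
    # ... actual work ...
  end
end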

Please let us know your thoughts!

All the best,
Oskar

@mhenrixon (Owner)

  1. I don't have any performance metrics yet, but I have been meaning to gather some. There are hooks in place that can be used to fiddle around with this, but I haven't done anything with them yet.
  2. It depends on how you clear your queues. If you use the sidekiq/api there are hooks in place that also clear the locks, for example:
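
A rough sketch of what I mean (queue name and job class are placeholders):

require "sidekiq/api"

# Deleting jobs through the Sidekiq API fires the hooks that
# sidekiq-unique-jobs registers, so the matching locks are released.
# Flushing Redis or deleting queue keys by hand bypasses those hooks.
Sidekiq::Queue.new("default").clear

# The same applies to deleting individual jobs:
Sidekiq::Queue.new("default").each do |job|
  job.delete if job.klass == "MyJob" # MyJob is a placeholder
end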

Because the gem uses Lua scripts, I have no idea how it will perform in such a setup, but I believe there be dragons. Knapsack PRO uses this gem under the hood, but I don't know how hard their workers are utilized every day. Perhaps @ArturT can offer better guidance on the performance than I can?

@OskarEichler, I'll see if I can get around to gathering some metrics.


OskarEichler (Author) commented Nov 11, 2021

Amazing, thank you @mhenrixon.

We need to be cautious about deploying this on a high-load system, as we want to prevent it from completely blowing up our servers. Making so many extra calls to Redis for every scheduled job could definitely cause issues, so it would be great if you could share your metrics and experience here.

For push_bulk, is there some sort of bulk-check method you use to check, let's say, 1,000 jobs at once and clear them out with single operations, or are you looping through them one by one? Please let me know!
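
For reference, our bulk pushes look roughly like this (MyJob and the args are made up):

# Hypothetical illustration of our bulk enqueueing. One round trip
# pushes all jobs; with unique jobs enabled, does each item still
# trigger its own lock check against Redis?
items = (1..1_000).map { |id| [id] }

Sidekiq::Client.push_bulk(
  "class" => MyJob,
  "args"  => items
)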

All the best,
Oskar


ArturT (Contributor) commented Nov 12, 2021

Hi all

I can share some info about the sidekiq-unique-jobs gem that we use at https://knapsackpro.com/.

We process a few million jobs per day without problems.

In my experience, the impact of sidekiq-unique-jobs on Redis performance is negligible. If you are going to worry about something, it should be the number of keys (locks) stored in Redis and the memory consumption.
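
If you want to keep an eye on that, something simple like this is enough (plain Redis commands, nothing specific to sidekiq-unique-jobs):

# Rough check of key count and memory on the Sidekiq Redis instance.
Sidekiq.redis do |conn|
  puts "keys:   #{conn.dbsize}"
  puts "memory: #{conn.info('memory')['used_memory_human']}"
end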

The sidekiq-unique-jobs gem already cleans up its locks and Redis memory after itself, so there is nothing to worry about. I once had a situation where, after a sidekiq-unique-jobs upgrade, the Redis memory was not cleaned up and kept growing and growing, but that bug has since been fixed in the gem.

Regarding Ruby vs. Lua: we use config.reaper = :ruby. I do not recommend using :lua; it killed our Redis server instance immediately.

Here is my config:

SidekiqUniqueJobs.configure do |config|
  config.debug_lua       = false # true for debugging
  config.lock_info       = false # true for debugging
  config.max_history     = 1000  # keeps n number of changelog entries
  # WARNING: never use :lua because it leads to errors and failures in our API
  # Redis::CommandError: BUSY Redis is busy running a script. You can only call SCRIPT KILL or SHUTDOWN NOSAVE.
  config.reaper          = :ruby # also :lua but that will lock while cleaning
  config.reaper_count    = 1000  # Reap maximum this many orphaned locks
  # Do not use a reaper_interval value that is too low.
  # The reaper needs enough time to scan all Redis keys, which is slow for large data sets.
  # https://github.com/mhenrixon/sidekiq-unique-jobs/issues/571#issuecomment-777053417
  config.reaper_interval = 600  # Reap every X seconds
  # reaper_timeout should be close to reaper_interval value to avoid leaking threads
  # https://github.com/mhenrixon/sidekiq-unique-jobs/issues/571#issuecomment-777003013
  config.reaper_timeout  = 595  # Give the reaper X seconds to finish
end

The biggest risk I see in using the sidekiq-unique-jobs gem is upgrades: be very careful with them. Several times we had situations where jobs were skipped because a new sidekiq-unique-jobs version introduced regression bugs.
Our test suite can't detect this easily, so manual testing is needed to verify that nothing is wrong after each sidekiq-unique-jobs version upgrade.
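
A sketch of the kind of manual check I mean (MyUniqueJob is a placeholder with lock: :until_executed, and it assumes a real Redis connection since the locks live there):

require "sidekiq/api"

# Enqueue the same unique job twice; the second push should be
# rejected by the lock, so perform_async returns nil for it.
first  = MyUniqueJob.perform_async(123)
second = MyUniqueJob.perform_async(123)

puts "first jid:  #{first.inspect}"  # a JID string
puts "second jid: #{second.inspect}" # nil when the duplicate is blocked

# Only one copy should land in the queue.
puts "queue size: #{Sidekiq::Queue.new('default').size}" # expect 1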

Other than that, the sidekiq-unique-jobs gem is super helpful; we often use it to skip a large number of jobs to reduce the Sidekiq/ActiveRecord impact on the DB and on worker CPU consumption.

I hope this helps. :)

@mhenrixon (Owner)

Thanks @ArturT! I'm making a note of the upgrade thing ❤️

Repository owner locked and limited conversation to collaborators Nov 15, 2021

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →
