Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

4.* is hurting background job workers performance #127

Closed
PikachuEXE opened this issue Oct 13, 2015 · 20 comments
Closed

4.* is hurting background job workers performance #127

PikachuEXE opened this issue Oct 13, 2015 · 20 comments

Comments

@PikachuEXE
Copy link

Background

We upgrade the gem to 4.* this week
And we use New Relic to monitor performance

Issue

We are seeing jobs being run 6~7 minutes for all small jobs
But by tailing the live log (STDOUT) of workers, they actually take much less time (< 1 min)
So we look at New Relic to try to find out what's going on, and we found the following data:

Before upgrade

screen shot 2015-10-13 at 1 20 23 pm
screen shot 2015-10-13 at 1 20 47 pm

After upgrade

screen shot 2015-10-13 at 1 21 29 pm
screen shot 2015-10-13 at 1 21 36 pm

@mhenrixon
Copy link
Owner

Doesn't really say much though. You'd have to share your job configuration as well. Only thing I can think of is if all your jobs are 'unique: :while_executing'. That is a lock that uses a mutex and will keep trying. The rest of the code should be faster since it is doing fewer redis requests. (1 request for lock, 1 for unlock instead of multiple for lock and a gazillion for unlock). I chose the default lock type to be while executing to get the most feedback. Maybe that was a bad choice.

Can you share your sidekick config, how you run your no config eg Foreman and some of the workers in question so that I can create a rails app trying to replicate the problem?

Mikael Henriksson
Sent with Airmail

@PikachuEXE
Copy link
Author

I remove all config about unique jobs from all my workers.
So that would make all my workers unique by default? (and using while_executing)

My initializer for sidekiq:

require 'try_to'

# To config Redis server, please use REDIS_PROVIDER or REDIS_URL
# https://github.com/mperham/sidekiq/wiki/Advanced-Options#setting-the-location-of-your-redis-server

# Assign new concurrency from ENV if it equals to the default (25)
# @see https://github.com/mperham/sidekiq/wiki/Advanced-Options
DEFAULT_CONCURRENCY = 25
is_default_concurrency = (Sidekiq.options[:concurrency] == DEFAULT_CONCURRENCY)
sidekiq_concurrency_number = is_default_concurrency ? ENV![:SIDEKIQ_CONCURRENCY] : Sidekiq.options[:concurrency]

Sidekiq.configure_server do |config|
  config.redis = {
    namespace: "#{Rails.application.class.parent_name.underscore}:#{Rails.env.to_s.underscore}:sidekiq",
    size: sidekiq_concurrency_number + 1,
    url: Rails.configuration.redis_url,
  }

  # Change DB pool size for ActiveRecord
  # Replace with MongoDB or whatever
  if defined?(ActiveRecord::Base)
    # Note we use `ActiveRecord::Base.connection_pool`
    # Not `ActiveRecord::Base.connection`
    # Not sure why
    ActiveRecord::Base.connection_pool.disconnect!
    # Could be nil in 4.1
    config = ActiveRecord::Base.configurations[Rails.env] || {}
    config['reaping_frequency']   = ENV![:DATABASE_REAPING_FREQUENCY] # seconds
    config['pool']                = sidekiq_concurrency_number + 2 # Give it some buffer
    config['prepared_statements'] = ENV![:DATABASE_USE_PREPARED_STATEMENTS]
    ActiveRecord::Base.establish_connection(config)
    Rails.logger.info('Connected to ActiveRecord in Sidekiq')
  end
end

Sidekiq.configure_client do |config|
  config.redis = {
    namespace: "#{Rails.application.class.parent_name.underscore}:#{Rails.env.to_s.underscore}:sidekiq",
    size: 1, # Client only needs one connection
    url: Rails.configuration.redis_url,
  }
end

Sidekiq.options[:concurrency] = sidekiq_concurrency_number
# Sidekiq got a bug for this default option
# Some gems are depending on the default queue value to override the queue name
# So we don't set it here
# Default queue name is `default` in `String`
Sidekiq.default_worker_options = { retry: false, backtrace: true }.stringify_keys

My sidekiq config files (yes I got multiple files) are only for storing queue names for a worker to process.

Please tell me what else do you need.

@wpolicarpo
Copy link

@PikachuEXE did you notice any performance drop for jobs that were not using unique middleware?

@mhenrixon
Copy link
Owner

@PikachuEXE yes that would make the jobs unique: :while_executing which would explain your problem. I'll investigate this and see what can be done in terms of performance but I think it is a sensible default but maybe it should be some sensible way of breaking out of the loop.

In terms of performance it looks like it is extremely high considering your number of calls to evalsha is 298 000 instead of the around 4 000 (to get/set/multi and friends) that was before the upgrade so I am in fact pretty happy with that. It is unfortunate with your situation though that the locking is done.

There are a few ways forward on this as I see it.

  1. Some time of retry configuration (yikes configuration 😢) for the worker
  2. Something like sliding expiration for retrying to acquire the lock. Think how login methods should to be really secure add some additional wait between retries and then lock out the user to be really secure.
  3. Putting the job back (yikes what if the jobs sort of needs to be processed in order)

Out of those options I only really view number 3 as a viable solution for your case since with any combination of the other two would cause the jobs to appear to complete slow when in fact it is just waiting to be able to lock the job. Thoughts?

@PikachuEXE
Copy link
Author

I now removed the gem from production (and only load it in test for matchers)
The "graph" is normal again
I never expect the gem would set such "aggressive" option by default.
I think a gem should make users opt-in instead of opt-out.

@PikachuEXE
Copy link
Author

Before putting this gem back on production, I want to know:

  • How to opt-out by default (I can't see how to opt-out in README)
  • The possible performance impact of each option (how often the custom scripts are called)

@mhenrixon
Copy link
Owner

@PikachuEXE I don't understand what you mean? If you don't configure unique: :while_executing the gem won't be used.

@PikachuEXE
Copy link
Author

I have SidekiqUniqueJobs.config.unique_args_enabled = true in initializer
Other then that no other default worker config about uniqueness is supplied (I intentionally removed them after upgrade)

@mhenrixon
Copy link
Owner

@PikachuEXE I really don't understand what is going on then because unique_args_enabled was deprecated in 4.0.4. What does your worker really look like? You still haven't posted that code.

@mhenrixon
Copy link
Owner

oh darn you are right, I have accidentally removed the check if unique is enabled in the server middleware. Will have a fix in a few minutes!

@mhenrixon
Copy link
Owner

@PikachuEXE v4.0.6 should fix your issue

@mhenrixon mhenrixon reopened this Oct 14, 2015
@mhenrixon
Copy link
Owner

Sorry partly fix your issue! :) There is one more thing to do!

@wpolicarpo
Copy link

@mhenrixon maybe you should yank all 4.0.x versions affected by this bug from rubygems. What do you think?

I had problems with 4.0.2 too. That's why I asked @PikachuEXE about having problems with non-unique jobs.

@mhenrixon
Copy link
Owner

So this should be closed now, I'll see about adding some type of running application eventually. Both for tracking performance and for spotting eager loading issues (missing constants).

@mhenrixon
Copy link
Owner

@PikachuEXE would it be possible to get verified that it is working better for you now. v4.0.7 is the one.

@Bertg
Copy link

Bertg commented Oct 14, 2015

Any ETA on the arrival of v4.0.7 ?

@PikachuEXE
Copy link
Author

I can't verify anything until 4.0.7 is released :o

@mhenrixon
Copy link
Owner

😢 that was lame! I forgot to actually release the gem... @Bertg @PikachuEXE it has been released now.

@PikachuEXE
Copy link
Author

Will test it soon

@PikachuEXE
Copy link
Author

Performance seems fine when requiring the gem without any extra config.
I don't have plan to test the uniqueness options yet (the project might not even need it)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants