Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ConnectionPool::TimeoutError and :until_executed #535

Closed
ArturT opened this issue Oct 5, 2020 · 3 comments
Closed

ConnectionPool::TimeoutError and :until_executed #535

ArturT opened this issue Oct 5, 2020 · 3 comments

Comments

@ArturT
Copy link
Contributor

ArturT commented Oct 5, 2020

Describe the bug
I see from time to time ConnectionPool::TimeoutError error because can't connect to Redis Cloud Heroku addon within 5 seconds. I've checked Redis Cloud latency metrics and it's most of the time < 0.07ms. I don't see peaks to 5s.

ConnectionPool::TimeoutError is raised only when I schedule async worker in the controller. I suspect this might be root problem method.

MyWorker.perform_async(args)

My controller gets at peak time ~80 requests per second.

I suspect when MyWorker.perform_async(args) called then maybe some heave operation is done on Redis and makes it timeout. Or maybe it's somehow related to lock?

From a business perspective, it would be fine to schedule a job twice if that would help avoid timeouts. I use :until_executed to ensure sidekiq is not flooded with a lot of jobs that would be enqueued at speed ~80 per second.

Expected behavior
I expect ConnectionPool::TimeoutError not happening.

Current behavior
I see ConnectionPool::TimeoutError errors every few days.

Worker class

class MyWorker
  include Sidekiq::Worker
  
  sidekiq_options queue: :low,
    # https://github.com/mhenrixon/sidekiq-unique-jobs#until-executed
    lock: :until_executed,
    # search in logs SidekiqUniqueJobs keyword to find duplicates
    log_duplicate_payload: true
 

  def perform(args); end
end

Additional context

# config/initializers/sidekiq.rb
sidekiq_redis_config = {
  url: Secret.sidekiq_redis_url,
  namespace: REDIS_SIDEKIQ_NAMESPACE,
  # tune network timeouts to be a little more lenient when you are seeing occasional timeout errors for Heroku Redis Cloud addon
  # https://github.com/mperham/sidekiq/wiki/Using-Redis#life-in-the-cloud
  network_timeout: 5 # <--- here is timeout limit set to 5 seconds. The default recommended by sidekiq is 1s. So I already use high limit.
}

Sidekiq.configure_server do |config|
  # A Sidekiq server process requires at least (concurrency + 5) connections
  # https://github.com/mperham/sidekiq/wiki/Using-Redis#complete-control
  config.redis = sidekiq_redis_config

  config.server_middleware do |chain|
    chain.add Sidekiq::Status::ServerMiddleware, expiration: 30.minutes # default
  end

  config.client_middleware do |chain|
    chain.add Sidekiq::Status::ClientMiddleware
  end
end
@mhenrixon
Copy link
Owner

Hi @ArturT, thanks for the report. If I am not mistaken, wouldn't this also be the same error as when the connection pool can't retrieve a connection in time? I don't think it is the actual connection to redis but rather the pool waiting to receive a slot to use for connecting.

What's your database pool size and how many connections does your redis instance support?

Again, if I recall correctly, if you truly want to use the connection pool it isn't enough to rely on anything automatic because in a rails app I think the database connection pool size or some default like that is used.

I remember having to fiddle with this in the past.

Could you just get back with any pool and time-out related settings for database, rack, puma etc so that we can get to the bottom of it?

@ArturT
Copy link
Contributor Author

ArturT commented Oct 5, 2020

Hi @mhenrixon

If I am not mistaken, wouldn't this also be the same error as when the connection pool can't retrieve a connection in time? I don't think it is the actual connection to redis but rather the pool waiting to receive a slot to use for connecting.

I guess you are right.

What's your database pool size and how many connections does your redis instance support?

I use Redis Cloud Heroku addon 100MB and 256 connections.

My app uses ~67 connections to Redis.

I use puma 1 thread (my app is not thread safe). 20 puma processes per dyno. I auto scale 1-3 web dynos hence you see >= 60 connections to Redis. 7 connections come from sidekiq dyno worker.

My current config looks like in this PR description

redis_connection = Redis.new(
  url: Secret.sidekiq_redis_url,
  # tune network timeouts to be a little more lenient when you are seeing occasional timeout errors for Heroku Redis Cloud addon
  # https://github.com/mperham/sidekiq/wiki/Using-Redis#life-in-the-cloud
  timeout: 5
)

REDIS_SIDEKIQ_NAMESPACE = "#{Secret.app_namespace}:#{Rails.env}:sidekiq"
REDIS_SIDEKIQ = Redis::Namespace.new(REDIS_SIDEKIQ_NAMESPACE, redis: redis_connection)


sidekiq_redis_config = {
  url: Secret.sidekiq_redis_url,
  namespace: REDIS_SIDEKIQ_NAMESPACE,
  # tune network timeouts to be a little more lenient when you are seeing occasional timeout errors for Heroku Redis Cloud addon
  # https://github.com/mperham/sidekiq/wiki/Using-Redis#life-in-the-cloud
  network_timeout: 5 # <--- here is timeout limit set to 5 seconds. The default recommended by sidekiq is 1s. So I already use high limit.
}

I found here https://github.com/mperham/sidekiq/wiki/Using-Redis#complete-control that you could define size on your own:

Connections are created on-demand so it's ok to specify a larger size (e.g. 20-30) if you aren't sure.

Sidekiq.configure_client do |config|
  config.redis = ConnectionPool.new(size: 5, &redis_conn)
end

But I don't use explicitly ConnectionPool in my config. I guess sidekiq does use it underhood. I see connection_pool gem is a dependency for sidekiq.

I found here https://github.com/mperham/sidekiq/wiki/Advanced-Options#connection-pooling
With a connection pool, you can share a limited number of I/O connections among a larger number of threads.

I use only 1 thread per puma so I guess I don't need a pool. Maybe the pool is needed when the connection is stuck somehow.

I've checked Redis Cloud slowlog and I see only a few slow commands in last week related to sidekiq-unique-jobs that took 10-15ms. But this is not even close to 5s.

Do you use ConnectionPool in your project?

@ArturT
Copy link
Contributor Author

ArturT commented Oct 5, 2020

I found out this: https://github.com/mperham/sidekiq/blob/d081a8a66a66d917727e3acb36d3016e940ebb83/lib/sidekiq/redis_connection.rb#L26

I have RAILS_MAX_THREADS=1 on Heroku. So it means for puma workers the ConnectionPool has pool 1. When the connection is stuck then there are no other connections in the pool.

It looks like sidekiq uses 1 connection per thread so in theory I have fine config but maybe it's better to allow more connections in the pool as they are lazy-loaded so when 1 connection is hanging then a new one could be opened.

I'm closing this issue for now. I found that I have another place in code where I have explicitly configured size to 1. I will increase it to see if this will help.

redis_conn = proc { REDIS_SIDEKIQ }

Sidekiq.configure_client do |config|
  threads_count = ENV.fetch('RAILS_MAX_THREADS') { 1 } # <--- here I could use bigger size than 1
  config.redis = ConnectionPool.new(size: threads_count, &redis_conn)

  config.client_middleware do |chain|
    chain.add Sidekiq::Status::ClientMiddleware
  end
end

@mhenrixon Thanks for your help and pointing out size problem.

@ArturT ArturT closed this as completed Oct 5, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants