-
-
Notifications
You must be signed in to change notification settings - Fork 277
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Expiration for all locks #393
Comments
The problems described is due to bugs in the scripts. It will not be a problem in the near future as there will be a cleanup running on I'll hold off on your suggestion until that work is complete but I will leave this issue open until we see if the problem is solved in a better way or not. |
This should be fixed now that #387 got merged. No release yet, still have a few things to work through first. |
@mhenrixon this has been happening a lot to us on Production recently and is breaking our crons (which use https://github.com/ondrejbartas/sidekiq-cron). It will be nice to disregard the lock after a specific amount of time and make it configurable. What's the recommended way to achieve this, instead of manually deleting the locks? If there isn't a way, I'll have to write a cron that looks at redis for the keys and clears out old lock keys. This has been giving us too much pain recently. |
@amitsaxena though it might not be a global setting |
Ahh! That seems to be missing from So I am expecting if there is a lock in redis for more than 1 minute and I schedule a new job, it should get added to sidekiq. It doesn't seem to happen and an exception is raised instead. I tried with the job and then deleted the job directly from redis so that the lock still stays (and is orphaned), but still the new job doesn't get added. We are using |
I'm a little confused, can you clarify how to use |
@aharpervc I couldn't find anything which works (in v6), so I had to write a script which looks for redis keys (related to locks) and deletes them. |
Sounds useful, are you able to share it? |
Something on the lines of: class CleanStaleLocks
include Sidekiq::Worker
def perform
SidekiqUniqueJobs::Digests.all.each do |digest|
delete(digest) if stale?(digest)
end
end
private
def delete(digest)
$redis.del("#{digest}:EXISTS")
$redis.hgetall("#{digest}:GRABBED").keys.each do |k|
$redis.hdel("#{digest}:GRABBED", k)
end
# Check for the correct key name in redis and use below
$redis.srem("project_name:sidekiq:unique:keys", digest)
end
def stale?(digest)
jid = $redis.get("#{digest}:EXISTS")
return false if Sidekiq::Queue.new("queue_name_with_unique_jobs").find_job(jid)
grabbed = $redis.hgetall("#{digest}:GRABBED")
raise "Expected redis key not found" if grabbed.nil?
locked_at = Time.at(grabbed[jid].to_f)
# assuming lock is stale if it is older than 1 hour
(Time.now - locked_at) > 1.hour.to_i
end
end You will need to tweak the code depending on what queues could have unique jobs in your case. I'd then add it as a cron job running every hour (or more frequently). |
Is your feature request related to a problem? Please describe.
I find it confusing that lock expiration doesn't allow me to expire locks after some long period of time (for instance, 2 days) when using strategies other than
until_expired
. I'm worried that if something goes wrong and a lock isn't released (or a job is suddenly killed mid-execution and it has a runtime lock), those locks will linger forever and the same job can never again be enqueued.Based on past issues it looks like this has happened before (for example: #277), so I think this concern is well-founded.
Describe the solution you'd like
I'd like
lock_expiration
to simply be the Redis TTL that all locks use, without any fanciness. (For backwards compatibility, this can also be called something else if that's clearer.) Honestly I think it even makes sense to have this behavior enabled by default for all jobs, since a lock that's been present for many days is probably the sign of a bug and should be released.Describe alternatives you've considered
We're using Sidekiq Pro's
super_fetch
reliability feature which addresses the same issue from a different angle, but I think it's still valid to want locks to disappear after some long amount of time.The text was updated successfully, but these errors were encountered: