Fix scheduled jobs breaking with new_unique_for method #101
* set_nx is not used except for scheduling a job
* a job is allowed to be queued if a job with the same payload_hash is already scheduled, but not vice versa
* fixes jobs scheduled with perform_in / perform_at not being run
* use a Lua script in new_unique_for
* no idea why it started now
- pid = conn.get(payload_hash).to_i
- if pid == 1 || (pid == 2 && item['at'])
+ pid = conn.get(payload_hash)
+ if pid && (pid != 'scheduled' || item['at'])
I'm sorry but if != || == WAT?!
The code was a mess already. What I'd like to see is something more along the lines of:
    def old_unique_for?
      connection do |conn|
        conn.watch(payload_hash)
        pid = conn.get(payload_hash)
        return create_lock(conn) unless pid == 'scheduled' || item['at']
        conn.unwatch
      end
    end

    def create_lock(conn)
      conn.multi do
        return clear_expired_lock(conn) unless expires_at > 0
        # Scheduled jobs store the 'scheduled' marker; enqueued jobs store their jid.
        conn.setex(payload_hash, expires_at, item['at'] ? 'scheduled' : item['jid'])
      end
    end

    def clear_expired_lock(conn)
      conn.del(payload_hash) # already expired
    end
Thoughts on that?
Love the contribution! Had a few notes, but if something is out of order or doesn't make sense, just say the word and I will spend some time when I am back from the dead.
@mhenrixon Thanks for the feedback on this. I'm going to go through my understanding of what's happening, just to make sure this is the correct fix and that it isn't introducing any subtle bugs. The old_unique_for never removed the locks for a scheduled job, since it was possible to just promote (overwrite) a scheduled lock to a queued lock (I think these were As far as I understand the Sidekiq behaviour: So the
The reason SETNX was not working with the
New Version:
Not checking whether Job Timeout and Mutex belongs to it or not, Fixed by checking JID b51e733
RunLock PR: #99
There is still a problem with #98. What we need to do is create a composite key from the time the job is to run and its arguments. We then need to make sure that the job's unique arguments are used for locking the run, to prevent the same job from running twice (simultaneously) if the first job gets delayed. Does that make sense at all? I don't see the code taking this into consideration.
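To make the composite-key idea concrete, here is a rough sketch of what it might look like. The helper names (run_lock_key, schedule_lock_key) and the key layout are hypothetical and are not part of the gem or this PR; the item fields ('class', 'queue', 'args', 'at') are assumed to follow the usual Sidekiq job hash.

```ruby
require 'digest/md5'
require 'json'

# Hypothetical helper: key used to lock the *run* of a job.
# It depends only on the worker class, queue and arguments, so two runs
# of the same job collide no matter when they were scheduled.
def run_lock_key(item)
  Digest::MD5.hexdigest(JSON.generate([item['class'], item['queue'], item['args']]))
end

# Hypothetical helper: composite key used to lock the *schedule* of a job.
# It appends the scheduled time, so the same arguments scheduled for
# different times get different keys.
def schedule_lock_key(item)
  "#{run_lock_key(item)}:#{item['at'].to_i}"
end

first = { 'class' => 'MyWorker', 'queue' => 'default', 'args' => [1], 'at' => 1_700_000_000 }
later = first.merge('at' => 1_700_000_600)

puts(run_lock_key(first) == run_lock_key(later))           # same arguments, same run lock
puts(schedule_lock_key(first) == schedule_lock_key(later)) # different times, different schedule locks
```

Under these assumptions, a delayed first job and a second job with the same arguments would contend for the same run lock even though their schedule locks differ.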
@mhenrixon Are you sure? All of the jobs in #98 run correctly for me with this patch.
This should already happen, because a JID-type lock can only override a 'scheduled' lock. When the 'scheduled' job comes up and runs the middleware again, either it can promote its previous 'scheduled' lock to a JID, or a JID will already be there and it will fail. I can try to add some more specs for this?
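The promotion rule described here can be modelled in a few lines. This is only an in-memory sketch: a plain Hash stands in for Redis, acquire_lock is a hypothetical name, and neither expiry nor atomicity is modelled. The condition mirrors the `pid && (pid != 'scheduled' || item['at'])` check from the diff.

```ruby
# Marker value stored for a job that is scheduled but not yet enqueued.
SCHEDULED = 'scheduled'

# Hypothetical helper: returns true and records the lock when the item may
# proceed, false when an existing lock blocks it.
def acquire_lock(store, payload_hash, item)
  existing = store[payload_hash]
  # Blocked when a lock exists, unless it is a 'scheduled' lock and the
  # incoming item is being enqueued (no 'at') rather than scheduled.
  return false if existing && (existing != SCHEDULED || item['at'])

  store[payload_hash] = item['at'] ? SCHEDULED : item['jid']
  true
end

store = {}
p acquire_lock(store, 'abc', { 'jid' => 'j1', 'at' => 1_700_000_000 }) # schedule: true
p acquire_lock(store, 'abc', { 'jid' => 'j2', 'at' => 1_700_000_100 }) # schedule again: false
p acquire_lock(store, 'abc', { 'jid' => 'j1' })                        # promote to JID lock: true
p acquire_lock(store, 'abc', { 'jid' => 'j3' })                        # blocked by JID lock: false
```

In this model a 'scheduled' lock can be promoted to a JID lock exactly once; every later attempt with the same payload hash fails until the lock is removed.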
8608d35 to 6afc38a
* split old_unique_for? into unique_schedule_old and unique_enqueue_old
6afc38a to c062cac
I'd like to propose another solution. #105
@deltaroe That doesn't solve anything - please see failing specs, I added a reversion for it in #107. @mhenrixon RE: Tests
@pix It sounds like we have different use cases. In our environment we don't want a job to be able to run now if it's scheduled to run in the future. This was brought up by my colleague in #91; however, PR #96 changed the behavior to what we expected and was merged between when the issue was reported and when @mhenrixon tried to reproduce it. I wonder if adding some way to specify which mode is desired would make sense here.
#93 #98 So having dug into the code, it looks like set_nx was not a viable option for new_unique_for, and neither was my update to the old unique_for, whereby it was altered to always use item['jid'] without a special assignment if it's being scheduled rather than enqueued. This patch fixes the broken behaviour for both new_unique_for and old_unique_for.