[8.x] Add ability to dispatch unique jobs #35042
Conversation
I think this option is better because when a job is dispatched from multiple locations, the key lives in one place: the job itself. But I think the interface is too explicit; checking for the method's existence is enough. Still, I don't think having the lock in the cache is the way to go.
This is for backwards compatibility. Otherwise, we would have to target master, as it would be a breaking change for anyone who's already implemented it.
Unfortunately, there's no way around this. Laravel doesn't treat locks as a first-class citizen.
@paras-malhotra Can you describe the behavior in a failure state? If a job fails, will it release the lock and then try to re-acquire it on its next retry? From my brief reading of the code, it appears that will be the case.
I would also be curious to know how this behaves in the context of job batches and job chains.
From my exploration, it looks like if a chain encounters a unique job that can't acquire a lock, the chain just ends there. With batches, I think unique jobs would run anyway, because they are often bulk-pushed onto the queue and don't go through the normal dispatch flow. I'm not sure what we want the behavior to be. I don't think unique jobs make much sense in the context of batches, but I'm not sure if people have a use case there. I'm honestly not sure if they make sense as part of chains either, though I think that's more likely to come up than batches. As it stands right now, we would need to document that unique jobs should not be included in batches or chains.
@taylorotwell, unique jobs aren't supported in batches or chains for now. Sidekiq doesn't support them either, as mentioned in the notes section of its docs on batches. Currently, the PR works like this: the lock is acquired when dispatching, and the lock is released regardless of success/failure/retry. When a job exception occurs, the job is released back to the queue, and since it's not technically "dispatched" again, it will not re-acquire the lock while retrying. I'll need to change the PR so that the lock is only released if the job has succeeded or "failed" (hit max exceptions or max retries), but is not released if the job is released back to the queue (as the lock will not be re-acquired on retry). Does that behaviour sound good?
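The dispatch-time locking described above can be sketched roughly like this. This is a hypothetical simplification, not the PR's actual code: `Cache::lock()` is Laravel's real atomic-lock API, but the `shouldDispatch` helper, key format, and property names are illustrative.

```php
use Illuminate\Support\Facades\Cache;

// Hypothetical sketch of the dispatch-time uniqueness check.
// shouldDispatch() and the key format are illustrative, not the PR's code.
function shouldDispatch(object $job): bool
{
    $key = 'unique:'.get_class($job).':'.$job->uniqueId();

    // Acquire the lock atomically. If an identical job already holds it,
    // get() returns false and the dispatcher skips pushing the job.
    return Cache::lock($key, $job->uniqueFor ?? 0)->get();
}
```

On success or terminal failure, the handler would release the same lock so the next dispatch with that key can go through.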
Hmm, it seems strange that on exception the lock would be released and then the job would not attempt to re-acquire it when retried. Couldn't that allow multiple copies of a unique job to exist on the queue at the same time?
@taylorotwell, here are the 3 scenarios:
So, in all 3 cases, there is no chance of multiple unique jobs existing on the queue at the same time.
For scenario #2, how is the lock not released? For scenario #3, where in the code is the lock released when the job fails?
@taylorotwell I'll change the code accordingly if you're okay with this behaviour. Right now, the code doesn't look like this; I just wanted to confirm that this behaviour is acceptable. If so, I'll modify the PR accordingly.
Ah OK. Yes, those 3 scenarios sound good as described.
I think having an interface is not necessary; checking whether a unique ID is set for the job is enough IMO.
@themsaid I have added that. @taylorotwell, let me know if you would like to go back to the earlier approach. To summarise the behaviour, there are now 4 scenarios:
Tests are added for all 4 scenarios. 🚀 Let me know if any further changes are needed.
@themsaid @paras-malhotra there is a use case for the interface IMO. For example, if you mark the class with the interface but do not define any additional methods, it will lock by the class name, meaning only one job of that entire type will be allowed on the queue at once. That is a valid use case IMO.
@taylorotwell hmmm, yes, that's a valid use case, but wouldn't you want to define a lock timeout in all cases? You'd have to define a $uniqueFor property, and we can use that property to determine whether the job should be unique or not. Also, if we bring the interface back, I think we can make it optional, so people can use it only if they want the behaviour of using the job class name as the lock key.
@taylorotwell, I think the interface is useful because of the following:
@themsaid I've kept a default value of zero for $uniqueFor.
@paras-malhotra can you add the interface back for now? I don't know how to revert a commit on here.
This reverts commit bc45489.
@taylorotwell reverted.
@paras-malhotra @taylorotwell While I love this (and did something similar, albeit much messier 😅, in my own repo), there's an edge-case problem with this implementation: job A can start processing, job B checks whether it should be dispatched and the answer is no, and only then does job A finish. This is especially a problem for the classic "indexing" job (as used in the PR's example), which can take a while to process: it will index the old data, and the newer data will never be queued for reindexing.
I have a similar use case as @moshe-autoleadstar. Did I understand the implementation correctly: if a job with the unique ID has already started processing, another job with the same ID won't trigger? I'm not sure about the queue internals; do jobs have any status like "queued", "processing", and "complete"? In that case, I would like to allow the "duplicate" job only if the status is !== "queued".
I understand the use case, but we need to consider what happens on an exception/timeout (causing a retry) or a job release. If the lock is released before processing, we may have to re-acquire it on a retry/release, but that would open a window between releasing the lock and re-acquiring it during which another job could be dispatched. Alternatively, we could simply do nothing if the job lock is already released.
@paras-malhotra You are absolutely correct: either way this is done, something will not work "perfectly". But it's much better to process a job twice than to never process it at all. The risk of doing it twice is a little more work for the system in some edge case; the risk of never doing it is that things go missing. Better to err on the side of caution.
@moshe-autoleadstar, @lasselehtinen I've submitted a PR for this: #35255
Is there a reason why the locking logic lives in the bus dispatcher? Moving it into the queue itself would make more sense IMO, and additionally it would allow the feature to work independently of the bus.
@abellion if you need this feature then please use Laravel instead of Lumen.
Yeah, sure. But couldn't the locking logic be moved into the queue layer?
@abellion the locking needs to happen at the time of dispatch (not at the time the queued jobs are processed). That's why it's part of the dispatcher.
@paras-malhotra Yep, I think moving it to https://github.com/laravel/framework/blob/8.x/src/Illuminate/Queue/Queue.php#L298 would work :) As I understand it, this is called just before the job is pushed onto the queue. See https://gist.github.com/abellion/54fa7bda01204d458439cb8a67e2b326
Yeah @abellion, that gist looks like it should work. Ah, on second thought, it may need to be modified for the dispatch-after-commit feature to work properly; perhaps the lock condition needs to be checked first.
You're right @paras-malhotra, I moved it above!
Hey there -- is that edit available in the version of Lumen currently available via Composer, or is there some way to update it to grab the new condition? I've just started with Lumen, and this solves my current road block!
Is it possible to get a failure notice if the job could not be queued? Dispatching returns a PendingDispatch object which, as far as I can tell, looks the same whether or not the job actually got queued.
Lock needs to use the string 'laravel_unique_job' or it doesn't unlock in CallQueuedHandler.
How should this be handled then?
Alternative to #35039 (based on @themsaid's feedback) with a different syntax. Both PRs are backwards-compatible with no breaking changes. I'm leaving both open so that one of them could be chosen based on preference. I do prefer this one over the other one.
Differences between both PRs:
- This PR marks a job as unique via the UniqueJob interface, whereas the other PR allows dispatching a unique job by chaining dispatch(...)->unique(...).
- This PR releases the lock in CallQueuedHandler, whereas the other PR releases the lock by registering an "after middleware" on the job.

Motivation:
This PR adds the ability to dispatch unique jobs: if a unique job is already dispatched, another job (with the same key) will not be dispatched until the first job has completed processing. See laravel/ideas#2151.
Difference between this PR and WithoutOverlapping:

The difference between WithoutOverlapping and dispatching unique jobs is that the WithoutOverlapping middleware processes jobs uniquely (meaning two jobs with the same key will not be processed in parallel). However, it does not stop jobs (with the same key) from being enqueued if another such job is already in the queue waiting to be processed.

Use Case:

This can be quite useful in scenarios such as search indexing. If a search-index job is already enqueued for a resource, you probably don't want another such job to be queued until the previous job has finished processing. Sidekiq Enterprise offers a unique jobs feature as well.
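For contrast, this is roughly how the existing WithoutOverlapping middleware is attached to a job. The middleware class is Laravel's real API; the job class and $product property are illustrative.

```php
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Queue\Middleware\WithoutOverlapping;

// Illustrative job; only the middleware wiring is the point here.
class IndexProduct implements ShouldQueue
{
    public $product;

    // WithoutOverlapping only prevents two jobs with the same key from
    // *running* concurrently; duplicates can still pile up in the queue,
    // which is exactly the gap this PR's unique jobs close.
    public function middleware()
    {
        return [new WithoutOverlapping($this->product->id)];
    }
}
```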
How to Use

If you wish to make your job a "unique job", just implement the UniqueJob interface like so:

EDIT: Edited the above syntax to match the final PR that was merged.