This repository has been archived by the owner on Dec 13, 2023. It is now read-only.
Join task callBackDuration #3594
Merged
What happens if someone has their systemTaskWorkerCallbackDuration set to one second, though? The offset will always be 1 second, so there won't be any exponential backoff in that scenario. It feels like `defaultOffset` should be a configurable maximum offset or something like that. Maybe default it to 30 seconds?
Hi @james-deee, I think if we set systemTaskWorkerCallbackDuration to 1 second, then ideally we don't want exponential backoff at all. Here we are trying to back off exponentially from 1 to 30 seconds.
@manan164 On one hand, I see what you're saying, but in practice we are seeing that a setting of 1 second doesn't alleviate the problem. The issue is that we have a lot of long-running workflows (and thus JOINs) in the queue. If we do NOT back off exponentially from our setting of 1 second, our JOIN tasks don't get processed fast enough, because the async task worker is trying to process, say, tens of thousands of JOINs. That in turn means the JOINs you want processed immediately get backed up and slowed down.
My proposal, which is what we changed locally, fixes all of our issues with processing the async JOINs.
In our situation we use a duration of 1 second, but we have a cap of 30 seconds in place of that `defaultOffset`, and it completely solves our issue. The result is that JOINs in fast-running workflows get processed almost instantaneously, and JOINs in long-running workflows get pushed back to the max of 30 seconds, which is what frees up processing for the immediate ones.
I really think these are 2 different levers/values that both need to be present. Just using 1 second always won't work in heavy-workload Conductor instances where there might be a LOT of JOINs to process.
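To make the two-lever idea concrete, here is a minimal sketch of a capped exponential backoff with a separate starting duration and maximum offset. It is not Conductor's code: the function name `capped_backoff_seconds` and its parameters are hypothetical, and the doubling rule is an assumption based on the values discussed in this thread.

```python
def capped_backoff_seconds(poll_count: int,
                           base_seconds: int = 1,
                           max_offset_seconds: int = 30) -> int:
    """Return the delay before the next JOIN check.

    Hypothetical helper: base_seconds plays the role of the configured
    callback duration, max_offset_seconds is the separate cap proposed above.
    """
    if poll_count < 0:
        raise ValueError("poll_count must be non-negative")
    if base_seconds <= 0 or max_offset_seconds <= 0:
        raise ValueError("durations must be positive")
    # Double the delay on every poll, but never exceed the cap.
    return min(base_seconds * 2 ** poll_count, max_offset_seconds)


# Delay for the first seven polls with a 1 second start and a 30 second cap.
schedule = [capped_backoff_seconds(n) for n in range(7)]
print(schedule)  # [1, 2, 4, 8, 16, 30, 30]
```

With these assumed values, a JOIN that completes quickly is rechecked within a second or two, while one in a long-running workflow settles at one check every 30 seconds.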
Or maybe what you're saying is that we should always use a setting of 30 seconds for `systemTaskWorkerCallbackDuration`, in which case it sounds like no one should ever use 1 second for the `systemTaskWorkerCallbackDuration` value.
I think that could get confusing, because someone would want to set the value to 1 for fast processing, but they wouldn't know that 30 seconds actually gives the fastest/best processing?
@manan164 maybe a compromise is to use a hardcoded 30 seconds for that `defaultOffset`. I really think it will be confusing, and could cause issues, if people use 1 second for that setting, because that doesn't actually give them the best performance.
Hi @james-deee, the expectation here is that when we set systemTaskWorkerCallbackDuration to 30 seconds, we are fine with tasks getting checked every 30 seconds and then completing; if tasks instead get checked at 1, 2, 4, 8, and 16 seconds and complete, that should also be fine. But setting systemTaskWorkerCallbackDuration to 1 second means I want my task checked every second. Hope this answers the question.
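The behaviour described in this reply can be sketched as follows, assuming the offset starts at 1 second, doubles on each check, and is capped by the configured callback duration. This is an illustration built only from the description in the thread, not the code in the PR; `described_offset_seconds` is a hypothetical name.

```python
def described_offset_seconds(poll_count: int, callback_duration_seconds: int) -> int:
    """Delay before the next check, capped by the configured callback duration.

    Assumption: the delay starts at 1 second and doubles on every poll.
    """
    if poll_count < 0:
        raise ValueError("poll_count must be non-negative")
    return min(2 ** poll_count, callback_duration_seconds)


# Callback duration of 30 seconds: checks back off until they reach the cap.
print([described_offset_seconds(n, 30) for n in range(6)])  # [1, 2, 4, 8, 16, 30]

# Callback duration of 1 second: the cap equals the start, so no backoff occurs.
print([described_offset_seconds(n, 1) for n in range(6)])   # [1, 1, 1, 1, 1, 1]
```

Under this reading a single setting acts as both the ceiling and the steady-state interval, which is why a value of 1 second disables backoff entirely.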