-
Notifications
You must be signed in to change notification settings - Fork 674
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WebAPI plugins optimization #5237
Conversation
Signed-off-by: Kevin Su <[email protected]>
Signed-off-by: Kevin Su <[email protected]>
Signed-off-by: Kevin Su <[email protected]>
Signed-off-by: Kevin Su <[email protected]>
Signed-off-by: Kevin Su <[email protected]>
Signed-off-by: Kevin Su <[email protected]>
Signed-off-by: Kevin Su <[email protected]>
Signed-off-by: Kevin Su <[email protected]>
Signed-off-by: Kevin Su <[email protected]>
Signed-off-by: Kevin Su <[email protected]>
Signed-off-by: Kevin Su <[email protected]>
Signed-off-by: Kevin Su <[email protected]>
Signed-off-by: Kevin Su <[email protected]>
flytestdlib/cache/auto_refresh.go
Outdated
@@ -363,6 +376,7 @@ func NewAutoRefreshBatchedCache(name string, createBatches CreateBatchesFunc, sy | |||
createBatchesCb: createBatches, | |||
syncCb: syncCb, | |||
lruMap: lruCache, | |||
Processing: newSyncSet(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit:
Processing: newSyncSet(), | |
processing: newSyncSet(), |
@@ -40,6 +40,11 @@ type ResourceWrapper struct { | |||
LogLinks []*flyteIdl.TaskLog | |||
} | |||
|
|||
// IsTerminal is used to avoid making network calls to the agent service if the resource is already in a terminal state. | |||
func (r ResourceWrapper) IsTerminal() bool { | |||
return r.Phase == flyteIdl.TaskExecution_SUCCEEDED || r.Phase == flyteIdl.TaskExecution_FAILED || r.Phase == flyteIdl.TaskExecution_ABORTED |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add TIMED_OUT too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there is no timeout.
flyte/flyteidl/protos/flyteidl/core/execution.proto
Lines 44 to 56 in 97a79c0
message TaskExecution{ | |
enum Phase { | |
UNDEFINED = 0; | |
QUEUED = 1; | |
RUNNING = 2; | |
SUCCEEDED = 3; | |
ABORTED = 4; | |
FAILED = 5; | |
// To indicate cases where task is initializing, like: ErrImagePull, ContainerCreating, PodInitializing | |
INITIALIZING = 6; | |
// To address cases, where underlying resource is not available: Backoff error, Resource quota exceeded | |
WAITING_FOR_RESOURCES = 7; | |
} |
@@ -38,6 +38,8 @@ func launch(ctx context.Context, p webapi.AsyncPlugin, tCtx core.TaskExecutionCo | |||
// Store the created resource name, and update our state. | |||
state.ResourceMeta = rMeta | |||
state.Phase = PhaseResourcesCreated | |||
state.PhaseVersion = 2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why 2
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was 2.
return state, core.PhaseInfoQueued(time.Now(), 2, "launched"), nil |
because we set it to 0 when we allocate a token.
}, core.PhaseInfoQueued(a.clock.Now(), 0, "Allocation token required"), nil |
Signed-off-by: Kevin Su <[email protected]>
Tracking issue
NA
Why are the changes needed?
There are some issues in the webapi plugin:
Queue
after the task succeeds since the plugin returns PhaseInfoQueued when an item is being deleted from the cache.What changes were proposed in this pull request?
IsTerminal
to thewebapi.Resource
for early termination of the processed item.PhaseVersion
tocache.state
. If the cache is missed, propeller will return the previous phase and version and wait for the item to be processed by the async cache.How was this patch tested?
Local/remote cluster
Setup process
Screenshots
Before:
~5 mins / per workflow
~30s for a single task
After:
~3 mins / per workflow
~15s for a single task
Check all the applicable boxes
Related PRs
NA
Docs link
NA