What is the bug?
When a model deploys quickly enough that the initial response is already COMPLETED, the abstract retryable step never executes the retry task.
As a result the future is never completed, and the step eventually reports a timeout even though the model deployed successfully.
How can one reproduce the bug?
Deploy a model on a cluster where deployment finishes fast enough that the initial response is COMPLETED.
Probably the source of this test failure: #465 (comment)
What is the expected behavior?
Successful model deployment should check the deploy task at least once.
What is your host/environment?
FGAC-enabled domain.
Do you have any screenshots?
[2024-02-04T23:47:44,028][INFO ][o.o.f.w.ProcessNode ] [fb9bced73eac9fc85c53cd87ca0e8665] Starting deploy_model_3.
[2024-02-04T23:47:44,029][INFO ][o.o.m.a.d.TransportDeployModelAction] [fb9bced73eac9fc85c53cd87ca0e8665] Will deploy model on these nodes: xgPngcVmTBailSli5PbIww
[2024-02-04T23:47:44,052][INFO ][o.o.f.w.DeployModelStep ] [fb9bced73eac9fc85c53cd87ca0e8665] Model deployment state COMPLETED
Do you have any additional context?
The `while (!future.isDone()) { }` loop should be a `do { } while ()` loop.
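To illustrate why the loop form matters, here is a minimal sketch. It is not the project's code: the class `RetryLoopSketch`, the methods `checkTaskStatus`, `pollWithWhile`, and `pollWithDoWhile`, and the exact loop conditions are all hypothetical, and it assumes the future is only completed inside the loop body. Under that assumption, a pre-checked loop that starts from a COMPLETED state runs zero times and leaves the future pending, while the `do { } while ()` form checks the task once and completes it.

```java
import java.util.concurrent.CompletableFuture;

public class RetryLoopSketch {

    // Hypothetical status check; always reports COMPLETED, as in the fast-deploy case.
    public static String checkTaskStatus() {
        return "COMPLETED";
    }

    // Pre-checked loop: when the initial state is already COMPLETED the body is
    // skipped, so nothing ever completes the future. Returns the number of checks made.
    public static int pollWithWhile(String initialState, CompletableFuture<String> future, int maxRetries) {
        String state = initialState;
        int checks = 0;
        while (!"COMPLETED".equals(state) && checks < maxRetries) {
            state = checkTaskStatus();
            checks++;
            if ("COMPLETED".equals(state)) {
                future.complete(state);
            }
        }
        return checks;
    }

    // Post-checked loop: the body always runs at least once, so the task is
    // checked and the future is completed even when the deploy finished immediately.
    public static int pollWithDoWhile(CompletableFuture<String> future, int maxRetries) {
        int checks = 0;
        do {
            String state = checkTaskStatus();
            checks++;
            if ("COMPLETED".equals(state)) {
                future.complete(state);
            }
        } while (!future.isDone() && checks < maxRetries);
        return checks;
    }

    public static void main(String[] args) {
        CompletableFuture<String> pending = new CompletableFuture<>();
        CompletableFuture<String> completed = new CompletableFuture<>();
        // Zero checks, future still pending: this is the path that ends in a timeout.
        System.out.println(pollWithWhile("COMPLETED", pending, 5) + " " + pending.isDone());
        // One check, future completed.
        System.out.println(pollWithDoWhile(completed, 5) + " " + completed.isDone());
    }
}
```

In this sketch the first call prints `0 false` and the second prints `1 true`, which matches the expected behavior that a successful deployment checks the deploy task at least once.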