What is the bug?
During disaster recovery, all the nodes go down and need to be restarted. After all the nodes start and rejoin the cluster, the models fail to be auto-redeployed.
How can one reproduce the bug?
Steps to reproduce the behavior:
Stop all the nodes
Start all the nodes at the same time
None of the models that were deployed before the restart are redeployed
What is the expected behavior?
Models should be redeployed automatically
Root cause
Model auto-deployment is triggered when nodes join the cluster, but at that point the model index is not yet available to be queried
Proposal
Instead of querying the models from the model index immediately, we can wait until the model config health turns yellow
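A minimal sketch of the proposed ordering, written in Python for illustration only and not taken from the plugin's code. All names here (`wait_for_health`, `redeploy_when_ready`, `get_health`, `list_models`, `deploy`) are hypothetical; the sketch assumes the health check returns one of `red`, `yellow`, or `green`, and that yellow is enough for the index to be queryable.

```python
import time
from typing import Callable, Iterable, List

# Health levels ordered from worst to best (red < yellow < green).
_HEALTH_RANK = {"red": 0, "yellow": 1, "green": 2}


def wait_for_health(
    get_health: Callable[[], str],
    minimum: str = "yellow",
    timeout_s: float = 60.0,
    poll_interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll get_health() until it reports at least `minimum`, or time out.

    `get_health` is a hypothetical callable supplied by the caller; it
    stands in for whatever health check the real system exposes.
    """
    deadline = clock() + timeout_s
    while True:
        status = get_health()
        # Unknown statuses rank below red, so they never satisfy the check.
        if _HEALTH_RANK.get(status, -1) >= _HEALTH_RANK[minimum]:
            return True
        if clock() >= deadline:
            return False
        sleep(poll_interval_s)


def redeploy_when_ready(
    get_health: Callable[[], str],
    list_models: Callable[[], Iterable[str]],
    deploy: Callable[[str], None],
    **wait_kwargs,
) -> List[str]:
    """Query and redeploy models only after the health check passes.

    Returns the IDs that were redeployed, or an empty list on timeout.
    """
    if not wait_for_health(get_health, **wait_kwargs):
        return []
    deployed = []
    for model_id in list_models():
        deploy(model_id)
        deployed.append(model_id)
    return deployed
```

The point of the design is that the model query is never issued while the health check still reports red, so a node-join event that fires too early results in a wait rather than a failed query. In a real cluster the polling loop could probably be replaced by the cluster health API's `wait_for_status=yellow` parameter with a timeout, but whether that fits the plugin's node-join handler is an open question.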
What is your host/environment?
OS: Linux
Version [e.g. 22]
Plugins
@khoaisohd Is the error a ClusterBlockException? Also, deploying the model automatically can be complicated in this case (the cluster status is not ready while queries are already running). There is an existing issue that addresses this case: #1148, which covers deploying the model automatically when inferencing. Please take a look at that issue and comment there if you have further concerns.