Describe the bug
After restarting from a crash and connecting to a stale apiserver, the operator can mistakenly delete the tserverUI service if it reads a stale value of cluster.Spec.Tserver.TserverUIPort.
Consider the following situation. There are two apiservers, apiserver1 and apiserver2, and the operator is initially communicating with apiserver1. The field cluster.Spec.Tserver.TserverUIPort is initially set to -1, so no tserverUI service is running. The user then changes cluster.Spec.Tserver.TserverUIPort to a valid port number, and the operator creates the tserverUI service accordingly. After the service is created, the operator crashes, restarts, and starts communicating with apiserver2. apiserver2 is stale and still holds cluster.Spec.Tserver.TserverUIPort as -1 at that moment. The operator cannot tell that the data is stale, so it deletes the tserverUI service.
To reproduce
1. Create a YBCluster with cluster.Spec.Tserver.TserverUIPort set to -1.
2. Change cluster.Spec.Tserver.TserverUIPort to 7000. The operator creates the tserverUI service. Meanwhile, apiserver2 is lagging and still holds cluster.Spec.Tserver.TserverUIPort as -1.
3. The operator crashes, restarts, and communicates with apiserver2. It then reconciles and deletes the tserverUI service, because cluster.Spec.Tserver.TserverUIPort is -1 on apiserver2.
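The sequence above can be modeled with a small, self-contained Go sketch. It is not the operator's code: the types `clusterSpec` and `apiserver`, the function `reconcile`, and the rule "a negative port means no tserverUI service" are assumptions made for illustration only.

```go
package main

import "fmt"

// clusterSpec is a hypothetical, simplified stand-in for the YBCluster spec.
type clusterSpec struct {
	TserverUIPort int32 // assumption: a negative value means "no tserverUI service"
}

// apiserver models one apiserver's view of the cluster object, which may be stale.
type apiserver struct {
	name string
	spec clusterSpec
}

// reconcile returns the action a naive operator would take, based only on
// the spec it just read and on whether the tserverUI service exists.
func reconcile(spec clusterSpec, serviceExists bool) string {
	switch {
	case spec.TserverUIPort >= 0 && !serviceExists:
		return "create"
	case spec.TserverUIPort < 0 && serviceExists:
		return "delete"
	default:
		return "noop"
	}
}

func main() {
	// apiserver1 has seen the update to 7000; apiserver2 still holds -1.
	fresh := apiserver{name: "apiserver1", spec: clusterSpec{TserverUIPort: 7000}}
	stale := apiserver{name: "apiserver2", spec: clusterSpec{TserverUIPort: -1}}

	serviceExists := false

	// Before the crash: the operator reads from apiserver1 and creates the service.
	first := reconcile(fresh.spec, serviceExists)
	if first == "create" {
		serviceExists = true
	}

	// After the restart: the operator reads from apiserver2 and acts on stale data.
	second := reconcile(stale.spec, serviceExists)

	fmt.Println(fresh.name, "->", first)
	fmt.Println(stale.name, "->", second)
}
```

The point of the sketch is that `reconcile` has no input that would let it distinguish a stale -1 from a current one, so the second call chooses to delete a service that the up-to-date spec still requires.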
Additional information
This bug is similar to #35. We found this problem after removing the minimum-value constraint on TserverUIPort in the CRD.
Fix
We are willing to send a PR to fix this problem.
A potential fix is to set a UID precondition when deleting the service, so that the delete only succeeds if the service's UID matches the one the operator observed.
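To illustrate how a UID precondition behaves, here is a minimal in-memory sketch in Go. It does not use the Kubernetes client; the `store` type, the `deleteWithUID` method, the service name, and the UID strings are all hypothetical. In client-go, the corresponding mechanism is the `Preconditions` field of `metav1.DeleteOptions`, which carries an optional `UID`. How the operator would obtain the expected UID is not covered here and would depend on the eventual fix.

```go
package main

import (
	"errors"
	"fmt"
)

// service is a hypothetical, minimal stand-in for a Kubernetes Service object.
type service struct {
	Name string
	UID  string
}

var (
	errNotFound = errors.New("service not found")
	errConflict = errors.New("precondition failed: UID mismatch")
)

// store models the authoritative set of services, keyed by name.
type store struct {
	services map[string]service
}

// deleteWithUID deletes the named service only if its current UID equals
// expectedUID. An empty expectedUID means "no precondition".
func (s *store) deleteWithUID(name, expectedUID string) error {
	svc, ok := s.services[name]
	if !ok {
		return errNotFound
	}
	if expectedUID != "" && svc.UID != expectedUID {
		return errConflict
	}
	delete(s.services, name)
	return nil
}

func main() {
	s := &store{services: map[string]service{
		"tserver-ui": {Name: "tserver-ui", UID: "uid-current"},
	}}

	// A caller holding an outdated UID is rejected and the service survives.
	fmt.Println("delete with outdated UID:", s.deleteWithUID("tserver-ui", "uid-outdated"))
	_, stillThere := s.services["tserver-ui"]
	fmt.Println("service still exists:", stillThere)

	// A caller holding the current UID succeeds.
	fmt.Println("delete with current UID:", s.deleteWithUID("tserver-ui", "uid-current"))
}
```

With a precondition in place, a delete issued on the basis of an outdated observation fails with a conflict instead of silently removing the object, which gives the operator a chance to re-read the state before retrying.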