Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix #648: delete the statefulset with precondition set to the correct… #651

Merged
merged 3 commits into from
Apr 9, 2021

Conversation

srteam2020
Copy link
Contributor

… UID

This closes #648

Note to reviewers: remember to look at the commits in this PR and consider if they can be squashed

Summary Of Changes

Ensure that the sts to be deleted has the correct uid (besides the name and namespace). A sts uid is a correct one if the sts's controller ref matches the current rabbitmqCluster's uid.

Copy link
Contributor

@ChunyiLyu ChunyiLyu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking the time! Looks good overall. Please see my comments.

@@ -54,7 +58,7 @@ func (r *RabbitmqClusterReconciler) prepareForDeletion(ctx context.Context, rabb
// Delete StatefulSet immediately after changing pod labels to minimize risk of them respawning.
// There is a window where the StatefulSet could respawn Pods without the deletion label in this order.
// But we can't delete it before because the DownwardAPI doesn't update once a Pod enters Terminating.
if err := r.Client.Delete(ctx, sts); client.IgnoreNotFound(err) != nil {
if err := r.Client.Delete(ctx, sts, &client.DeleteOptions{Preconditions: &metav1.Preconditions{UID: &uid}}); client.IgnoreNotFound(err) != nil {
Copy link
Contributor

@ChunyiLyu ChunyiLyu Apr 7, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May I suggest modifying the comment section above to explain why setting Preconditions for deletion is necessary?

Comment on lines 70 to 81
func (r *RabbitmqClusterReconciler) statefulSetUID(ctx context.Context, rmq *rabbitmqv1beta1.RabbitmqCluster) (types.UID, error) {
uid := types.UID("")
if sts, err := r.statefulSet(ctx, rmq); err == nil {
if ref := metav1.GetControllerOf(sts); ref != nil {
if string(rmq.GetUID()) == string(ref.UID) {
return sts.UID, nil
}
}
}
return uid, fmt.Errorf("failed to get the uid of the statefulset owned by the current rabbitmqCluster")
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two suggestions: 1) I would add a line of comment to explain that the statefulSet uid is returned successfully only when the controller reference uid matches the rabbitmqCluster uid. 2) I think errors of failing to get the statefulSet, failing to get controller reference, and reference uid does not match rmq uid are all different failures. The function only returns one type of error at the moment, and it's unclear which step has failed. It's more verbose but the error msg is clearer:

Suggested change
func (r *RabbitmqClusterReconciler) statefulSetUID(ctx context.Context, rmq *rabbitmqv1beta1.RabbitmqCluster) (types.UID, error) {
uid := types.UID("")
if sts, err := r.statefulSet(ctx, rmq); err == nil {
if ref := metav1.GetControllerOf(sts); ref != nil {
if string(rmq.GetUID()) == string(ref.UID) {
return sts.UID, nil
}
}
}
return uid, fmt.Errorf("failed to get the uid of the statefulset owned by the current rabbitmqCluster")
}
func (r *RabbitmqClusterReconciler) statefulSetUID(ctx context.Context, rmq *rabbitmqv1beta1.RabbitmqCluster) (types.UID, error) {
var err error
var sts *appsv1.StatefulSet
var ref *metav1.OwnerReference
if sts, err = r.statefulSet(ctx, rmq); err != nil {
return "", fmt.Errorf("failed to get statefulSet: %s", err.Error())
}
if ref = metav1.GetControllerOf(sts); ref == nil {
return "", errors.New("failed to get controller reference for statefulSet")
}
if string(rmq.GetUID()) != string(ref.UID) {
return "", errors.New("statefulSet not owner by current RabbitmqCluster")
}
return sts.UID, nil
}

@srteam2020
Copy link
Contributor Author

@ChunyiLyu Thank you for the comments! I will try to address them today or tmr.

Copy link
Contributor

@ChunyiLyu ChunyiLyu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking my suggestions onboard 👍

Copy link
Collaborator

@Zerpet Zerpet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. Thanks for the contribution!

@ChunyiLyu ChunyiLyu merged commit 86bced0 into rabbitmq:main Apr 9, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] Reading stale RabbitmqCluster information can lead to undesired statefulset deletion
3 participants