Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: drain add fallback to remove pods with grace period 0 #43

Merged
merged 5 commits into from
Oct 3, 2022

Conversation

aldor007
Copy link
Member

No description provided.

@@ -14,11 +14,11 @@ require (
go.uber.org/goleak v1.1.12
golang.org/x/sync v0.0.0-20220601150217-0de741cfad7f
helm.sh/helm/v3 v3.9.2
k8s.io/api v0.24.2
k8s.io/apimachinery v0.24.2
k8s.io/api v0.25.2
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Contains fix for testing lib kubernetes/kubernetes#110425

@aldor007 aldor007 marked this pull request as ready for review September 28, 2022 11:42
@aldor007 aldor007 requested a review from a team September 28, 2022 11:44
@@ -54,7 +54,7 @@ func (h *drainNodeHandler) Handle(ctx context.Context, data interface{}) error {
return fmt.Errorf("unexpected type %T for delete drain handler", data)
}

log := h.log.WithField("node_name", req.NodeName)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a drain handler, please consider removing 'delete' from the logging message above.

return fmt.Errorf("sending evict pods requests: %w", err)
}

return h.waitNodePodsTerminated(ctx, node)
}

func (h *drainNodeHandler) deleteNodePods(ctx context.Context, log logrus.FieldLogger, node *v1.Node) error {
func (h *drainNodeHandler) deleteNodePods(ctx context.Context, log logrus.FieldLogger, options metav1.DeleteOptions, node *v1.Node) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a common practice to have options as the last parameter in a signature.

return fmt.Errorf("sending evict pods requests: %w", err)
}

return h.waitNodePodsTerminated(ctx, node)
}

func (h *drainNodeHandler) deleteNodePods(ctx context.Context, log logrus.FieldLogger, node *v1.Node) error {
func (h *drainNodeHandler) deleteNodePods(ctx context.Context, log logrus.FieldLogger, options metav1.DeleteOptions, node *v1.Node) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder have you considered using functional options pattern? https://golang.cafe/blog/golang-functional-options-pattern.html

@@ -195,7 +210,7 @@ func (h *drainNodeHandler) waitNodePodsTerminated(ctx context.Context, node *v1.

// evictPod from the k8s node. Error handling is based on eviction api documentation:
// https://kubernetes.io/docs/tasks/administer-cluster/safely-drain-node/#the-eviction-api
func (h *drainNodeHandler) evictPod(ctx context.Context, pod v1.Pod) error {
func (h *drainNodeHandler) evictPod(ctx context.Context, options metav1.DeleteOptions, pod v1.Pod) error {
Copy link
Contributor

@varnastadeus varnastadeus Sep 30, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

evict doesn't need DeleteOptions so I would suggest keeping method signature as it was
and while deleting pods do:
sendPodsRequests doesn't need it too

fn := func(ctx context.Context, pod v1.Pod) error {
	return h.deletePod(ctx, options, pod)
}

if err := h.sendPodsRequests(ctx, pods, fn); err != nil {
	return fmt.Errorf("sending delete pods requests: %w", err)
}

Comment on lines 87 to 102
deleteCtx, deleteCancel := context.WithTimeout(ctx, h.cfg.podsDeleteTimeout)
defer deleteCancel()
if err := h.deleteNodePods(deleteCtx, log, node); err != nil {
err = h.deleteNodePods(deleteCtx, log, node, metav1.DeleteOptions{})
if err == nil {
log.Info("node drained")
return nil
}
if !errors.Is(err, context.DeadlineExceeded) {
return fmt.Errorf("forcefully deleting pods: %w", err)
}

deleteForceCtx, deleteForceCancel := context.WithTimeout(ctx, h.cfg.podsDeleteTimeout)
defer deleteForceCancel()
if err = h.deleteNodePods(deleteForceCtx, log, node, *metav1.NewDeleteOptions(0)); err != nil {
return fmt.Errorf("deleting gracePeriod=0 pods: %w", err)
}
Copy link
Contributor

@varnastadeus varnastadeus Sep 30, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion, maybe this will be more clear: 🤔

// Try deleteing pods gracefully first, then delete with 0 grace period.
options := []metav1.DeleteOptions{
	metav1.DeleteOptions{},
	*metav1.NewDeleteOptions(0)
}

var err error
for _, o := options {
	deleteCtx, deleteCancel := context.WithTimeout(ctx, h.cfg.podsDeleteTimeout)
	defer deleteCancel()

	err = h.deleteNodePods(deleteCtx, log, node, o);
	if err == nil {
		return nil
	}
	if !errors.Is(err, context.DeadlineExceeded) {
		return fmt.Errorf("forcefully deleting pods: %w", err)
	}
}

return err

@aldor007 aldor007 requested review from varnastadeus, andrejatcastai and a team September 30, 2022 13:58
@aldor007 aldor007 merged commit fbeae32 into main Oct 3, 2022
@aldor007 aldor007 deleted the feat/grace-period-0 branch October 3, 2022 06:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants