Ignore 404 when deleting defunct K8S pod #9194
catch (Exception exception)
    // Ignore NotFound errors, as the pod may have already been deleted by other means
    when (exception is not HttpOperationException { Response.StatusCode: HttpStatusCode.NotFound })
{
    _logger.LogError(exception, "Error deleting pod {PodName} in namespace {PodNamespace} corresponding to defunct silo {SiloAddress}", change.Name, _podNamespace, change.SiloAddress);
}
catch (Exception exception)
{
    // Ignore NotFound errors, as the pod may have already been deleted by other means
    if (exception is not HttpOperationException { Response.StatusCode: HttpStatusCode.NotFound })
    {
        _logger.LogError(exception, "Error deleting pod {PodName} in namespace {PodNamespace} corresponding to defunct silo {SiloAddress}", change.Name, _podNamespace, change.SiloAddress);
    }
}
Alternatively, we could add a second catch clause to ignore those errors. I prefer catch + if, though.
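For comparison, the second-catch-clause alternative might look roughly like the sketch below. It is a standalone illustration, not the PR's code: `FakeHttpOperationException` and `TryDelete` are hypothetical stand-ins so the example compiles without the Kubernetes client package, and `Console.WriteLine` stands in for `_logger.LogError`. In the real code the first clause would match `HttpOperationException { Response.StatusCode: HttpStatusCode.NotFound }`.

```csharp
using System;
using System.Net;

// Hypothetical stand-in for the Kubernetes client's HttpOperationException,
// reduced to the single property this sketch needs.
public sealed class FakeHttpOperationException : Exception
{
    public FakeHttpOperationException(HttpStatusCode statusCode)
        : base($"Operation returned an invalid status code '{statusCode}'")
    {
        StatusCode = statusCode;
    }

    public HttpStatusCode StatusCode { get; }
}

public static class Program
{
    // Runs the delete action; returns true only if an error was logged.
    public static bool TryDelete(Action deletePod)
    {
        try
        {
            deletePod();
            return false;
        }
        catch (FakeHttpOperationException exception)
            when (exception.StatusCode == HttpStatusCode.NotFound)
        {
            // First clause: the pod is already gone, so there is nothing to log.
            return false;
        }
        catch (Exception exception)
        {
            // Second clause: every other failure is still logged.
            Console.WriteLine($"Error deleting pod: {exception.Message}");
            return true;
        }
    }

    public static void Main()
    {
        // NotFound is swallowed silently; Forbidden is logged.
        TryDelete(() => throw new FakeHttpOperationException(HttpStatusCode.NotFound));
        TryDelete(() => throw new FakeHttpOperationException(HttpStatusCode.Forbidden));
    }
}
```

The two-clause form keeps the "ignore" and "log" paths visibly separate, at the cost of an empty-looking catch block; catch + if keeps everything in one handler.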
7697e6a to 9ddb89e
The tests failed, but it looks like it was an unstable/transient one... maybe a re-run would "fix" it?
@tomachristian you're correct - that test relies a bit on luck. We need to update it to make its own luck (steer grain placement rather than hope + retry until it happens the way it wants)
The subsequent failure was due to an issue in the build pipeline which I need to fix - it wasn't allowed to overwrite a build artifact. We should include the attempt number in the artifact name.
We started getting these exceptions after we enabled DeleteDefunctSiloPods:
Operation returned an invalid status code 'NotFound', response body {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"pods \"xxxxx-fc9dd7f96-gvvt2\" not found","reason":"NotFound","details":{"name":"xxxxx-fc9dd7f96-gvvt2","kind":"pods"},"code":404}
These cases are common, because K8S may have already killed or deleted the pod by the time we try to delete it. When we get a NotFound exception, we can ignore it instead of logging it as an error.