There seems to be a problem with graceful termination handling when the helm-controller workers are busy reconciling a release that takes longer than 30s (`readinessProbe.failureThreshold: 3` and `readinessProbe.periodSeconds: 10`, as configured by default): the readiness probe starts failing immediately after SIGTERM (I can see that in the pod events), and the container then receives another SIGTERM, which triggers the signal handler to exit immediately with code 1.
However, this doesn't look like something we can fix in helm-controller itself, but rather an issue in the controller-runtime logic: the internalProceduresStop channel is closed before the runnables are stopped, see https://github.com/kubernetes-sigs/controller-runtime/blob/v0.13.1/pkg/manager/internal.go#L539
Removing the readiness probe will solve only part of the problem because we also need to override the default controller manager gracefulShutdownTimeout (30s).
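For reference, a minimal sketch (assuming controller-runtime v0.13.x, not the actual helm-controller code, and an arbitrary 10-minute value) of where this timeout lives: the manager's `Options.GracefulShutdownTimeout` field, which falls back to the built-in 30s default when left nil.

```go
package main

import (
	"time"

	ctrl "sigs.k8s.io/controller-runtime"
)

func main() {
	// Give long-running reconcilers (e.g. slow Helm operations) more than
	// the default 30s to finish after the first SIGTERM. Leaving the field
	// nil keeps controller-runtime's 30s default.
	shutdownTimeout := 10 * time.Minute

	mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
		GracefulShutdownTimeout: &shutdownTimeout,
	})
	if err != nil {
		panic(err)
	}

	// SetupSignalHandler cancels the context on the first SIGTERM/SIGINT;
	// a second signal makes the process exit immediately with code 1,
	// which is the behaviour described above.
	if err := mgr.Start(ctrl.SetupSignalHandler()); err != nil {
		panic(err)
	}
}
```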
Override the default GracefulShutdownTimeout option given to the
controller manager with a default of 0 (no timeout), since Helm
operations are sensitive to interruption and interrupting them can
leave the HelmRelease in a bad state.

This also allows users to override the option via the
`-graceful-shutdown-timeout` CLI flag, which sets how much time to
wait before forcibly exiting.
Related to #569
Signed-off-by: Aurel Canciu <[email protected]>
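A hedged sketch of the approach the commit describes (the flag name is taken from the commit text; the exact wiring in helm-controller may differ, and how a zero value is interpreted depends on the controller-runtime version):

```go
package main

import (
	"flag"

	ctrl "sigs.k8s.io/controller-runtime"
)

func main() {
	// Flag name from the commit message; the default of 0 mirrors the
	// "no timeout" default described there.
	gracefulShutdownTimeout := flag.Duration("graceful-shutdown-timeout", 0,
		"The duration given to the reconcilers to finish before forcibly stopping.")
	flag.Parse()

	mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
		GracefulShutdownTimeout: gracefulShutdownTimeout,
	})
	if err != nil {
		panic(err)
	}

	if err := mgr.Start(ctrl.SetupSignalHandler()); err != nil {
		panic(err)
	}
}
```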