Robust rollingupdate and rollback #1353
Comments
The approach described above would seem to work for stateless services and/or blue-green deployments, but I'm not sure how it would work for something like a Redis master, where the pod itself has storage it cares about. Are stateful pods a use case we want to support? If so, they would seem to require some form of in-place update? |
I'd say yes, with some sort of hook system to let people write custom migration scripts to support the transition, say, to the new Redis master. |
@alex-mohr See #598 and #1515. You could also put the data into a PD. |
The thing is, we have to start somewhere, and I would say we should start with a rolling update mechanism for stateless services only. Stateful services have a lot of different needs. I would try to avoid making k8s into a workflow integration system. The question is: would it be possible to let containers handle it themselves? On shutdown, lock and flush the database; on restart, unlock the database, etc. It could be handled on a per-container basis, but that makes the containers a lot more complex. database example: This would be the easiest example to start with. I agree there are other ways to remove the complexity from the containers themselves. One would be to execute a script on each old container, via docker exec for example, which kills the container once the script has run. Then either the migration script runs once all old containers are killed (downtime...) or the new containers could already be spawned. |
@bgrant0607 looks like we'd need a pre-prestop though as |
Copying the detailed design of rolling update from #3061: Requirements, principles, assumptions, etc.:
Proposed syntax:
If the number of replicas specified in the new template were unspecified, I think it would default to 0 after parsing. Behavior in this case would be that we gradually increase it to the replica count of the original. We could also allow the user to specify a new size, which would do a rolling update for min(old,new) and then delete or add the remaining replicas.
Since kubectl can reason about the type of the json parameter, we could add support for specifying just the pod template later (like when we have config generation and v1beta3 sorted), or for other types of controllers (e.g., per-node controller).
Regarding recovery in the case of failure in the middle (i.e., resumption of the rolling update), and rollback of a partial rolling update: This is where annotations would be useful -- to store the final number of replicas desired, for instance. I think the identical syntax as above could work for recovery, if kubectl first checked whether the new replicationController already existed and, if so, merely continued. If we also supported just specifying a name for the new replication controller, that could also be used either to finish the rollingupdate or to rollback, by swapping old and new on the command line.
We should use an approach friendly to resizing, either via kubectl or via an auto-scaler. We should keep track of the number of replicas subtracted in an annotation on the original replication controller, so that the total desired is the current replica count plus the number subtracted. Unless the user specified the desired number explicitly in the new replication controller -- that can be stored in an annotation on the new replication controller.
I also eventually want to support "roll over" -- replace N replication controllers with one. I think that's straightforward if we just use the convention that the file or last name corresponds to the target replication controller, though it may be easier for users to shoot themselves in the foot.
Perhaps a specific command-line argument to identify the new/target controller would therefore be useful.
This issue should not be closed until we support recovery and rollover. |
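The replica accounting proposed in the comment above can be sketched roughly as follows. This is only an illustration: the annotation keys, the dict-based controller objects, and the `desired_total` helper are all hypothetical and do not come from kubectl or from the thread.

```python
# Hypothetical annotation keys; the thread does not name them.
SUBTRACTED_KEY = "example.com/replicas-subtracted"
DESIRED_KEY = "example.com/desired-replicas"


def desired_total(old_rc, new_rc):
    """Return the replica count the rolling update should converge on.

    Both arguments are plain dicts with a "replicas" int and an
    "annotations" dict of string values (a simplified stand-in for
    real replication controller objects).
    """
    explicit = new_rc["annotations"].get(DESIRED_KEY)
    if explicit is not None:
        # The user asked for a specific size in the new controller.
        return int(explicit)
    # Otherwise: what the old controller has now, plus what the
    # update has already subtracted from it.
    subtracted = int(old_rc["annotations"].get(SUBTRACTED_KEY, "0"))
    return old_rc["replicas"] + subtracted


old_rc = {"replicas": 3, "annotations": {SUBTRACTED_KEY: "2"}}
new_rc = {"replicas": 2, "annotations": {}}
total = desired_total(old_rc, new_rc)

# An auto-scaler resizes the old controller in the middle of the update.
old_rc["replicas"] = 4
total_after_resize = desired_total(old_rc, new_rc)
```

Because the subtracted count is stored separately from the current count, a resize of the old controller in the middle of the update shifts the target by the same amount instead of being lost.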
I don't understand why you need annotations. Isn't the rolling update essentially stateless, in the sense that you can figure out where you left off and what remains to be done just by looking at the old and new replication controllers and the pods? |
Almost, but not quite. The 2 replication controllers are not changed atomically, so the count could be off by one without keeping track. It's also the case that we allow the size to be changed by the rolling update. |
How to make rolling update friendly to auto-scaling is described here: #2863 (comment) |
See also rollingupdate-related issues in the cli roadmap. |
cc: quinton-hoole |
I think the common scenarios are:
cc @rjnagal |
Rolling update, at the end perform a migration automatically (post deployment step).
|
@smarterclayton Migration meaning traffic shifting? |
No, migration as in a schema upgrade (code version 2 rolled out; once code version 1 is gone, trigger the automatic DB schema update from schema 3->4) |
Ah, this is the post-deployment hook. Got it. |
Certainly doesn't have to be part of this, but if you think of deployment as a process of going from whatever the cluster previously had to something new, then many people may want to define the process (canaries, etc as you laid out). The process has to do a reasonable job of trying to converge, but it's acceptable to wedge and report being wedged due to unreconcilable differences (simple, like ports change, to complex, like image requires new security procedures). Since the assumption is that you're transforming between two steady states you either move forward, back, or stay stuck.
|
I agree that hooks seem useful, certainly in the case where deployments are triggered automatically. If we were to defer triggers, hooks probably could also be deferred. |
Using #4140 for rollback/abort. |
Updating from cli-roadmap.md: Deployment (#1743) will need rollover (replace multiple replication controllers with one) in order to kick off new rollouts as soon as the pod template is updated. We should think about whether we still want annotations on the underlying replication controllers and, if so, whether they need to be improved: #2863 (comment) |
I believe we have more specific issues filed for remaining work, so I'm closing this "metaphysical" issue. |
@bgrant0607 are the readiness probe and/or liveness probe taken into account by rolling update? What is the expected behavior when these probes are in good/bad conditions? |
In PR #1325, we agreed rollingupdate should be factored out of kubecfg in the kubecfg overhaul.
#492 (comment) and PR #1007 discussed alternative approaches to rollingupdate.
What I'd recommend is that we implement a new rollingupdate library and a corresponding command-line wrapper. The library may be invoked by more general libraries/tools in the future.
The rolling update approach that should be used is that of creating a new replication controller with 1 replica, resizing the new (+1) and old (-1) controllers one by one, and then deleting the old controller once it reaches 0 replicas. Unlike the current approach, this predictably updates the set of pods regardless of unexpected failures.
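As a rough illustration of the approach in the paragraph above, here is a minimal in-memory sketch. The `ReplicationController` class, the `rolling_update` function, and the dict standing in for the API server are hypothetical. The step order (shrink the old controller, then grow the new one) is an assumption, since the text only says the two are resized one by one.

```python
from dataclasses import dataclass


@dataclass
class ReplicationController:
    """Hypothetical in-memory stand-in for a replication controller."""
    name: str
    replicas: int


def rolling_update(old, new_name, controllers):
    """Replace `old` with a new controller, one replica at a time.

    `controllers` is a dict acting as a fake API server. Returns the
    (old, new) replica counts observed after each single resize.
    Assumes `old` has at least one replica.
    """
    desired = old.replicas
    new = ReplicationController(new_name, 1)  # new controller starts at 1
    controllers[new_name] = new
    history = [(old.replicas, new.replicas)]
    while old.replicas > 0:
        old.replicas -= 1                     # resize old by -1
        history.append((old.replicas, new.replicas))
        if new.replicas < desired:
            new.replicas += 1                 # resize new by +1
            history.append((old.replicas, new.replicas))
    del controllers[old.name]                 # old reached 0: delete it
    return history


controllers = {"web-v1": ReplicationController("web-v1", 3)}
history = rolling_update(controllers["web-v1"], "web-v2", controllers)
```

With this ordering, the combined replica count in the sketch never drops below the original size and never exceeds it by more than one. A real implementation would also wait for pods to come up between steps, which is omitted here.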
The two replication controllers would need at least one differentiating label, which could use the image tag of the primary container of the pod, which is typically what motivates rolling updates.
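One way to derive such a differentiating label from the image tag is sketched below. The label key `version` and both helper functions are hypothetical, and the parsing assumes plain `name:tag` image references rather than digest references.

```python
def image_tag(image):
    """Return the tag of a container image reference, or "latest" if none.

    Only the last path component is inspected, so a registry host with
    a port (host:5000/...) is not mistaken for a tag.
    """
    last_component = image.rsplit("/", 1)[-1]
    _name, sep, tag = last_component.partition(":")
    return tag if sep else "latest"


def differentiating_labels(base_labels, image, key="version"):
    """Copy `base_labels` and add a label carrying the image tag."""
    labels = dict(base_labels)
    labels[key] = image_tag(image)
    return labels


old_labels = differentiating_labels({"app": "web"}, "registry.example.com:5000/team/web:v1")
new_labels = differentiating_labels({"app": "web"}, "registry.example.com:5000/team/web:v2")
```

The two label sets share `app` but differ in `version`, so each controller can select only its own pods.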
It should be possible to apply the operation to services containing multiple release tracks, such as daily and weekly or canary and stable, by applying it to each track individually.