Use static assets for temporary control plane. #425
Conversation
rktbot run tests
hack/quickstart/init-master.sh
Outdated
# Start etcd.
if [ "$SELF_HOST_ETCD" = true ] ; then
    echo "WARNING: THIS IS NOT YET FULLY WORKING - merely here to make ongoing testing easier"
    etcd_render_flags="--etcd-servers=http://10.3.0.15:2379 --experimental-self-hosted-etcd"
    etcd_start_flags="--etcd-server=http://${COREOS_PRIVATE_IPV4}:12379 --experimental-self-hosted-etcd"
    etcd_render_flags="--etcd-servers=http://10.3.0.15:2379 --experimental-self-hosted-etcd" #TODO(diegs): maybe add --bootstrap-etcd-server=http://${COREOS_PRIVATE_IPV4}:12379
Note my TODO here: I've currently hardcoded the bootstrap etcd server to be at http://127.0.0.1:12379. If we need more flexibility we will need a flag to adjust for that.
I think hard-coding is fine for now. It should only be an issue if a deployer wanted to use 12379 for the final etcd cluster, which would be odd.
}

// Copy the static manifests to the kubelet's pod manifest path.
manifestsDir := filepath.Join(assetDir, asset.AssetPathBootstrapManifests)
I think there's a possible race here when bootstrapping etcd: if the apiserver tries to come up before etcd is ready, it will die. We could mitigate this by setting the restartPolicy on the bootstrap apiserver (and perhaps other pods too), or by waiting for etcd to be ready before bringing up the other pods (this was effectively the previous behavior).
I might start with leaving the restart policy as the default (Always). If we don't need to provide coordination points (e.g. x starts before y) it keeps bootkube somewhat simpler -- and the apiserver should just keep retrying until it can access etcd. This is a common pattern across Kubernetes.
Agreed, done.
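For context, a minimal sketch of the manifest-copy step under discussion, assuming the asset.AssetPathBootstrapManifests constant from this PR; the helper name copyBootstrapManifests and the inline file copy are illustrative, not the literal bootkube code:

```go
package bootkube

import (
	"io/ioutil"
	"os"
	"path/filepath"

	"github.com/kubernetes-incubator/bootkube/pkg/asset"
)

// copyBootstrapManifests copies the rendered static manifests into the
// kubelet's pod manifest path. The bootstrap pods keep the default
// restartPolicy (Always), so the bootstrap apiserver simply restarts
// until etcd is reachable - no explicit start ordering is needed.
func copyBootstrapManifests(assetDir, podManifestPath string) error {
	manifestsDir := filepath.Join(assetDir, asset.AssetPathBootstrapManifests)
	entries, err := ioutil.ReadDir(manifestsDir)
	if err != nil {
		return err
	}
	for _, entry := range entries {
		if entry.IsDir() {
			continue
		}
		data, err := ioutil.ReadFile(filepath.Join(manifestsDir, entry.Name()))
		if err != nil {
			return err
		}
		dst := filepath.Join(podManifestPath, entry.Name())
		if err := ioutil.WriteFile(dst, data, os.FileMode(0600)); err != nil {
			return err
		}
	}
	return nil
}
```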
rktbot run tests
These changes are needed to make the tests pass after this bootkube change: kubernetes-retired/bootkube#425
Note: tests won't pass as-is due to the flag and quickstart changes. I have a separate patch for pluton (coreos/mantle#549) which allows the tests to pass.
pkg/bootkube/bootkube.go
Outdated
}
return err

if err = CleanupBootstrapControlPlane(b.assetDir, b.podManifestPath); err != nil {
Can this be moved to the top of this function, using defer? That way the assets get cleaned up even if bootkube hits an error before reaching this point.
Excellent point, done.
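A minimal sketch of the defer-based cleanup being suggested, assuming it lives in pkg/bootkube next to CleanupBootstrapControlPlane; the wrapper name and logging are hypothetical, not the code that was merged:

```go
package bootkube

import "log"

// runWithCleanup illustrates the suggestion above: cleanup is deferred
// up front, so the temporary control plane assets (including copied
// secrets) are removed even when run() fails early.
func runWithCleanup(assetDir, podManifestPath string, run func() error) error {
	defer func() {
		if err := CleanupBootstrapControlPlane(assetDir, podManifestPath); err != nil {
			log.Printf("error cleaning up temporary bootstrap control plane: %v", err)
		}
	}()
	return run()
}
```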
Overall LGTM, just one nit about how the cleanup routine is called. Thanks for digging into the test change required here. Since tests pass with coreos/mantle#549 locally, this PR should more or less be ready to go. Working on making that PR unnecessary as well.
Local test results:
diegs@x260:bootkube [static-manifests %]% ~/go/bin/pluton run --parallel 2 --basename diegs-bootkube-dev --platform=gce --gce-image=projects/coreos-cloud/global/images/coreos-stable-1298-7-0-v20170401 --bootkubeRepo=quay.io/coreos/bootkube-dev --bootkubeTag=57489f653e7fc7b865259648e723f1dc99024c79 bootkube.*
The selfetcd.scale test has been flaky but I'll look into the logs to make sure it isn't anything troubling.
@diegs: looks like another flake due to hyperkube downloads freezing. I suspect if you look at the journal logs for that test you'll find the worker machine taking forever to download hyperkube. That test hasn't made it past running bootkube and making sure the worker node is registered, which means all 6 other tests ran the same code and got further.
Just need some minor changes - but overall this is great!
pkg/asset/internal/templates.go
Outdated
hostNetwork: true
containers:
- name: kube-apiserver
  image: quay.io/coreos/hyperkube:v1.6.0_coreos.0
bump image to v1.6.1
Done.
Note: I also added the flock here, based on other discussions, to try to reduce the port-contention churn. However, it often takes >30s for the self-hosted pods to come up, and definitely >30s to migrate etcd, and we don't tear down the temporary apiserver until all of that is done. So unless we increase the flock timeouts further, we'll still see pod restarts for the self-hosted apiserver.
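For reference, a rough Go sketch of the flock idea described above (the actual manifests wrap the apiserver command in the flock shell utility); the lock path, timeout handling, and helper name here are placeholders rather than bootkube's values:

```go
package bootkube

import (
	"fmt"
	"os"
	"syscall"
	"time"
)

// acquireLock takes an exclusive, non-blocking flock on path, retrying
// until timeout. The intent mirrors the manifests: only one apiserver
// (bootstrap or self-hosted) holds the lock at a time, so they don't
// contend for the same host port while both exist.
func acquireLock(path string, timeout time.Duration) (*os.File, error) {
	f, err := os.OpenFile(path, os.O_CREATE|os.O_RDWR, 0600)
	if err != nil {
		return nil, err
	}
	deadline := time.Now().Add(timeout)
	for {
		err := syscall.Flock(int(f.Fd()), syscall.LOCK_EX|syscall.LOCK_NB)
		if err == nil {
			return f, nil
		}
		if time.Now().After(deadline) {
			f.Close()
			return nil, fmt.Errorf("timed out waiting for lock %s: %v", path, err)
		}
		time.Sleep(time.Second)
	}
}
```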
pkg/asset/internal/templates.go
Outdated
hostNetwork: true
containers:
- name: kube-controller-manager
  image: quay.io/coreos/hyperkube:v1.6.0_coreos.0
bump image to v1.6.1
Done
pkg/asset/internal/templates.go
Outdated
hostNetwork: true
containers:
- name: kube-scheduler
  image: quay.io/coreos/hyperkube:v1.6.0_coreos.0
bump image to v1.6.1
Done
hack/quickstart/init-master.sh
Outdated
# Start etcd.
if [ "$SELF_HOST_ETCD" = true ] ; then
    echo "WARNING: THIS IS NOT YET FULLY WORKING - merely here to make ongoing testing easier"
    etcd_render_flags="--etcd-servers=http://10.3.0.15:2379 --experimental-self-hosted-etcd"
    etcd_start_flags="--etcd-server=http://${COREOS_PRIVATE_IPV4}:12379 --experimental-self-hosted-etcd"
    etcd_render_flags="--etcd-servers=http://10.3.0.15:2379 --experimental-self-hosted-etcd" #TODO(diegs): maybe add --bootstrap-etcd-server=http://${COREOS_PRIVATE_IPV4}:12379
I think we might be able to also drop the --etcd-servers flag altogether. 10.3.0.15 is the etcd service IP - which is already being inferred elsewhere from the etcd service asset (detectEtcdIP()).
Sorry this is for the render side (not start) - so nothing to detect yet. But I do think we should probably just pick the service IP for the user if they're using self-hosted etcd. It has to be within the service-cidr, and we already assume certain IPs for other services (e.g. DNS gets the .10).
That change can be a follow-up/optimization though (maybe just add a TODO if not addressed here).
The render code actually already does this: https://github.com/kubernetes-incubator/bootkube/blob/master/cmd/bootkube/render.go#L163
I'll just clean up the unnecessary uses in the scripts.
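As a rough illustration of the kind of inference being referred to (not the actual detectEtcdIP() implementation), the run phase can recover the etcd service IP from the rendered Service asset; the asset path and the line-scanning approach below are assumptions made for the sketch:

```go
package bootkube

import (
	"bufio"
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// detectEtcdServiceIP scans a rendered etcd Service manifest for its
// clusterIP. This is only a sketch of the idea discussed above; the
// real detectEtcdIP() may locate and parse the asset differently.
func detectEtcdServiceIP(assetDir string) (string, error) {
	// Hypothetical path to the rendered etcd Service manifest.
	path := filepath.Join(assetDir, "manifests", "etcd-service.yaml")
	f, err := os.Open(path)
	if err != nil {
		return "", err
	}
	defer f.Close()

	scanner := bufio.NewScanner(f)
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		if strings.HasPrefix(line, "clusterIP:") {
			return strings.TrimSpace(strings.TrimPrefix(line, "clusterIP:")), nil
		}
	}
	if err := scanner.Err(); err != nil {
		return "", err
	}
	return "", fmt.Errorf("clusterIP not found in %s", path)
}
```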
pkg/bootkube/bootstrap.go
Outdated
return err
}
}
if err := os.RemoveAll(asset.BootstrapSecretsDir); err != nil {
Agree with @pbx0 about trying to clean up regardless of success -- but also, specifically for secrets, we should always try to remove these (even if the steps above fail).
Agreed. Switched the order to remove the secrets first. Overall the chaining of all this is a little brittle. I'll give some thought to a smarter way for handling the shared secrets.
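A compact sketch of the cleanup order that came out of this thread, assuming the asset path constants from this PR; the helper name is hypothetical and error handling is simplified relative to the real cleanup code:

```go
package bootkube

import (
	"io/ioutil"
	"os"
	"path/filepath"

	"github.com/kubernetes-incubator/bootkube/pkg/asset"
)

// cleanupBootstrapAssets removes the copied secrets first, so they are
// gone even if removing the bootstrap manifests fails afterwards.
func cleanupBootstrapAssets(assetDir, podManifestPath string) error {
	if err := os.RemoveAll(asset.BootstrapSecretsDir); err != nil {
		return err
	}
	// Then remove the bootstrap manifests previously copied into the
	// kubelet's pod manifest path.
	manifestsDir := filepath.Join(assetDir, asset.AssetPathBootstrapManifests)
	manifests, err := ioutil.ReadDir(manifestsDir)
	if err != nil {
		return err
	}
	for _, m := range manifests {
		if err := os.Remove(filepath.Join(podManifestPath, m.Name())); err != nil && !os.IsNotExist(err) {
			return err
		}
	}
	return nil
}
```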
return err
}
for _, secret := range secrets {
	if err := copyFile(filepath.Join(secretsDir, secret.Name()), filepath.Join(asset.BootstrapSecretsDir, secret.Name()), true); err != nil {
Leaving the secrets as world-readable feels a bit wrong - but I'm not immediately sure what expectations we should set for how bootkube is executed, and/or whether the temp control plane should have a specific user (or be privileged). Now that bootkube doesn't have to bind on a protected port, it doesn't really need special permissions, so I think we could scope its access down more.
Maybe we just add a TODO for now to re-assess the temporary secret handling later? I don't think this really should block us right now - as it is only the first node on first boot.
Is it safe to assume that bootkube and the kubelet are always both running as root (seems to be the case on Container Linux)? If so I think we can lock this down to 0700.
@dghubble do you know what our expectations are for how we're executing bootkube?
Actually it looks like we do run it as root. If we keep that assumption we could change the permissions (0700 for the dir, 0600 for files).
Sounds good, I already verified that was true on the hack/ scripts, made the change, and ran it through the tests.
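A small sketch of what the agreed-upon permissions look like when staging the bootstrap secrets, assuming bootkube and the kubelet both run as root; the helper name and the location of the rendered TLS assets are assumptions, and the copy semantics are simplified relative to the PR's copyFile:

```go
package bootkube

import (
	"io/ioutil"
	"os"
	"path/filepath"

	"github.com/kubernetes-incubator/bootkube/pkg/asset"
)

// stageBootstrapSecrets copies rendered TLS assets into the directory
// mounted by the bootstrap pods: 0700 for the directory, 0600 for each
// file, so only root can read them.
func stageBootstrapSecrets(assetDir string) error {
	if err := os.MkdirAll(asset.BootstrapSecretsDir, 0700); err != nil {
		return err
	}
	secretsDir := filepath.Join(assetDir, "tls") // assumed location of rendered TLS assets
	secrets, err := ioutil.ReadDir(secretsDir)
	if err != nil {
		return err
	}
	for _, secret := range secrets {
		data, err := ioutil.ReadFile(filepath.Join(secretsDir, secret.Name()))
		if err != nil {
			return err
		}
		dst := filepath.Join(asset.BootstrapSecretsDir, secret.Name())
		if err := ioutil.WriteFile(dst, data, 0600); err != nil {
			return err
		}
	}
	return nil
}
```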
rktbot run tests
One last minor comment - but this looks great. Can you squash commits? Then if tests are still flaky we can walk through some manual tests to at least get this merged.
pkg/bootkube/bootstrap.go
Outdated
if err != nil {
	return err
}
return ioutil.WriteFile(dst, data, os.FileMode(0644))
If this is just for secrets, can we switch to 0600?
It's also for manifests, but I think 0600 is good there too.
I just ran some manual smoke tests (also with self-hosted etcd). All looks good.
Cool, I also ran it through manual tests (hack/multi-node) as well as invoking the modified pluton tests again.
Force-pushed from 10c641e to ce97962.
Ok, squashed and rebased. Running one more pass of tests now.
All tests passed except for the etcd scale test. I think that's unrelated.
pkg/bootkube/bootstrap.go
Outdated
fi, err := os.Stat(dst)
if fi != nil {
	return fmt.Errorf("file already exists: %v", dst)
} else if !os.IsNotExist(err) {
nit: else is not required.
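For illustration, a hedged sketch of the copy helper with both review points applied (no else after the early return, and 0600 instead of 0644); this is not the exact code that was merged, and the helper name is hypothetical:

```go
package bootkube

import (
	"fmt"
	"io/ioutil"
	"os"
)

// copyFileSketch copies src to dst, refusing to overwrite an existing
// file. The early return makes the else unnecessary, and both secrets
// and manifests are written with 0600.
func copyFileSketch(src, dst string) error {
	fi, err := os.Stat(dst)
	if fi != nil {
		return fmt.Errorf("file already exists: %v", dst)
	}
	if err != nil && !os.IsNotExist(err) {
		return err
	}
	data, err := ioutil.ReadFile(src)
	if err != nil {
		return err
	}
	return ioutil.WriteFile(dst, data, os.FileMode(0600))
}
```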
AssetPathBootstrapControllerManager = "bootstrap-manifests/bootstrap-controller-manager.yaml"
AssetPathBootstrapScheduler         = "bootstrap-manifests/bootstrap-scheduler.yaml"
AssetPathBootstrapEtcd              = "bootstrap-manifests/bootstrap-etcd.yaml"
BootstrapSecretsDir                 = "/tmp/bootkube/secrets"
Why is this in /tmp?
Static manifests can only use filesystem mounts, not shared secrets. I need to put them somewhere temporarily--is there a more general place than /tmp to do so?
@diegs My dumb question is why not just bootstrap-secrets/?
nvm, I think it doesn't matter, we will remove the secrets anyway.
LGTM. Feel free to merge when ready
This change:
- Starts the bootstrap control plane using static manifests instead of
  launching them in-process.
- Auto-detects whether self-hosted etcd is being used in the run phase,
  rather than requiring synchronized flag usage.
- Simplifies a lot of code based on the above refactorings.

After this change clients can completely customize their bootstrap
environment, including changing the version of Kubernetes that is
launched.

We might be able to un-vendor Kubernetes and pull in specific
dependencies after this change. I started to investigate but it
looks like it'd be a fairly big change to include here.

Fixes #168.
Fixes #168.