Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Supporting Kubelet Config file for bootstrapping #681

Closed
garvinp-stripe opened this issue Aug 20, 2023 · 23 comments
Closed

Supporting Kubelet Config file for bootstrapping #681

garvinp-stripe opened this issue Aug 20, 2023 · 23 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature.

Comments

@garvinp-stripe
Copy link
Contributor

garvinp-stripe commented Aug 20, 2023

Description

What problem are you trying to solve?
Currently all AMI family, minus custom, expects /etc/eks/bootstrap.sh on the AMI. Is there anyway to open it up in the future for specifying other ways for passing taint, labels and kubelet configurations down to the userdata?
Kubelet configuration, which is already used to pass kubelet configuration information around for UserData generation within Karpenter (https://github.com/aws/karpenter/blob/main/pkg/providers/amifamily/resolver.go#L77) and is supported already by eks bootstrap script.

With Karpenter already able to merge user data for most ami families, it should be able to open up what AMI users can use. One thing that may need to change is the ordering of merging UserData. I believe if Karpenter is able to generate kubelet config then I believe the customer's user data likely need to go after Karpenters for custom AMIs, https://github.com/aws/karpenter/blob/main/pkg/providers/amifamily/bootstrap/eksbootstrap.go#L45, since the customer user data may be the one launch kubelet. I also think labels specifically isn't passed through kubelet-config so that either needs to be dropped or pass through via env var which the customer's userdata then can use to launch kubelet.

How important is this feature to you?
Currently we are stuck with using custom ami family which meant all provisioner configurations are not usable. This will greatly help us get closer to using all of Karpenter's features

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment
@garvinp-stripe garvinp-stripe added the kind/feature Categorizes issue or PR as related to a new feature. label Aug 20, 2023
@bwagner5
Copy link
Contributor

The AMIFamily construct should be extensible enough to support this. Is this something you'd be open to prototyping? I'd be interested to see how this would work with your AMI.

Another idea we've thrown around is exposing some of the parameters as env vars that you could use in a custom AMI.

@ellistarn
Copy link
Contributor

As an alternative to env vars, we could explore dropping a node.yaml file (i.e. v1beta1.KubeletConfiguration, labels, taints, etc) into a well known path on the node. This would essentially be our "runtime API" and could be consumable by any ami implementation. Maybe we could use cloud inits "write files" directive?

We could modify EKS optimized ami to be the first consumer of this new API as part of bootstrap.

This approach wouldn't work for operating systems that don't support file writing (e.g. bottlerocket), but I think it's a reasonable tradeoff, and we can special case amifamilies as necessary.

@garvinp-stripe
Copy link
Contributor Author

Ya I have been trying to map how we setup launching our kubelet vs how EKS bootstrap does it. Will report back once I have more data/ information.

@bwagner5
Copy link
Contributor

@garvinp-stripe would love to collaborate on a new approach. I think the EKS Bootstrap script is pretty clunky in the API we offer (just CLI args), so coming up with something net-new is probably the right approach and then we could adapt the EKS Bootstrap script to use it or write a translation layer.

I like the node.yaml idea that Ellis mentioned above. That should give us a nice way to add kubelet configuration + any additional arguments that kubelet or containerd accepts.

@garvinp-stripe
Copy link
Contributor Author

Ya would love to collaborate on the new approach

@garvinp-stripe
Copy link
Contributor Author

Here is the kubelet configurations we use from what I can tell. Not all of them needs to be flags, some could be moved to kubelet-config file.

--logtostderr (being deprecated)
--root-dir
--config (for kubelet-config)
–container-runtime=remote
--kubeconfig (define how to connect to cluster)
--container-runtime-endpoint=
--node-labels
–hostname-override
--provider-id
--enable-server
--pod-infra-container-image (for our own ECR)

No flags for containerd, just containerd config toml.

@rothgar
Copy link

rothgar commented Aug 29, 2023

I'm not sure I like the file drop approach. It would depend on when the file is dropped (before or after user-data) and would probably need more files to be dropped (containerd config).

As pointed out it doesn't support AMIs that have read only file systems which include bottlerocket, talos, core os, and a couple others.

@garvinp-stripe
Copy link
Contributor Author

When you say before/after user data what do you mean? My thought was actually within user data where Karpenter writes a kubelet-config and other configs after which EKS bootstrap would just run with those files (maybe with arguments specifying the file path)

@jonathan-innis
Copy link
Member

It would depend on when the file is dropped

You'd probably want to drop this as soon as possible, so ideally at the beginning of the userData process so that the user has the opportunity to work with the data and then pass the data that they need into the different systems that need to start for kubelet to run successfully.

doesn't support AMIs that have read only file systems

This is an interesting concern. Is the entire file system readonly or is there a user-space that can be used to drop the file and allow manipulation before passing the final configuration to the kubelet?

@rothgar
Copy link

rothgar commented Aug 30, 2023

A dropped file still doesn't solve problems like changing kubelet flags or setting KUBELET_EXTRA_ARGS

You'd probably want to drop this as soon as possible

That depends because things like bootstrap.sh change the kubelet config which can cause problems. IIRC they usually read from a template file and write a kubelet.config.json file which would overwrite a dropped file.

Is the entire file system readonly

All the one's I've used have some writeable space but not all of them use that space for configuration or even have shell tools that would allow you to write a file.

talos has /system but files get wiped on each boot. They have patches that use an overlayfs for configuration.

Bottlerocket is tempfs writeable /etc

Other distros (eg fedora coreos) are moving to systemd sysext for extensions and configuration.

@garvinp-stripe
Copy link
Contributor Author

A dropped file still doesn't solve problems like changing kubelet flags or setting KUBELET_EXTRA_ARGS
That isn't entirely true? From what I can see in EKS bootstrap script. These are passed as env var via systemd https://github.com/awslabs/amazon-eks-ami/blob/master/files/bootstrap.sh#L628 or is that not what I think it is doing.

Does the eks bootstrap script work for hosts that don't have a writable disk? How does eks bootstrap script solve that?

This might be naive /wrong but there will always be specialize AMIs that doesn't let writes happen and needs a different kind of input. Is it possible to categories these AMIs and support them differently?

@rothgar
Copy link

rothgar commented Sep 12, 2023

That isn't entirely true? From what I can see in EKS bootstrap script. These are passed as env var via systemd https://github.com/awslabs/amazon-eks-ami/blob/master/files/bootstrap.sh#L628 or is that not what I think it is doing.

The problem is karpenter's userdata doesn't call the bootstrap script. The karpenter provisioner user data is prepended to the user data and the bootstrap script is called later which will override anything you try to add.

@garvinp-stripe
Copy link
Contributor Author

Right however is it possible to augment the bootstrap script to take in a file for KUBELET_EXTRA_ARGS or use existing parameter?

@njtran njtran transferred this issue from aws/karpenter-provider-aws Nov 2, 2023
@garvinp-stripe
Copy link
Contributor Author

I think this issue should be in karpenter-provider-aws

@garvinp-stripe
Copy link
Contributor Author

Is there any appetite to updating custom AMIfamily with additional parameter for defining how to pass through taints and labels, File and/or EnvVar, so that we can passthrough instead of drop taints and labels? This means we don't assume bootstrap script needs to exist but allow users to define their own way to consume these via a file or envar?

@njtran can you move this back to Karpenter-provider-aws?

@jonathan-innis
Copy link
Member

jonathan-innis commented Jan 26, 2024

can you move this back to Karpenter-provider-aws

Yeah, it looks like this was erroneously moved and now we are kinda stuck with it over here. There's unfortunately no way to move issues between organizations and now that we are in kubernetes-sigs, we can't move it back. I'm fine to keep this over here right now since the history of it is so long and I'd like to not lose the context.

for defining how to pass through taints and labels, File and/or EnvVar

Would a helm-style syntax work for you? That's what I'd personally be most in favor of

echo {{ .clusterDNS }}
echo Labels: 
{{- range $key, $value := .labels }}
  {{ $key }}={{ $value}},
{{- end }}

@jonathan-innis
Copy link
Member

Realistically, someone should just probably pick up where aws/karpenter-provider-aws#4504 left off since it staled out.

@garvinp-stripe
Copy link
Contributor Author

garvinp-stripe commented Jan 26, 2024

Yeah, it looks like this was erroneously moved and now we are kinda stuck with it over here.

😞 I totally forgot the org changed. Agreed we should keep it here now

Would a helm-style syntax work for you? That's what I'd personally be most in favor of

Ya that sounds fine to me.

Realistically, someone should just probably pick up where aws/karpenter-provider-aws#4504 left off since it staled out.

That definitely looks like the right place to restart. Let me re-read through that PR. I think there was some discuss we wanted to address.

@njtran
Copy link
Contributor

njtran commented Feb 7, 2024

Sorry about the erroneous move! @garvinp-stripe if you want to recreate this in AWS and reference this, if that's how you prefer to track this, feel free to.

@garvinp-stripe
Copy link
Contributor Author

No worries I am fine with keeping it here.

@jonathan-innis
Copy link
Member

@garvinp-stripe With aws/karpenter-provider-aws#5833 and aws/karpenter-provider-aws#5134, do you think that you still need this issue open?

@garvinp-stripe
Copy link
Contributor Author

I think that should work we can close this

@garvinp-stripe
Copy link
Contributor Author

Closing against the two issues @jonathan-innis mentioned

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/feature Categorizes issue or PR as related to a new feature.
Projects
None yet
Development

No branches or pull requests

6 participants