[EKS] [request]: API flag to initialize completely bare EKS cluster #923
To improve the workaround, it should be possible to use […]. For some reason, that didn't actually make it into the release notes or the docs. Future work will automate this further in Helm, so they might be waiting to document it alongside that.
@sc250024 do you have an example of your workflow for capturing CoreDNS and kube-proxy as Helm charts? We already do this for the aws-vpc-cni (we use the remote yaml referenced in the upgrade guide to delete it).
Actually, we don't do that currently; we're using the […]. In general, we automate a lot of our provisioning, and right now we have to do a lot of hacks to either apply something over an existing resource or patch an existing resource. It's really just running […].
@sc250024 it sounds like we've got very similar requirements. Currently we have automated kube-proxy and CoreDNS version patching via Terraform, and when we bootstrap a cluster we remove the installed aws-vpc-cni and replace it with the Helm chart. My highest priority would be to delete the default CoreDNS and capture that with a Helm chart.
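For context, a hedged sketch of that bootstrap step: deleting the pre-installed aws-vpc-cni using the manifest from the upgrade guide, then reinstalling it from aws/eks-charts so Helm owns it. The manifest URL and the chart/release names here are assumptions; check the current EKS upgrade guide for your cluster version before using.

```shell
# Remove the pre-installed CNI objects (manifest URL is an assumption; use the
# exact version referenced in the EKS upgrade guide for your cluster).
kubectl delete -f https://raw.githubusercontent.com/aws/amazon-vpc-cni-k8s/master/config/master/aws-k8s-cni.yaml

# Reinstall as a Helm release so future upgrades go through Helm.
helm repo add eks https://aws.github.io/eks-charts
helm repo update
helm install aws-vpc-cni eks/aws-vpc-cni --namespace kube-system
```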
I was curious, and had a play with the CoreDNS Helm chart to see how close I could get to generating the existing AWS deployment of CoreDNS. It's not far off, but it highlights a few differences:
(There's more detail, and less-impactful differences, in the comments on the YAML.)

So you could use the values.yaml attached (updating `REGION` and `DNS_CLUSTER_IP`) to reproduce the AWS deployment via the Helm chart. Of course, deleting the existing AWS-deployed objects first is up to you. Helpfully, every object in the AWS yaml is labelled, so they are straightforward to select for deletion.

A couple of the above issues (and other things called out in the text) are possibly bug reports or feature requests to be raised with CoreDNS.

Note that these are not recommended settings. They mirror the existing AWS YAML as closely as possible, including possible feature regressions, e.g., rollback to CoreDNS 1.7.0, and disabling lameduck and ttl in the service setup. On the other hand, some are important, like limiting the Deployment to 64-bit Linux hosts and EC2 (i.e. not Fargate). Unless you want CoreDNS on Fargate, of course. Then it's a regression. ^_^

A values.yaml describing the differences:

# Contrasting AWS CoreDNS 1.7.0 install from https://docs.aws.amazon.com/eks/latest/userguide/coredns.html
# curl -o dns.yaml https://s3.us-west-2.amazonaws.com/amazon-eks/cloudformation/2020-10-29/dns.yaml
# VS the current CoreDNS 1.8.0 Helm chart
# helm repo add coredns https://coredns.github.io/helm
# helm repo update
# helm template coredns coredns/coredns --namespace kube-system --values aws.coredns.values.yaml
# (This file is aws.coredns.values.yaml)
## Differences I could not capture:
# AWS's Service is named kube-dns, CoreDNS creates one named coredns
# The ClusterRole and ClusterRoleBinding in AWS's YAML are defaults named system:coredns,
# with auto-reconciliation disabled, see
# https://kubernetes.io/docs/reference/access-authn-authz/rbac/#default-roles-and-role-bindings
# and have the following extra rule; I'm not sure why.
# - apiGroups:
# - ""
# resources:
# - nodes
# verbs:
# - get
#
# This might be something that AWS have patched into their CoreDNS binary's kubernetes plugin,
# i.e. similar to the one proposed at https://github.com/coredns/coredns/issues/3077
# which was eventually punted as a different plugin and abandoned.
#
# CoreDNS Helm chart names its ClusterRole/Binding simply 'coredns' (i.e. fullNameOverride) and they are labelled as
# kubernetes.io/cluster-service: true
# instead.
# The Prometheus metrics have a separate Service in the Helm chart, but are scraped
# from the main Service in the AWS YAML
# That said, the CoreDNS chart doesn't seem to have a containerPort exposed for them. Bug in the Helm chart?
# AWS's Pod has the following that CoreDNS Helm chart doesn't support
# securityContext:
# allowPrivilegeEscalation: false
# capabilities:
# add:
# - NET_BIND_SERVICE
# drop:
# - all
# readOnlyRootFilesystem: true
# CoreDNS Helm chart has the following annotations (old name for priorityClassName and tolerations respectively)
# when isClusterService is set.
# Goodness, these are old, and someone should fix the CoreDNS chart, as they are no longer effective in current k8s.
# scheduler.alpha.kubernetes.io/critical-pod: ''
# scheduler.alpha.kubernetes.io/tolerations: '[{"key":"CriticalAddonsOnly", "operator":"Exists"}]'
# AWS Pod mounts the config-volume read-only.
# Helm chart distinguishes readiness probe from health probe. (More-modern approach)
# Helm chart specifies a maxSurge (25%) for the Deployment's rollingUpdate.
# Various minor differences:
# - Labels and annotations
# - The container port names are different
# - Generated Helm chart doesn't have namespace metadata, because Helm takes care of that.
fullnameOverride: coredns
serviceAccount:
create: true
priorityClassName: system-cluster-critical
replicaCount: 2
image:
repository: 602401143452.dkr.ecr.REGION.amazonaws.com/eks/coredns
tag: v1.7.0-eksbuild.1
podAnnotations:
eks.amazonaws.com/compute-type: ec2
service:
clusterIP: DNS_CLUSTER_IP
extraVolumes:
- name: tmp
emptyDir: {}
extraVolumeMounts:
- name: tmp
mountPath: /tmp
terminationGracePeriodSeconds: 0
resources:
limits:
cpu: null
memory: 170Mi
requests:
memory: 70Mi
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: "beta.kubernetes.io/os"
operator: In
values:
- linux
- key: "beta.kubernetes.io/arch"
operator: In
values:
- amd64
- arm64
podAntiAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- podAffinityTerm:
labelSelector:
matchExpressions:
- key: k8s-app
operator: In
values:
- coredns
topologyKey: kubernetes.io/hostname
weight: 100
tolerations:
- key: node-role.kubernetes.io/master
effect: NoSchedule
- key: "CriticalAddonsOnly"
operator: "Exists"
prometheus:
service:
enabled: true
# Because of the way Helm works, you cannot override parts of this
# array, so the whole thing is copied out of the coredns/coredns
# defaults (without comments), and the differences with AWS noted.
servers:
- zones:
- zone: .
port: 53
plugins:
- name: errors
- name: health
# AWS doesn't have this
#configBlock: |-
# lameduck 5s
# AWS doesn't use this plugin at all, but it's needed elsewhere in the chart
- name: ready
- name: kubernetes
parameters: cluster.local in-addr.arpa ip6.arpa
configBlock: |-
pods insecure
fallthrough in-addr.arpa ip6.arpa
# AWS doesn't have this
# ttl 30
- name: prometheus
# parameters: 0.0.0.0:9153
# AWS uses the below, I guess that means we're IPv6-ready? *cough*
parameters: :9153
- name: forward
parameters: . /etc/resolv.conf
- name: cache
parameters: 30
- name: loop
- name: reload
- name: loadbalance
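For reference, the servers section above renders to roughly the following Corefile, matching the one AWS ships in the coredns ConfigMap (a sketch; exact formatting depends on the chart's template):

```txt
.:53 {
    errors
    health
    ready
    kubernetes cluster.local in-addr.arpa ip6.arpa {
      pods insecure
      fallthrough in-addr.arpa ip6.arpa
    }
    prometheus :9153
    forward . /etc/resolv.conf
    cache 30
    loop
    reload
    loadbalance
}
```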
@TBBle that's a great summary of the differences. I think the next step would be to open a PR on the CoreDNS chart to close the gap and allow all of the AWS settings to be set correctly. As this issue is about providing a bare EKS cluster, the potential downtime is probably not an issue; until a bare cluster is an option, we remove the unwanted add-ons before the cluster has any nodes to run them on.
I should point out that I haven't tested this. It was done using `helm template` and comparing the output with the AWS YAML. That said, I probably would not try and replicate some of the AWS differences, like […]. One thing to keep in mind is that perhaps it's important that the Service be named `kube-dns`. So it might be worth proposing that the CoreDNS Helm chart specifically be able to override the Service name separately from the existing fullname used to name the objects. Or just install the chart as […].
@stevehipwell said what I was going to say, which is that the main point was to raise the question for AWS about whether or not they can support this feature. But to @TBBle, appreciate the help with the Helm chart values 😊 To me, it's either one of two things: […]
Right now, it's in an awkward in-between state, in my opinion. They're trying to provide the base cluster components (which makes sense), but stumble a bit with the upgrade path when the control plane is upgraded.
The AWS-managed add-ons approach shipped last month, albeit with not many add-ons yet, just aws-node. #252 (comment) That same ticket did confirm that "bare cluster" is also on the roadmap. I suspect it'll come implicitly once the remaining existing YAML add-ons are all migrated to EKS Add-ons, i.e., #1159.
Hi all, this feature is in our development plans and I've added it to our public roadmap. We envision that in time, all EKS clusters will use managed add-ons, and we will not boot components into clusters that are not managed by EKS and that you cannot control via the EKS APIs. Our 3 core add-ons (VPC CNI, CoreDNS, kube-proxy) will still be enabled by default, but you can optionally elect to have them not be installed when you create the cluster.
Much appreciated @tabern. Thank you!
@tabern where are we with this after today's announcement? |
I have a hacky workaround: |
@shixuyue what exactly are you doing to manage kube-proxy and coredns? |
@stevehipwell I don't have special needs for kube-proxy, but I need to add a Consul forwarder to CoreDNS, so it can resolve Consul endpoints from another "cluster" (it's not k8s).
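A hedged sketch of what such a forwarder stanza typically looks like in the Corefile; the `consul` domain, the upstream address, and port 8600 (Consul's default DNS port) are assumptions for illustration:

```txt
consul:53 {
    errors
    cache 30
    forward . 10.0.0.10:8600
}
```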
I see that the docs now contain a method for removing add-ons, but I don't think it is possible to do this without removing the default config. This could be useful if there were valid Helm charts for coredns and kube-proxy in aws/eks-charts (or instructions for using the official coredns Helm chart).
Oh yeah, my hacky workaround works for me. Each time we want to update the plugins, we need to re-enable the permissions that we just disabled, and once the update is done, we have to disable them again. That way the add-on manager doesn't have permission to revert the Corefile ConfigMap to its default. It's not ideal, but it's easy and simple; good as a temporary workaround.
@tabern any updates? |
@tabern if the pre-installed aws-vpc-cni could match the output of the Helm-deployed aws-vpc-cni, a lot of people would no longer be seeking this. It requires annotations and labels to be updated (so that Helm will accept ownership). While I do believe allowing customers to choose their own adventure for add-ons is a good long-term goal, this would be a nice quick win for a lot of people here. Today, using Terraform, we must either split the automation into two steps with a manual intervention in between, or get into some pretty ugly custom workflows. My group is specifically just trying to configure custom networking as a component of all new cluster builds.
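For anyone attempting the label/annotation route mentioned here: since Helm 3.2, Helm will adopt a pre-existing object if it carries the release metadata. A hedged sketch for the aws-node DaemonSet (the release name/namespace are assumptions; match your actual `helm install`, and apply the same to every object the chart renders):

```shell
# Mark the pre-installed object as Helm-managed so `helm install` adopts it
# instead of failing with an ownership error.
kubectl -n kube-system label --overwrite daemonset aws-node \
  app.kubernetes.io/managed-by=Helm
kubectl -n kube-system annotate --overwrite daemonset aws-node \
  meta.helm.sh/release-name=aws-vpc-cni \
  meta.helm.sh/release-namespace=kube-system
```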
@cdobbyn while I agree that updating labels and annotations creates a quick workaround, there are already ways to hack around this problem. IMO the topic of this thread should stay focused on the issue that the API should support a bare cluster. Making changes to make workarounds more convenient, I think, just obscures the real objective, which is making EKS unopinionated about what services are installed and how.
@mathewmoon I agree with the goal. EKS clusters should, as an advanced option, allow us to deploy them bare. I suspect they deploy them with some basics for newcomers. My comment was simply to offer a quick win in case detaching these components is more complicated than we know. Re-reading it, I recognise it appears as though I wish to alter the course of this issue (I do not).
omg it's almost a year, when are we expecting this to be released? |
Is there any update on this? |
@tabern Has there been any progress on this? |
Although I try to avoid "me too" comments: yes, I was also impacted by this today while recreating a cluster from scratch. We manage CoreDNS ourselves and install Cilium; we have not enabled any add-ons, but there they are: aws-node and friends. It's funny, because the AWS Console shows the option to install those add-ons, as if they were not already installed. So it seems the legacy and new add-on ways are clashing.
Hi everyone, it's been a few years. That was a really long nap! |
@tabern Thank you for continuing work on this. Cannot wait to see the result! |
Extended Support money has been oiling some gears in the EKS team 👀 We'll be properly bootstrapping clusters in no time boys! |
Buckle up @TarekAS ! |
You can now create Amazon EKS clusters without the default networking add-ons, including Amazon VPC CNI, CoreDNS, and kube-proxy. Please check out https://aws.amazon.com/about-aws/whats-new/2024/06/amazon-eks-cluster-creation-flexibility-networking-add-ons/ To create clusters without the default networking add-ons, use the […].
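A hedged sketch of what cluster creation without the default add-ons might look like via the AWS CLI. The flag name, role ARN, and subnet IDs below are assumptions for illustration; confirm against the linked announcement and the current `aws eks create-cluster` reference before relying on this.

```shell
# Create a bare cluster: skip installing the self-managed VPC CNI, CoreDNS,
# and kube-proxy add-ons (flag name assumed -- verify in current AWS CLI docs).
aws eks create-cluster \
  --name bare-cluster \
  --role-arn arn:aws:iam::111122223333:role/eks-cluster-role \
  --resources-vpc-config subnetIds=subnet-aaa111,subnet-bbb222 \
  --no-bootstrap-self-managed-addons
```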
Community Note
Tell us about your request
Essentially, I'm looking for an extra option in the AWS API where EKS is initialized with a completely bare cluster (i.e. no `coredns`, `aws-node`, or `kube-proxy` deployments / daemonsets). Only the EKS control plane is provided.

Which service(s) is this request for?
EKS
Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?
Kubernetes lifecycle management is a problem which many tools are solving / attempting to solve. With Kubernetes objects, there's no easy way to "inherit" an object that already exists and apply changes over it. If an object exists, and you want to change it without completely deleting / reinstalling it, you either have to (AFAIK):

- use `kubectl edit` or `kubectl patch` with the in-place objects to change what you want, or
- run `kubectl apply` with the new options.

In fact, the Kubernetes documentation here talks about the various methods: https://kubernetes.io/docs/concepts/cluster-administration/manage-deployment/#in-place-updates-of-resources
With Helm charts this problem is pronounced. If I want to apply a Helm chart, and someone has already applied a Kubernetes YAML manifest manually with similar names, I will get errors with Helm because those objects already exist.
For my company, we want to provision / de-provision EKS clusters with as much automation as possible, but what we find is that there are certain manual steps which must be performed with EKS. To name a few:

- CoreDNS
- AWS VPC CNI
- Kube-proxy: to expose metrics from the `kube-proxy` process, we have to update the listen address in its `ConfigMap` like so: […]
- AWS Auth ConfigMap: the `ConfigMap` object already exists, so we have to take special care to update it ourselves.

All of these (and similar) problems would be solved by simply having a flag to initialize a cluster which is completely empty, and letting whatever tools we use internally build up the cluster as we see fit. This is more of a feature for power / advanced users, but the use case definitely exists.
Are you currently working around this issue?
We are, but we are either performing these actions manually, or as part of a pipeline. For the case of CoreDNS / AWS VPC CNI / Kube-Proxy, we essentially must store a Kubernetes YAML in our Git repositories which we can point to when running `kubectl delete`.