k3s --disable-agent flag never starts kube-scheduler in newer k3s versions #5118
Comments
Hmm, in your vcluster use case you're using the default scheduler, but never have any nodes? How does that work exactly? Wouldn't that leave the scheduler without any nodes to schedule to?
@brandond thanks for the reply! Currently we use the scheduler of the underlying host cluster to decide where a pod should be scheduled, and then sync the node back into the virtual k3s cluster. To let users taint and label nodes within the virtual cluster, and to bring vcluster's scheduling behaviour closer to that of a real Kubernetes cluster, we want to enable the scheduler inside the virtual k3s cluster, let it decide which node a pod should run on, and then create the pod in the underlying host cluster already bound to that node. This works because we sync the nodes from the host cluster into the virtual one by creating the node objects there, without installing a separate kubelet or kube-proxy on them, which is why we don't need the k3s agent at all. We only need the control plane (kube-apiserver, storage, controller manager and scheduler), which is fully virtualized in vcluster and works as in a normal Kubernetes cluster. The workloads are executed on the host cluster nodes, where we create pods that map to the pods in the virtual cluster.
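As a rough illustration of the flow described in this comment, the sketch below maps a virtual pod that the in-cluster scheduler has already placed onto a host-cluster pod bound to the same node. The `pod` type, the `toHostPod` helper and the host pod naming scheme are all hypothetical and are not taken from vcluster; this is only a minimal model of the idea.

```go
package main

import (
	"errors"
	"fmt"
)

// pod is a pared-down stand-in for a Kubernetes Pod; only the fields
// relevant to this sketch are modelled.
type pod struct {
	Namespace string
	Name      string
	NodeName  string // empty until a scheduler has picked a node
}

var errUnscheduled = errors.New("virtual pod has no node assigned yet")

// toHostPod translates a scheduled virtual pod into the pod that would be
// created in the host cluster. The host pod keeps the node chosen by the
// virtual scheduler, so the host scheduler has nothing left to decide.
// The "<name>-x-<namespace>" naming scheme is an assumption for illustration.
func toHostPod(virtual pod, hostNamespace string) (pod, error) {
	if virtual.NodeName == "" {
		return pod{}, errUnscheduled
	}
	return pod{
		Namespace: hostNamespace,
		Name:      virtual.Name + "-x-" + virtual.Namespace,
		NodeName:  virtual.NodeName,
	}, nil
}

func main() {
	virtual := pod{Namespace: "default", Name: "web", NodeName: "host-node-b"}
	host, err := toHostPod(virtual, "vcluster-demo")
	if err != nil {
		fmt.Println("cannot sync yet:", err)
		return
	}
	fmt.Printf("create %s/%s bound to node %s\n", host.Namespace, host.Name, host.NodeName)
}
```

A pod that the virtual scheduler has not placed yet is rejected rather than synced, which mirrors the requirement that the scheduler inside the virtual cluster actually runs.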
In that case, I'm not sure we need to change anything - kube-scheduler (in its current state) should start up as soon as an untainted node is sync'd into the virtual K3s cluster.
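To make the "untainted node" condition concrete, here is a minimal sketch of such a check over a list of synced nodes. It is not the k3s implementation: the `node` type and `firstUntaintedNode` helper are hypothetical, and treating any taint at all as blocking is an assumption of this sketch.

```go
package main

import "fmt"

// node is a pared-down stand-in for a Kubernetes Node object.
type node struct {
	Name   string
	Taints []string // taint keys only; effects are ignored in this sketch
}

// firstUntaintedNode returns the name of the first node that carries no
// taints, and false if every node is tainted or the list is empty.
func firstUntaintedNode(nodes []node) (string, bool) {
	for _, n := range nodes {
		if len(n.Taints) == 0 {
			return n.Name, true
		}
	}
	return "", false
}

func main() {
	synced := []node{
		{Name: "host-node-a", Taints: []string{"node.kubernetes.io/unschedulable"}},
		{Name: "host-node-b"},
	}
	if name, ok := firstUntaintedNode(synced); ok {
		fmt.Println("scheduler may start; untainted node:", name)
	} else {
		fmt.Println("still waiting for an untainted node")
	}
}
```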
@brandond but this condition will never be true if you use k3s/pkg/daemons/executor/embed.go Lines 113 to 116 in feb6fee
Ahh, I see. Sorry, I'd missed that part; I thought it was just waiting on a node to show up.
See if that PR fixes it for you?
@brandond thanks a lot for the quick PR! It works for me when k3s is not running inside an unprivileged Docker container, but if I run k3s inside a container I get the following errors:
Seems like the problem is this part here that is now executed to retrieve the agent node config: k3s/pkg/agent/config/config.go Lines 440 to 459 in bb856c6
I'm no expert here, but wouldn't it be much easier to use the kube-scheduler kubeconfig here instead of initializing the whole agent config? Or is the node kubeconfig required here? I thought about something like this:
I was hoping to avoid having to pass that in explicitly, since the nodeconfig already has all the various bits of information we need filled in properly, if we properly bootstrap the executor before using it.
Should be sorted now; I am able to run the server in an unprivileged container. Even rootless should work, if you give it a writable path for $HOME: docker run --rm -it --user 1000:1000 -e HOME=/tmp/k3s rancher/k3s server --disable-agent --token=token --rootless
@brandond just verified it and it works perfectly now, thanks so much for the quick fix!
Validated on v1.23.5-rc1+k3s1
Joined an agent node
The metrics server fails to fetch metrics in the above setup, as reported in issue #5330
Environmental Info:
K3s Version:
Node(s) CPU architecture, OS, and Version:
Cluster Configuration:
Describe the bug:
Hello! Thanks again for the great project! This is a problem related to using
--disable-agent
with k3s.
PR #4345 changed the startup logic so that kube-scheduler is only started once the
nodeConfig
in the embedded executor is set. With the agent disabled that never happens, because the agent is never started, so kube-scheduler never starts.
You can see the problematic code at:
k3s/pkg/daemons/executor/embed.go
Lines 113 to 124 in feb6fee
And the agent bootstrap that is skipped at:
k3s/pkg/cli/server/server.go
Lines 451 to 455 in bb856c6
I know
--disable-agent
is an unsupported flag, but since we rely on it for correct functionality in vcluster, I hope you will consider fixing this: it worked before, and in my opinion the fix would be a minimal, non-invasive change to k3s. If you decide to go forward with this change, I'm willing to submit a PR that fixes it.
Steps To Reproduce:
Specify the
--disable-agent
flag and notice that kube-scheduler never starts.
Expected behavior:
kube-scheduler starts up when
--disable-agent
is set to true.
Actual behavior:
kube-scheduler never starts up.