Apiserver nodes #10722

olemarkus · 2021-02-04T08:06:25Z

This PR introduces an additional node role that only runs kube-apiserver.
The control plane nodes still run kube-apiserver to bootstrap the cluster and get dns-controller running. The additional API servers then join the cluster at the same time as worker nodes.

In order to distinguish API vs etcd clusters, new etcd cluster-specific DNS entries are created and used by apiserver nodes.

The current implementation grants apiserver nodes access to S3 so they can self-provision certs. Technically kube-controller could provision these certs, but that would require changes to kube-controller so that it only provisions kube-apiserver certs to the apiserver role.

Apiservers also reuse the master SG, but should really have a tighter dedicated SG at some point.

A future improvement could be the ability to attach ASG scaling policies to these ASGs so that they scale with load.

k8s-ci-robot · 2021-02-04T08:06:26Z

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

olemarkus · 2021-02-04T08:28:50Z

/cc @rifelpet

olemarkus · 2021-02-04T08:30:51Z

/milestone v1.21

seh · 2021-02-04T15:59:28Z

> A future improvement could be the ability to attach ASG scaling policies to these ASGs so that they scale with load.

I've done this in the past based on CloudWatch metrics for CPU utilization. It was hard to get it right, given the many-minute delay in new machines starting up and being ready to serve. I found that often the CPU usage among the three or more API servers was inconsistent; one of them may be much more busy than the rest. I had been using the average utilization across the machines in the ASGs.

Also, though I may have been doing something incorrectly, I recall that setting the CPU usage threshold correctly in CloudWatch required me to know how many CPUs were available on these machines. It wasn't possible to express something like "60% of the available CPUs" so that the rules would work irrespective of the machine type.

If I recall correctly, the policy worked something like this:

If the average CPU utilization was between 25% and 40%, increase the ASG size by 33%.
If the average CPU utilization was 40% or higher, increase the ASG size by 50%.
If the average CPU utilization was 10% or lower for 20 minutes, decrease the ASG size by one.
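The recalled policy can be sketched as a small pure function. This is only an illustration of the three rules as stated above, not the CloudWatch or ASG configuration that was actually used: the function name `desiredSize`, the `lowFor20Min` parameter, rounding percentage increases up to a whole instance, and never shrinking below one instance are all assumptions of this sketch.

```go
package main

import "fmt"

// percentIncrease returns pct percent of n, rounded up to a whole instance.
// Rounding up is an assumption of this sketch, not documented ASG behavior.
func percentIncrease(n, pct int) int {
	return (n*pct + 99) / 100
}

// desiredSize applies the recalled step rules to the current group size.
// avgCPU is the average CPU utilization in percent across the group;
// lowFor20Min reports whether it has stayed at or below 10% for 20 minutes.
func desiredSize(current int, avgCPU float64, lowFor20Min bool) int {
	switch {
	case avgCPU >= 40:
		// 40% or higher: grow the group by 50%.
		return current + percentIncrease(current, 50)
	case avgCPU >= 25:
		// Between 25% and 40%: grow the group by 33%.
		return current + percentIncrease(current, 33)
	case avgCPU <= 10 && lowFor20Min && current > 1:
		// Sustained low load: shrink by one instance (floor of one assumed).
		return current - 1
	default:
		return current
	}
}

func main() {
	fmt.Println(desiredSize(3, 30, false))
	fmt.Println(desiredSize(3, 45, false))
	fmt.Println(desiredSize(3, 8, true))
}
```

Because the input is a single group-wide average, a sketch like this inherits the problem described above: one very busy API server can hide behind two idle ones.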

olemarkus · 2021-02-04T18:18:21Z

> > A future improvement could be the ability to attach ASG scaling policies to these ASGs so that they scale with load.
>
> I've done this in the past based on CloudWatch metrics for CPU utilization. It was hard to get it right, given the many-minute delay in new machines starting up and being ready to serve. I found that often the CPU usage among the three or more API servers was inconsistent; one of them may be much more busy than the rest. I had been using the average utilization across the machines in the ASGs.
>
> Also, though I may have been doing something incorrectly, I recall that setting the CPU usage threshold correctly in CloudWatch required me to know how many CPUs were available on these machines. It wasn't possible to express something like "60% of the available CPUs" so that the rules would work irrespective of the machine type.

Yeah, so where I work we never used ASG policies at all, but wrote our own custom controller that watches various sources (in our case, whether there is an upcoming game in an important league, etc.) or Prometheus.

Since we have metrics-server as an addon, we could let kops-controller just watch the metrics API (or scrape Prometheus metrics such as API) and set the desired flag.

Either way, that is for a follow-up :)

olemarkus · 2021-03-18T21:01:12Z

All changes that would affect current clusters are now behind a feature flag (as proven by passing the origin/master golden outputs).

Ready to merge from my side.

olemarkus · 2021-03-19T09:25:00Z

/retest

seh

I made a few small suggestions and asked a few questions, but this looks like it's going to work, with some room left for improvement in the future.

seh · 2021-03-19T12:39:25Z

docs/cli/kops_rolling-update_cluster.md

@@ -75,7 +75,7 @@ kops rolling-update cluster [flags]
      --force                          Force rolling update, even if no changes
  -h, --help                           help for cluster
      --instance-group strings         List of instance groups to update (defaults to all if not specified)
-      --instance-group-roles strings   If specified, only instance groups of the specified role will be updated (e.g. Master,Node,Bastion)
+      --instance-group-roles strings   If specified, only instance groups of the specified role will be updated (e.g. Master,Apiserver,Node,Bastion)


Why do we write "Apiserver" for the role name for kops rolling-update cluster, but write "APIServer" for the role name for kops create instancegroup?

seh · 2021-03-19T12:39:49Z

docs/cli/kops_create_instancegroup.md

@@ -36,7 +36,7 @@ kops create instancegroup [flags]
      --edit             If true, an editor will be opened to edit default values. (default true)
  -h, --help             help for instancegroup
  -o, --output string    Output format. One of json|yaml
-      --role string      Type of instance group to create (Node,Master,Bastion) (default "Node")
+      --role string      Type of instance group to create (Master,APIServer,Node,Bastion) (default "Node")


Why do we write "APIServer" for the role name for kops create instancegroup, but write "Apiserver" for the role name for kops rolling-update cluster?

seh · 2021-03-19T12:40:37Z

cmd/kops/rollingupdatecluster.go

@@ -189,7 +188,7 @@ func NewCmdRollingUpdateCluster(f *util.Factory, out io.Writer) *cobra.Command {
 	cmd.Flags().DurationVar(&options.PostDrainDelay, "post-drain-delay", options.PostDrainDelay, "Time to wait after draining each node")
 	cmd.Flags().BoolVarP(&options.Interactive, "interactive", "i", options.Interactive, "Prompt to continue after each instance is updated")
 	cmd.Flags().StringSliceVar(&options.InstanceGroups, "instance-group", options.InstanceGroups, "List of instance groups to update (defaults to all if not specified)")
-	cmd.Flags().StringSliceVar(&options.InstanceGroupRoles, "instance-group-roles", options.InstanceGroupRoles, "If specified, only instance groups of the specified role will be updated (e.g. Master,Node,Bastion)")
+	cmd.Flags().StringSliceVar(&options.InstanceGroupRoles, "instance-group-roles", options.InstanceGroupRoles, "If specified, only instance groups of the specified role will be updated (e.g. Master,Apiserver,Node,Bastion)")


Why do we write "Apiserver" for the role name for kops rolling-update cluster, but write "APIServer" for the role name for kops create instancegroup?

seh · 2021-03-19T12:51:52Z

pkg/nodelabels/builder_test.go

@@ -63,7 +63,8 @@ func TestBuildNodeLabels(t *testing.T) {
 			expected: map[string]string{
 				RoleLabelMaster16:       "",
 				RoleLabelControlPlane20: "",
-				RoleLabelName15:         RoleMasterLabelValue15,
+				//RoleLabelAPIServer16:    "",


Did you mean to leave this commented?

olemarkus · 2021-03-19

Yes. The test runs without the APIServerNodes feature flag. I will remove the comment when we remove the feature flag or enable it by default.

seh · 2021-03-19T13:08:51Z

cmd/kops/rollingupdatecluster.go

@@ -188,7 +194,7 @@ func NewCmdRollingUpdateCluster(f *util.Factory, out io.Writer) *cobra.Command {
 	cmd.Flags().DurationVar(&options.PostDrainDelay, "post-drain-delay", options.PostDrainDelay, "Time to wait after draining each node")
 	cmd.Flags().BoolVarP(&options.Interactive, "interactive", "i", options.Interactive, "Prompt to continue after each instance is updated")
 	cmd.Flags().StringSliceVar(&options.InstanceGroups, "instance-group", options.InstanceGroups, "List of instance groups to update (defaults to all if not specified)")
-	cmd.Flags().StringSliceVar(&options.InstanceGroupRoles, "instance-group-roles", options.InstanceGroupRoles, "If specified, only instance groups of the specified role will be updated (e.g. Master,Apiserver,Node,Bastion)")
+	cmd.Flags().StringSliceVar(&options.InstanceGroupRoles, "instance-group-roles", options.InstanceGroupRoles, "If specified, only instance groups of the specified role will be updated ("+strings.Join(allRoles, ",")+")")
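The suggested change builds the help text from a single role list, so the flag description cannot drift from the role names again. A minimal, self-contained sketch of that idea follows; the contents of `allRoles` here are assumed from the role names shown in the help text above, and `rolesHelp` is a hypothetical helper, not a function in kops.

```go
package main

import (
	"fmt"
	"strings"
)

// allRoles stands in for the list referenced in the diff. The values are
// assumed from the role names shown in the help text, not copied from kops.
var allRoles = []string{"Master", "APIServer", "Node", "Bastion"}

// rolesHelp (hypothetical helper) derives the flag description from the
// role list, so adding a role updates the help text automatically.
func rolesHelp(roles []string) string {
	return "If specified, only instance groups of the specified role will be updated (" +
		strings.Join(roles, ",") + ")"
}

func main() {
	fmt.Println(rolesHelp(allRoles))
}
```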


olemarkus · 2021-03-19T14:11:56Z

/retest

…r nodes
Ensure apiserver role can only be used on AWS (because of firewalling)
Apply api-server label to CP as well
Consolidate node not ready validation message
Guard apiserver nodes with a feature flag
Rename Apiserver role to APIServer
Add an integration test for apiserver nodes
Rename Apiserver role to APIServer
Enumerate all roles in rolling update docs
Apply suggestions from code review
Co-authored-by: Steven E. Harris

justinsb · 2021-03-20T19:10:07Z

pkg/nodelabels/builder.go

 		}
+		nodeLabels[RoleLabelName15] = RoleAPIServerLabelValue15


I guess the idea is that this is overridden in the isControlPlane case.
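The override being guessed at can be illustrated with a short sketch. It assumes the API server value is written first and the control plane value is written afterwards, which is the reviewer's reading rather than something the hunk confirms. The label key and values, and the function `buildRoleLabel`, are placeholders for illustration; the real constants live in pkg/nodelabels.

```go
package main

import "fmt"

// Placeholder key and values for illustration only; the actual constants
// (RoleLabelName15 and friends) are defined in pkg/nodelabels.
const (
	roleLabelName15           = "kubernetes.io/role"
	roleMasterLabelValue15    = "master"
	roleAPIServerLabelValue15 = "api-server"
)

// buildRoleLabel (hypothetical) sets the API server value first and lets the
// control plane case overwrite it, since a map key holds only one value.
func buildRoleLabel(isControlPlane, isAPIServer bool) map[string]string {
	labels := map[string]string{}
	if isAPIServer {
		labels[roleLabelName15] = roleAPIServerLabelValue15
	}
	if isControlPlane {
		// Written last, so the control plane value wins on nodes that are both.
		labels[roleLabelName15] = roleMasterLabelValue15
	}
	return labels
}

func main() {
	fmt.Println(buildRoleLabel(true, true)[roleLabelName15])
	fmt.Println(buildRoleLabel(false, true)[roleLabelName15])
}
```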

justinsb · 2021-03-20T20:19:09Z

This LGTM - thanks for the tweaks and the feature flag (which lets us tweak some of the last unsettled points, like apiserver vs api-server!)

justinsb · 2021-03-20T20:19:14Z

/approve
/lgtm

k8s-ci-robot · 2021-03-20T20:19:29Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: justinsb, seh

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [justinsb]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

rifelpet · 2021-03-20T22:23:52Z

/retest

We should definitely get an e2e job set up to test this.

k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Feb 4, 2021

k8s-ci-robot requested review from hakman and KashifSaadat February 4, 2021 08:06

k8s-ci-robot added area/addons area/api approved Indicates a PR has been approved by an approver from all required OWNERS files. area/documentation area/nodeup area/provider/aws Issues or PRs related to aws provider area/rolling-update labels Feb 4, 2021

olemarkus force-pushed the apiserver-nodes branch from a437632 to 3b501c2 Compare February 4, 2021 08:06

k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Feb 4, 2021

olemarkus force-pushed the apiserver-nodes branch 3 times, most recently from dd1ca83 to 67cebaf Compare February 4, 2021 08:25

k8s-ci-robot requested a review from rifelpet February 4, 2021 08:28

olemarkus marked this pull request as ready for review February 4, 2021 08:29

k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 4, 2021

k8s-ci-robot added this to the v1.21 milestone Feb 4, 2021

olemarkus force-pushed the apiserver-nodes branch 2 times, most recently from ef1275a to f6377e4 Compare February 4, 2021 11:19

k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 13, 2021

k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Mar 18, 2021

k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Mar 19, 2021

olemarkus requested review from justinsb, rifelpet and seh March 19, 2021 07:47

olemarkus force-pushed the apiserver-nodes branch from 75b2307 to 06d6540 Compare March 19, 2021 08:50

seh approved these changes Mar 19, 2021

View reviewed changes

k8s-ci-robot assigned seh Mar 19, 2021

k8s-ci-robot added lgtm "Looks good to me", indicates that a PR is ready to be merged. and removed lgtm "Looks good to me", indicates that a PR is ready to be merged. labels Mar 19, 2021

seh approved these changes Mar 19, 2021

View reviewed changes

k8s-ci-robot added lgtm "Looks good to me", indicates that a PR is ready to be merged. and removed lgtm "Looks good to me", indicates that a PR is ready to be merged. labels Mar 19, 2021

olemarkus force-pushed the apiserver-nodes branch from 4905fe3 to 3953c62 Compare March 19, 2021 13:16

olemarkus force-pushed the apiserver-nodes branch from 3953c62 to 51c42bd Compare March 20, 2021 19:56

olemarkus force-pushed the apiserver-nodes branch from 51c42bd to 20bd724 Compare March 20, 2021 19:57

justinsb reviewed Mar 20, 2021

View reviewed changes

k8s-ci-robot assigned justinsb Mar 20, 2021

k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 20, 2021

k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 20, 2021

k8s-ci-robot merged commit 15e4028 into kubernetes:master Mar 20, 2021
