Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Apiserver nodes #10722

Merged
merged 1 commit into from
Mar 20, 2021
Merged

Apiserver nodes #10722

merged 1 commit into from
Mar 20, 2021

Conversation

olemarkus
Copy link
Member

This PR introduces an additional node role that only runs kube-apiserver.
The control plane nodes still run kube-apiserver for bootstrapping the cluster and get dns-controller running. Then the additional api-servers join the cluster at the same time as worker nodes.

In order to distinguish API vs etcd clusters, new etcd cluster-specific DNS entries are created and used by apiserver nodes.

Current implementation grants apiserver nodes access to S3 to self-provision certs. Technically kube-controller could provision these certs, but that would require changes to kube-controller so it only provisions kube-apiserver certs to apiserver role.

Apiservers also reuse the master SG, but should really have a tighter dedicated SG at some point.

Future improvements could be being able to attach ASG scaling policies to it so these ASG scales with load.

@k8s-ci-robot
Copy link
Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Feb 4, 2021
@k8s-ci-robot k8s-ci-robot added area/addons area/api approved Indicates a PR has been approved by an approver from all required OWNERS files. area/documentation area/nodeup area/provider/aws Issues or PRs related to aws provider area/rolling-update labels Feb 4, 2021
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Feb 4, 2021
@olemarkus olemarkus force-pushed the apiserver-nodes branch 3 times, most recently from dd1ca83 to 67cebaf Compare February 4, 2021 08:25
@olemarkus
Copy link
Member Author

/cc @rifelpet

@k8s-ci-robot k8s-ci-robot requested a review from rifelpet February 4, 2021 08:28
@olemarkus olemarkus marked this pull request as ready for review February 4, 2021 08:29
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 4, 2021
@olemarkus
Copy link
Member Author

/milestone v1.21

@k8s-ci-robot k8s-ci-robot added this to the v1.21 milestone Feb 4, 2021
@olemarkus olemarkus force-pushed the apiserver-nodes branch 2 times, most recently from ef1275a to f6377e4 Compare February 4, 2021 11:19
@seh
Copy link
Contributor

seh commented Feb 4, 2021

Future improvements could be being able to attach ASG scaling policies to it so these ASG scales with load.

I've done this in the past based on CloudWatch metrics for CPU utilization. It was hard to get it right, given the many-minute delay in new machines starting up and being ready to serve. I found that often the CPU usage among the three or more API servers was inconsistent; one of them may be much more busy than the rest. I had been using the average utilization across the machines in the ASGs.

Also, though I may have been doing something incorrectly, I recall that setting the CPU usage threshold correctly in CloudWatch required me to know how many CPUs were available on these machines. It wasn't possible to express something like "60% of the available CPUs" so that the rules would work irrespective of the machine type.

If I recall correctly, the policy worked something like this:

  • If the average CPU utilization was between 25% and 40%, increase the ASG size by 33%.
  • If the average CPU utilization was 40% or higher, increase the ASG size by 50%.
  • If the average CPU utilization was 10% or lower for 20 minutes, decrease the ASG size by one.

@olemarkus
Copy link
Member Author

Future improvements could be being able to attach ASG scaling policies to it so these ASG scales with load.

I've done this in the past based on CloudWatch metrics for CPU utilization. It was hard to get it right, given the many-minute delay in new machines starting up and being ready to serve. I found that often the CPU usage among the three or more API servers was inconsistent; one of them may be much more busy than the rest. I had been using the average utilization across the machines in the ASGs.

Also, though I may have been doing something incorrectly, I recall that setting the CPU usage threshold correctly in CloudWatch required me to know how many CPUs were available on these machines. It wasn't possible to express something like "60% of the available CPUs" so that the rules would work irrespective of the machine type.

Yeah, so where I work we never used ASG policies at all, but wrote our own custom controller that watch various sources (in our case, is there an upcoming game in an important league etc) or prometheus.

Since we have metrics-server as an addon, we could let kops-controller just watch the metrics API (or scrape prom metrics such as API and set the desired flag.

Either way, that is for a follow-up :)

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 13, 2021
@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Mar 18, 2021
@olemarkus
Copy link
Member Author

All changes that would affect current clusters is now behind feature flag (as proven by passing the origin/master golden outputs).

Ready to merge from my side.

@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Mar 19, 2021
@olemarkus olemarkus requested review from justinsb, rifelpet and seh March 19, 2021 07:47
@olemarkus
Copy link
Member Author

/retest

Copy link
Contributor

@seh seh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made a few small suggestions and asked a few questions, but this looks like it's going to work, with some room left for improvement in the future.

@@ -75,7 +75,7 @@ kops rolling-update cluster [flags]
--force Force rolling update, even if no changes
-h, --help help for cluster
--instance-group strings List of instance groups to update (defaults to all if not specified)
--instance-group-roles strings If specified, only instance groups of the specified role will be updated (e.g. Master,Node,Bastion)
--instance-group-roles strings If specified, only instance groups of the specified role will be updated (e.g. Master,Apiserver,Node,Bastion)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we write "Apiserver" for role name for kops rolling-update cluster, but write "APIServer" for the role name for kops create instancegroup?

@@ -36,7 +36,7 @@ kops create instancegroup [flags]
--edit If true, an editor will be opened to edit default values. (default true)
-h, --help help for instancegroup
-o, --output string Output format. One of json|yaml
--role string Type of instance group to create (Node,Master,Bastion) (default "Node")
--role string Type of instance group to create (Master,APIServer,Node,Bastion) (default "Node")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we write "APIServer" for the role name for kops create instancegroup, but write "Apiserver" for role name for kops rolling-update cluster?

@@ -189,7 +188,7 @@ func NewCmdRollingUpdateCluster(f *util.Factory, out io.Writer) *cobra.Command {
cmd.Flags().DurationVar(&options.PostDrainDelay, "post-drain-delay", options.PostDrainDelay, "Time to wait after draining each node")
cmd.Flags().BoolVarP(&options.Interactive, "interactive", "i", options.Interactive, "Prompt to continue after each instance is updated")
cmd.Flags().StringSliceVar(&options.InstanceGroups, "instance-group", options.InstanceGroups, "List of instance groups to update (defaults to all if not specified)")
cmd.Flags().StringSliceVar(&options.InstanceGroupRoles, "instance-group-roles", options.InstanceGroupRoles, "If specified, only instance groups of the specified role will be updated (e.g. Master,Node,Bastion)")
cmd.Flags().StringSliceVar(&options.InstanceGroupRoles, "instance-group-roles", options.InstanceGroupRoles, "If specified, only instance groups of the specified role will be updated (e.g. Master,Apiserver,Node,Bastion)")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we write "Apiserver" for role name for kops rolling-update cluster, but write "APIServer" for the role name for kops create instancegroup?

nodeup/pkg/model/etcd_manager_tls.go Outdated Show resolved Hide resolved
nodeup/pkg/model/etcd_manager_tls.go Outdated Show resolved Hide resolved
@@ -63,7 +63,8 @@ func TestBuildNodeLabels(t *testing.T) {
expected: map[string]string{
RoleLabelMaster16: "",
RoleLabelControlPlane20: "",
RoleLabelName15: RoleMasterLabelValue15,
//RoleLabelAPIServer16: "",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you mean to leave this commented?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. The test runs without APIServerNodes feature flag. Will remove the comment when we remove/enable the ff by defult

@k8s-ci-robot k8s-ci-robot added lgtm "Looks good to me", indicates that a PR is ready to be merged. and removed lgtm "Looks good to me", indicates that a PR is ready to be merged. labels Mar 19, 2021
cmd/kops/rollingupdatecluster.go Outdated Show resolved Hide resolved
@@ -188,7 +194,7 @@ func NewCmdRollingUpdateCluster(f *util.Factory, out io.Writer) *cobra.Command {
cmd.Flags().DurationVar(&options.PostDrainDelay, "post-drain-delay", options.PostDrainDelay, "Time to wait after draining each node")
cmd.Flags().BoolVarP(&options.Interactive, "interactive", "i", options.Interactive, "Prompt to continue after each instance is updated")
cmd.Flags().StringSliceVar(&options.InstanceGroups, "instance-group", options.InstanceGroups, "List of instance groups to update (defaults to all if not specified)")
cmd.Flags().StringSliceVar(&options.InstanceGroupRoles, "instance-group-roles", options.InstanceGroupRoles, "If specified, only instance groups of the specified role will be updated (e.g. Master,Apiserver,Node,Bastion)")
cmd.Flags().StringSliceVar(&options.InstanceGroupRoles, "instance-group-roles", options.InstanceGroupRoles, "If specified, only instance groups of the specified role will be updated ("+strings.Join(allRoles, ",")+")")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clever!

@k8s-ci-robot k8s-ci-robot added lgtm "Looks good to me", indicates that a PR is ready to be merged. and removed lgtm "Looks good to me", indicates that a PR is ready to be merged. labels Mar 19, 2021
@olemarkus
Copy link
Member Author

/retest

…r nodes

Ensure apiserver role can only be used on AWS (because of firewalling)

Apply api-server label to CP as well

Consolidate node not ready validation message

Guard apiserver nodes with a feature flag

Rename Apiserver role to APIServer

Add an integration test for apiserver nodes

Rename Apiserver role to APIServer

Enumerate all roles in rolling update docs

Apply suggestions from code review

Co-authored-by: Steven E. Harris <[email protected]>
}
nodeLabels[RoleLabelName15] = RoleAPIServerLabelValue15
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the idea that this is overridden in the isControlPlane case

@justinsb
Copy link
Member

This LGTM - thanks for the tweaks and the feature flag (which lets us tweak some of the last unsettled points, like apiserver vs api-server!)

@justinsb
Copy link
Member

/approve
/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 20, 2021
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: justinsb, seh

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 20, 2021
@rifelpet
Copy link
Member

/retest

we should definitely get an e2e job setup to test this

@k8s-ci-robot k8s-ci-robot merged commit 15e4028 into kubernetes:master Mar 20, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/addons area/api area/channels area/documentation area/nodeup area/provider/aws Issues or PRs related to aws provider area/provider/openstack Issues or PRs related to openstack provider area/rolling-update cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants