
Easier to configure, more tightly integrated node pools
This is an implementation of kubernetes-retired#238 from @redbaron, especially what I described in my comment there (kubernetes-retired#238 (comment)), and an answer to the request "**3. Node pools should be more tightly integrated**" in kubernetes-retired#271 from @Sasso.
I believe this also achieves what was requested by @andrejvanderzee in kubernetes-retired#176 (comment).

After applying this change:

1. All the `kube-aws node-pools` sub-commands are dropped
2. You can now bring up a main cluster and one or more node pools at once with `kube-aws up`
3. You can now update all the sub-clusters, including the main cluster and node pool(s), by running `kube-aws update`
4. You can now destroy all the AWS resources spanning main and node pools at once with `kube-aws destroy`
5. You can configure node pools by defining a `worker.nodePools` array in `cluster.yaml`
6. `workerCount` is dropped. Please migrate to `worker.nodePools[].count`
7. The `node-pools/` directory is dropped, and with it the per-node-pool `node-pools/<node pool name>` directories containing `cluster.yaml`, `stack-template.json` and `user-data/cloud-config-worker` for each node pool.
8. A typical local file tree would now look like:
  - `cluster.yaml`
  - `stack-templates/` (generated on `kube-aws render`)
     - `root.json.tmpl`
     - `control-plane.json.tmpl`
     - `node-pool.json.tmpl`
  - `userdata/`
     - `cloud-config-worker`
     - `cloud-config-controller`
     - `cloud-config-etcd`
  - `credentials/`
      - `*.pem` (generated on `kube-aws render`)
      - `*.pem.enc` (generated on `kube-aws validate` or `kube-aws up`)
  - `exported/` (generated on `kube-aws up --export --s3-uri <s3uri>`)
     - `stacks/`
       - `control-plane/`
         - `stack.json`
         - `user-data-controller`
       - `<node pool name = stack name>/`
         - `stack.json`
         - `user-data-worker`
9. A typical object tree in S3 would now look like:
  - `<bucket and directory from s3URI>`/
    - kube-aws/
      - clusters/
        - `<cluster name>`/
          - `exported`/
            - `stacks`/
              - `control-plane/`
                - `stack.json`
                - `cloud-config-controller`
              - `<node pool name = stack name>`/
                - `stack.json`

Implementation details:

Under the hood, kube-aws utilizes CloudFormation nested stacks to delegate management of multiple stacks as a whole.
kube-aws now creates one root stack and a set of nested stacks: one main (currently named "control-plane") stack and zero or more node pool stacks.
kube-aws uploads all the assets required by all the stacks (root, main, node pools) to S3 and then calls CloudFormation to create/update/destroy the root stack.
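
For readers unfamiliar with nested stacks: the root template simply declares one `AWS::CloudFormation::Stack` resource per child stack, each pointing its `TemplateURL` at an uploaded child template, so a single CloudFormation call against the root stack drives the whole tree. Below is a minimal sketch with aws-sdk-go, assuming hypothetical bucket, key, and stack names; this is not kube-aws's actual code:

```go
package main

import (
	"fmt"
	"log"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/cloudformation"
)

func main() {
	sess := session.Must(session.NewSession(&aws.Config{Region: aws.String("ap-northeast-1")}))
	cfn := cloudformation.New(sess)

	// The root template, already uploaded to S3, declares one
	// AWS::CloudFormation::Stack resource per nested stack (control plane
	// and each node pool), each pointing at its own uploaded template.
	out, err := cfn.CreateStack(&cloudformation.CreateStackInput{
		StackName:    aws.String("mycluster"),                                                    // hypothetical root stack name
		TemplateURL:  aws.String("https://s3.amazonaws.com/mybucket/mydir/mycluster/stack.json"), // hypothetical location
		Capabilities: []*string{aws.String(cloudformation.CapabilityCapabilityNamedIam)},
	})
	if err != nil {
		log.Fatal(err)
	}
	// Creating, updating, or deleting this one root stack cascades to all of
	// its nested stacks, which is what lets a single `kube-aws up/update/destroy`
	// manage the whole cluster at once.
	fmt.Println("root stack:", aws.StringValue(out.StackId))
}
```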

An example `cluster.yaml` I've been using to test this looks like:

```yaml
clusterName: <your cluster name>
externalDNSName: <your external dns name>
hostedZoneId: <your hosted zone id>
keyName: <your key name>
kmsKeyArn: <your kms key arn>
region: ap-northeast-1
createRecordSet: true
experimental:
  waitSignal:
    enabled: true
subnets:
- name: private1
  availabilityZone: ap-northeast-1a
  instanceCIDR: "10.0.1.0/24"
  private: true
- name: private2
  availabilityZone: ap-northeast-1c
  instanceCIDR: "10.0.2.0/24"
  private: true
- name: public1
  availabilityZone: ap-northeast-1a
  instanceCIDR: "10.0.3.0/24"
- name: public2
  availabilityZone: ap-northeast-1c
  instanceCIDR: "10.0.4.0/24"
controller:
  subnets:
  - name: public1
  - name: public2
  loadBalancer:
    private: false
etcd:
  subnets:
  - name: public1
  - name: public2
worker:
  nodePools:
  - name: pool1
    subnets:
    - name: public1
  - name: pool2
    subnets: # former `worker.subnets` introduced in v0.9.4-rc.1 via kubernetes-retired#284
    - name: public2
    instanceType: "c4.large" # former `workerInstanceType` in the top-level
    count: 2 # former `workerCount` in the top-level
    rootVolumeSize: ...
    rootVolumeType: ...
    rootVolumeIOPs: ...
    autoScalingGroup:
      minSize: 0
      maxSize: 10
    waitSignal:
      enabled: true
      maxBatchSize: 2
  - name: spotFleetPublic1a
    subnets:
    - name: public1
    spotFleet:
      targetCapacity: 1
      unitRootVolumeSize: 50
      unitRootVolumeIOPs: 100
      rootVolumeType: gp2
      spotPrice: 0.06
      launchSpecifications:
      - spotPrice: 0.12
        weightedCapacity: 2
        instanceType: m4.xlarge
        rootVolumeType: io1
        rootVolumeIOPs: 200
        rootVolumeSize: 100
```
mumoshu committed Feb 16, 2017
1 parent 99ab3fe commit 7c4f08f
Showing 78 changed files with 3,999 additions and 3,464 deletions.
3 changes: 1 addition & 2 deletions .gitignore
@@ -2,8 +2,7 @@
/bin
/e2e/assets
*~
/config/templates.go
/nodepool/config/templates.go
/core/*/config/templates.go
.idea/
.envrc
coverage.txt
138 changes: 55 additions & 83 deletions Documentation/kubernetes-on-aws-node-pool.md
@@ -13,105 +13,73 @@ Node Pool allows you to bring up additional pools of worker nodes each with a se

## Deploying a Multi-AZ cluster with cluster-autoscaler support with Node Pools

Edit the `cluster.yaml` file to decrease `workerCount`, which is meant to be number of worker nodes in the "main" cluster, down to zero:
kube-aws creates a node pool in a single AZ by default.
On top of that, you can add one or more node pools in other AZs to achieve a Multi-AZ deployment.

```yaml
# `workerCount` should be set to zero explicitly
workerCount: 0
# And the below should be added before recreating the cluster
worker:
autoScalingGroup:
minSize: 0
rollingUpdateMinInstancesInService: 0

subnets:
- availabilityZone: us-west-1a
instanceCIDR: "10.0.0.0/24"
```
`kube-aws update` doesn't work when decreasing number of workers down to zero as of today.
Therefore, don't update but recreate the main cluster to catch up changes made in `cluster.yaml`:

```
$ kube-aws destroy
$ kube-aws up \
--s3-uri s3://<my-bucket>/<optional-prefix>
```

Create two node pools, each with a different subnet and an availability zone:

```
$ kube-aws node-pools init --node-pool-name first-pool-in-1a \
--availability-zone us-west-1a \
--key-name ${KUBE_AWS_KEY_NAME} \
--kms-key-arn ${KUBE_AWS_KMS_KEY_ARN}
$ kube-aws node-pools init --node-pool-name second-pool-in-1b \
--availability-zone us-west-1b \
--key-name ${KUBE_AWS_KEY_NAME} \
--kms-key-arn ${KUBE_AWS_KMS_KEY_ARN}
```

Edit the `cluster.yaml` for the first zone:

```
$ $EDITOR node-pools/first-pool-in-1a/cluster.yaml
```
Assuming you already have a subnet and a node pool in the subnet:

```yaml
workerCount: 1
subnets:
- availabilityZone: us-west-1a
instanceCIDR: "10.0.1.0/24"
- name: managedPublicSubnetIn1a
availabilityZone: us-west-1a
instanceCIDR: 10.0.0.0/24

worker:
nodePools:
- name: pool1
subnets:
- name: managedPublicSubnetIn1a
```
Edit the `cluster.yaml` for the second zone:
```
$ $EDITOR node-pools/second-pool-in-1b/cluster.yaml
```
Edit the `cluster.yaml` file to add the second node pool:

```yaml
workerCount: 1
subnets:
- availabilityZone: us-west-1b
instanceCIDR: "10.0.2.0/24"
```
- name: managedPublicSubnetIn1a
availabilityZone: us-west-1a
instanceCIDR: 10.0.0.0/24
- name: managedPublicSubnetIn1c
availabilityZone: us-west-1c
instanceCIDR: 10.0.1.0/24
Render the assets for the node pools including [cloud-init](https://github.com/coreos/coreos-cloudinit) cloud-config userdata and [AWS CloudFormation](https://aws.amazon.com/cloudformation/) template:

```
$ kube-aws node-pools render stack --node-pool-name first-pool-in-1a
$ kube-aws node-pools render stack --node-pool-name second-pool-in-1b
worker:
nodePools:
- name: pool1
subnets:
- name: managedPublicSubnetIn1a
- name: pool2
subnets:
- name: managedPublicSubnetIn1c
```

Launch the node pools:
Launch the second node pool by running `kube-aws update`:

```
$ kube-aws node-pools up --node-pool-name first-pool-in-1a \
--s3-uri s3://<my-bucket>/<optional-prefix>
$ kube-aws node-pools up --node-pool-name second-pool-in-1b \
$ kube-aws update \
--s3-uri s3://<my-bucket>/<optional-prefix>
```

Deployment of cluster-autoscaler is currently out of scope of this documentation.
Beware that you should associate only one AZ with each node pool; otherwise cluster-autoscaler may fail to reliably add nodes on demand, because all it does is increase or decrease an ASG's desired capacity, so it has no way to selectively add node(s) in a particular AZ.

Also note that deployment of cluster-autoscaler is currently out of scope of this documentation.
Please read [cluster-autoscaler's documentation](https://github.com/kubernetes/contrib/blob/master/cluster-autoscaler/cloudprovider/aws/README.md) for instructions on it.

## Customizing min/max size of the auto scaling group

If you've chosen to power your worker nodes in a node pool with an auto scaling group, you can customize `MinSize`, `MaxSize`, `MinInstancesInService` in `cluster.yaml`:
If you've chosen to power your worker nodes in a node pool with an auto scaling group, you can customize `MinSize`, `MaxSize`, `RollingUpdateMinInstancesInService` in `cluster.yaml`:

Please read [the AWS documentation](http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-as-group.html#aws-properties-as-group-prop) for more information on `MinSize`, `MaxSize`, `MinInstancesInService` for ASGs.

```
worker:
# Auto Scaling Group definition for workers. If only `workerCount` is specified, min and max will be the set to that value and `rollingUpdateMinInstancesInService` will be one less.
autoScalingGroup:
minSize: 1
maxSize: 3
rollingUpdateMinInstancesInService: 2
nodePools:
- name: pool1
autoScalingGroup:
minSize: 1
maxSize: 3
rollingUpdateMinInstancesInService: 2
```
See [the detailed comments in `cluster.yaml`](https://github.com/coreos/kube-aws/blob/master/nodepool/config/templates/cluster.yaml) for further information.
@@ -142,23 +110,27 @@ To add a node pool powered by Spot Fleet, edit node pool's `cluster.yaml`:
```yaml
worker:
spotFleet:
targetCapacity: 3
nodePools:
- name: pool1
spotFleet:
targetCapacity: 3
```

To customize your launch specifications to diversify your pool among instance types other than the defaults, edit `cluster.yaml`:

```yaml
worker:
spotFleet:
targetCapacity: 5
launchSpecifications:
- weightedCapacity: 1
instanceType: t2.medium
- weightedCapacity: 2
instanceType: m3.large
- weightedCapacity: 2
instanceType: m4.large
nodePools:
- name: pool1
spotFleet:
targetCapacity: 5
launchSpecifications:
- weightedCapacity: 1
instanceType: t2.medium
- weightedCapacity: 2
instanceType: m3.large
- weightedCapacity: 2
instanceType: m4.large
```
This configuration would normally result in Spot Fleet bringing up 3 instances to meet your target capacity:
7 changes: 4 additions & 3 deletions build
@@ -18,13 +18,14 @@ fi

echo Building kube-aws ${VERSION}

go generate ./config
go generate ./nodepool/config
go generate ./core/controlplane/config
go generate ./core/nodepool/config
go generate ./core/root/config

if [[ ! "${BUILD_GOOS:-}" == "" ]];then
export GOOS=$BUILD_GOOS
fi
if [[ ! "${BUILD_GOARCH:-}" == "" ]];then
export GOARCH=$BUILD_GOARCH
fi
go build -ldflags "-X github.com/coreos/kube-aws/cluster.VERSION=${VERSION}" -a -tags netgo -installsuffix netgo -o "$OUTPUT_PATH" ./
go build -ldflags "-X github.com/coreos/kube-aws/core/controlplane/cluster.VERSION=${VERSION}" -a -tags netgo -installsuffix netgo -o "$OUTPUT_PATH" ./
161 changes: 161 additions & 0 deletions cfnstack/assets.go
@@ -0,0 +1,161 @@
package cfnstack

import (
"fmt"
"regexp"
)

type Assets interface {
Merge(Assets) Assets
AsMap() map[assetID]Asset
FindAssetByStackAndFileName(string, string) Asset
}

type assetsImpl struct {
underlying map[assetID]Asset
}

type assetID struct {
StackName string
Filename string
}

func NewAssetID(stack string, file string) assetID {
return assetID{
StackName: stack,
Filename: file,
}
}

func (a assetsImpl) Merge(other Assets) Assets {
merged := map[assetID]Asset{}

for k, v := range a.underlying {
merged[k] = v
}
for k, v := range other.AsMap() {
merged[k] = v
}

return assetsImpl{
underlying: merged,
}
}

func (a assetsImpl) AsMap() map[assetID]Asset {
return a.underlying
}

func (a assetsImpl) findAssetByID(id assetID) Asset {
asset, ok := a.underlying[id]
if !ok {
panic(fmt.Sprintf("[bug] failed to get the asset for the id \"%s\"", id))
}
return asset
}

func (a assetsImpl) FindAssetByStackAndFileName(stack string, file string) Asset {
return a.findAssetByID(NewAssetID(stack, file))
}

type AssetsBuilder interface {
Add(filename string, content string) AssetsBuilder
Build() Assets
}

type assetsBuilderImpl struct {
locProvider AssetLocationProvider
assets map[assetID]Asset
}

func (b *assetsBuilderImpl) Add(filename string, content string) AssetsBuilder {
loc, err := b.locProvider.locationFor(filename)
if err != nil {
panic(err)
}
b.assets[loc.ID] = Asset{
AssetLocation: *loc,
Content: content,
}
return b
}

func (b *assetsBuilderImpl) Build() Assets {
return assetsImpl{
underlying: b.assets,
}
}

func NewAssetsBuilder(stackName string, s3URI string) AssetsBuilder {
return &assetsBuilderImpl{
locProvider: AssetLocationProvider{
s3URI: s3URI,
stackName: stackName,
},
assets: map[assetID]Asset{},
}
}

type Asset struct {
AssetLocation
Content string
}

type AssetLocationProvider struct {
s3URI string
stackName string
}

type AssetLocation struct {
ID assetID
Key string
Bucket string
Path string
URL string
}

func newAssetLocationProvider(stackName string, s3URI string) AssetLocationProvider {
return AssetLocationProvider{
s3URI: s3URI,
stackName: stackName,
}
}

func (p AssetLocationProvider) locationFor(filename string) (*AssetLocation, error) {
s3URI := p.s3URI

re := regexp.MustCompile("s3://(?P<bucket>[^/]+)/(?P<directory>.+[^/])/*$")
matches := re.FindStringSubmatch(s3URI)

path := fmt.Sprintf("%s/%s", p.stackName, filename)

var bucket string
var key string
if len(matches) == 3 {
bucket = matches[1]
directory := matches[2]

key = fmt.Sprintf("%s/%s", directory, path)
} else {
re := regexp.MustCompile("s3://(?P<bucket>[^/]+)/*$")
matches := re.FindStringSubmatch(s3URI)

if len(matches) == 2 {
bucket = matches[1]
key = path
} else {
return nil, fmt.Errorf("failed to parse s3 uri(=%s): The valid uri pattern for it is s3://mybucket/mydir or s3://mybucket", s3URI)
}
}

url := fmt.Sprintf("https://s3.amazonaws.com/%s/%s", bucket, key)
id := assetID{StackName: p.stackName, Filename: filename}

return &AssetLocation{
ID: id,
Key: key,
Bucket: bucket,
Path: path,
URL: url,
}, nil
}
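
To give a feel for how this new package is meant to be consumed, here is a minimal, hypothetical sketch; the import path follows this repository's layout, and the S3 URI and file contents are illustrative rather than taken from kube-aws's real callers:

```go
package main

import (
	"fmt"

	// Hypothetical import path, following the repository layout of this commit.
	"github.com/coreos/kube-aws/cfnstack"
)

func main() {
	// Collect the generated assets for one stack; each asset is keyed by
	// (stack name, file name) and assigned an S3 location up front.
	builder := cfnstack.NewAssetsBuilder("control-plane", "s3://mybucket/mydir")
	builder.Add("stack.json", `{"Resources": {}}`)          // illustrative contents
	builder.Add("cloud-config-controller", "#cloud-config") // illustrative contents
	assets := builder.Build()

	// Later, an asset can be looked up to learn the bucket/key it will be
	// uploaded to and the URL CloudFormation should reference.
	a := assets.FindAssetByStackAndFileName("control-plane", "stack.json")
	fmt.Println(a.Bucket) // "mybucket"
	fmt.Println(a.Key)    // "mydir/control-plane/stack.json"
	fmt.Println(a.URL)    // "https://s3.amazonaws.com/mybucket/mydir/control-plane/stack.json"
}
```

The builder computes each asset's bucket, key, and HTTPS URL from the `--s3-uri` value at `Add` time, before anything is uploaded, which presumably lets the root stack template reference the nested stack templates by URL while still being rendered.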