Change naming of file to match the title of the file
jonathan-innis committed Mar 9, 2024
1 parent 34269f2 commit c509f3a
Showing 9 changed files with 339 additions and 351 deletions.
@@ -42,10 +42,6 @@ However, with this comes the risk that the new AMI could break or degrade your workloads.
As the Karpenter team looks for new ways to manage AMIs, the options below offer some means of reducing these risks, based on your own security and ease-of-use requirements.
Here are the advantages and challenges of each of the options described below:

* [Option 1]({{< relref "#option-1-manage-how-amis-are-tested-and-rolled-out" >}}) (Test AMIs): The safest way, and the one we recommend, for ensuring that a new AMI doesn't break your workloads is to test it before putting it into production. This takes the most effort on your part, but most effectively models how your workloads will run in production, allowing you to catch issues ahead of time. Note that you can sometimes get different results from your test environment when you roll a new AMI into production, since issues like scale and other factors can surface problems you might not see in test. Combining this with other options that slow rollouts can help you catch problems before they impact your whole cluster.
* [Option 2]({{< relref "#option-2-lock-down-which-amis-are-selected" >}}) (Lock down AMIs): If workloads require a particular AMI, this option can make sure that it is the only AMI used by Karpenter. This can be used in combination with Option 1, where you lock down the AMI in production but allow the newest AMIs in a test cluster while you test your workloads before upgrading production (see the sketch just after this list). Keep in mind that this makes upgrades a manual process for you.
* [Option 3]({{< relref "#option-3-control-the-pace-of-node-disruptions" >}}) ([Disruption budgets]({{< relref "../concepts/disruption/" >}})): This option can be used as a way of mitigating the scope of impact if a new AMI causes problems with your workloads. With Disruption budgets you can slow the pace of upgrades to nodes with new AMIs or make sure that upgrades only happen during selected dates and times (using `schedule`). This doesn't prevent a bad AMI from being deployed, but it allows you to control when nodes are upgraded, giving you more time to respond to rollout issues.
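
For instance, Options 1 and 2 can be combined by leaving `amiSelectorTerms` unset on a test cluster while pinning a specific AMI in production. The following is only a sketch: the resource names and IAM role names are assumptions, the AMI name is copied from the examples later on this page, and required fields such as subnet and security group selector terms are omitted for brevity.

```yaml
# Test cluster: track the newest AMIs for the chosen family (no amiSelectorTerms).
apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  name: default                     # hypothetical name
spec:
  amiFamily: AL2023
  role: "KarpenterNodeRole-test"    # assumed IAM role name
  # subnetSelectorTerms / securityGroupSelectorTerms omitted for brevity
---
# Production cluster: lock nodes to an AMI that has already been tested.
apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  name: default                     # hypothetical name
spec:
  amiFamily: AL2023
  role: "KarpenterNodeRole-prod"    # assumed IAM role name
  amiSelectorTerms:
    - name: al2023-ami-2023.3.20240219.0-kernel-6.1-x86_64
  # subnetSelectorTerms / securityGroupSelectorTerms omitted for brevity
```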
@@ -60,7 +56,8 @@ Instead of just avoiding AMI upgrades, you can set up test clusters where you can …
For example, you could have:

* **Test clusters**: On lower-environment clusters, you can run the latest AMIs for your workloads in a safe environment. The `EC2NodeClass` for these clusters could set a chosen `amiFamily` but leave `amiSelectorTerms` unset. For example, the `NodePool` and `EC2NodeClass` could begin with the following:

```yaml
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
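# The remainder of this example is collapsed in the diff view above. A hypothetical
# continuation — the names below are assumptions, not values from the original file —
# would link the NodePool to an EC2NodeClass that sets an amiFamily but no
# amiSelectorTerms:
#   name: default
# spec:
#   template:
#     spec:
#       nodeClassRef:
#         name: default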
```

@@ -92,17 +89,17 @@ This prevents a new and potentially untested AMI from replacing existing nodes …
With the `amiSelectorTerms` field in an `EC2NodeClass`, you can set a specific AMI for Karpenter to use, based on AMI name or id (only one is required).
These examples show two different ways to identify the same AMI:

```yaml
amiSelectorTerms:
  - tags:
      karpenter.sh/discovery: "${CLUSTER_NAME}"
      environment: prod
  - name: al2023-ami-2023.3.20240219.0-kernel-6.1-x86_64
```
or

```yaml
amiSelectorTerms:
  - tags:
      karpenter.sh/discovery: "${CLUSTER_NAME}"
      environment: prod
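  # The rest of this example is collapsed in the diff view above. It presumably
  # identifies the same AMI by ID rather than by name; the value below is a
  # hypothetical placeholder, not taken from the original file:
  # - id: ami-0123456789abcdef0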
```

@@ -121,7 +118,7 @@ You can prevent disruption based on nodes (a percentage or number of nodes that …
You can set Disruption Budgets in a `NodePool` spec.
Here is an example:

```yaml
disruption:
  consolidationPolicy: WhenEmpty
  expireAfter: 1440h
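  # A budget using `schedule` is not shown above; this is a hedged sketch with
  # assumed values, not taken from the original file:
  budgets:
    - nodes: "10%"                  # allow at most 10% of nodes to be disrupted at once
    - nodes: "0"
      schedule: "0 9 * * mon-fri"   # block disruptions during weekday business hours
      duration: 8h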
```