-
Notifications
You must be signed in to change notification settings - Fork 979
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding initial support for bottlerocket without custom launch template #1110
Conversation
✔️ Deploy Preview for karpenter-docs-prod canceled. 🔨 Explore the source changes: a89b7be 🔍 Inspect the deploy log: https://app.netlify.com/sites/karpenter-docs-prod/deploys/6202968d1c529800075f1ba9 |
Go easy on me - new to Karpenter and still a novice at Go itself - but eager to learn + grow! :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work! Some minor code nits, but I think you nailed it in concept. I want to make sure @akestner and @suket22 are feeling good about this API change. We can always refine it since this is an alpha API, but it's good to be careful regardless.
I think we might want use defaulting logic to set eksoptimized in the API when the provisioner is created. This has minor repercussions to existing provisioners, so perhaps we can do in memory defaulting (something we've talked about it the past) to maintain backwards compatibility. Happy to chat more on this on slack.
@@ -25,6 +25,7 @@ var ( | |||
ArchitectureAmd64 = "amd64" | |||
ArchitectureArm64 = "arm64" | |||
OperatingSystemLinux = "linux" | |||
BottleRocket = "bottlerocket" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We try to keep AWS specific concepts isolated to the cloudprovider/aws package, check out the APIs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Should this be OperatingSystemBottleRocket
to match the linux var
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
moved this to cloudprovider/aws/apis/v1alpha1/register.go, renamed to OperatingSystemBottleRocket.
pkg/cloudprovider/aws/ami.go
Outdated
version, err := p.kubeServerVersion(ctx) | ||
if err != nil { | ||
return nil, fmt.Errorf("kube server version, %w", err) | ||
} | ||
// Separate instance types by unique queries | ||
amiQueries := map[string][]cloudprovider.InstanceType{} | ||
for _, instanceType := range instanceTypes { | ||
query := p.getSSMQuery(instanceType, version) | ||
query := p.getSSMQuery(ctx, constraints.AMIFamily, instanceType, version) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
consider passing the full constraints object through.
pkg/cloudprovider/aws/ami.go
Outdated
arch = "arm64" | ||
} | ||
|
||
if amiFamily == v1alpha5.BottleRocket { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thoughts on doing a switch statement here for all ami families? We may need this for flatcar, ubuntu, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ellistarn we actually talked about this today at work.
Is there a concept of "release channels" for these OS versions (bottlerocket and amazon linux)? I see AL2's is called "recommended" - does that imply there's a latest that's newer/less tested? Similarly for bottlerocket - does latest imply there's a recommended?
We were also wondering if it might be beneficial to allow users to specify their own SSM query, so they can control their own "release cadence" outside of AWS' as needed (we might want to do something like let our DEV stuff pull latest, but "pin" PROD to something like "stable" that's more battle tested.
I'll push another commit today with these changes. |
Just pushed a big commit that hopefully addresses all the feedback. Please let me know what else I can do - really enjoying working on this! :) Cheers! |
Really nice work so far @rayterrill . A couple of comments and optional suggestions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome work! Just a few comments and hopefully we can get this merged very soon!! :)
pkg/cloudprovider/aws/ami.go
Outdated
if *constraints.AMIFamily == v1alpha1.OperatingSystemBottleRocket { | ||
return fmt.Sprintf("/aws/service/bottlerocket/aws-k8s-%s/%s/latest/image_id", version, arch) | ||
} else if *constraints.AMIFamily == v1alpha1.OperatingSystemEKSOptimized { | ||
return fmt.Sprintf("/aws/service/eks/optimized-ami/%s/amazon-linux-2%s/recommended/image_id", version, amiSuffix) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could put this format string into a const since it's duplicated 3 times. Wouldn't want to accidentally typo one of those :D
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I'm following @bwagner5 - can you give me an example? Sorry mate still learning Go so maybe I'm just missing something simple - not sure how a const would work with interpolated values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No worries!
I had meant something like this:
const (
AL2AliasFormat = "/aws/service/eks/optimized-ami/%s/amazon-linux-2%s/recommended/image_id"
)
...
fmt.Sprintf(AL2AliasFormat, version, amiSuffix)
However, after looking at this again, a better idea is to use a function. The constant alias format string may be confusing since it takes two parameters. Breaking them into distinct functions will hide the complexity of building up the alias, make it easier to test separately, and there's potentially more validation we could do in these funcs in the future.
Can you do something like this?
func (p *AMIProvider) getAL2Alias(version string, instanceType cloudprovider.InstanceType) string {
amiSuffix := ""
if !instanceType.NvidiaGPUs().IsZero() || !instanceType.AWSNeurons().IsZero() {
amiSuffix = "-gpu"
} else if instanceType.Architecture() == v1alpha5.ArchitectureArm64 {
amiSuffix = fmt.Sprintf("-%s", instanceType.Architecture())
}
return fmt.Sprintf("/aws/service/eks/optimized-ami/%s/amazon-linux-2%s/recommended/image_id", version, amiSuffix)
}
func (p *AMIProvider) getBottlerocketAlias(version string, instanceType cloudprovider.InstanceType) string {
arch := "x86_64"
if instanceType.Architecture() == v1alpha5.ArchitectureArm64 {
arch = instanceType.Architecture()
}
return fmt.Sprintf("/aws/service/bottlerocket/aws-k8s-%s/%s/latest/image_id", version, arch)
}
This will encapsulate the aliasing logic (like knowing the arch determines the amiSuffix) into a function that returns the properly formatted alias.
Just pushed another commit with I believe the requested changes, minus the one I still have a question about. |
Just pushed another commit with additional requested changes. |
i think this is ready for merge if you can rebase! Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most of it looks good 👍
} | ||
|
||
func (p *AMIProvider) getSSMQuery(ctx context.Context, constraints *v1alpha1.Constraints, instanceType cloudprovider.InstanceType, version string) string { | ||
ssmQuery := p.getAL2Alias(version, instanceType) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add documentation changes as part of this PR? We should call out that you will be defaulted to the EKS-optimized AMI.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is optional - we can have this be a separate PR too. Just want to make sure we include it before we do another release :)
@@ -220,10 +232,52 @@ func sortedKeys(m map[string]string) []string { | |||
return keys | |||
} | |||
|
|||
// getUserData returns the exact same string for equivalent input, | |||
func (p *LaunchTemplateProvider) getBottleRocketUserData(ctx context.Context, constraints *v1alpha1.Constraints, additionalLabels map[string]string) (string, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you also support maxPods as that was functionality added in this PR.
For BR, it should be settings.kubernetes.max-pods
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be a follow-up change too :)
…into bottlerocket
Rebased. |
Ah looks like CI failed due to codegen. Should be fixed if you run |
Just pushed another commit with that - sorry about that. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm! Thanks for working on this! 🚀
What's the plan to get this documented? |
1. Issue, if available:
#923
2. Description of changes:
Adds initial support for amiFamily, allowing Karpenter to choose bottlerocket AMIs in additional to EKS Optimized, without a custom launch template.
3. How was this change tested?
Tested manually in POC cluster.
4. Does this change impact docs?
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.