Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding initial support for bottlerocket without custom launch template #1110

Merged
merged 15 commits into from
Feb 8, 2022

Conversation

rayterrill
Copy link
Contributor

1. Issue, if available:
#923

2. Description of changes:
Adds initial support for amiFamily, allowing Karpenter to choose bottlerocket AMIs in additional to EKS Optimized, without a custom launch template.

3. How was this change tested?
Tested manually in POC cluster.

4. Does this change impact docs?

  • Yes, PR includes docs updates
  • Yes, issue opened: link to issue
  • No

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@netlify
Copy link

netlify bot commented Jan 9, 2022

✔️ Deploy Preview for karpenter-docs-prod canceled.

🔨 Explore the source changes: a89b7be

🔍 Inspect the deploy log: https://app.netlify.com/sites/karpenter-docs-prod/deploys/6202968d1c529800075f1ba9

@rayterrill
Copy link
Contributor Author

Go easy on me - new to Karpenter and still a novice at Go itself - but eager to learn + grow! :)

Copy link
Contributor

@ellistarn ellistarn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work! Some minor code nits, but I think you nailed it in concept. I want to make sure @akestner and @suket22 are feeling good about this API change. We can always refine it since this is an alpha API, but it's good to be careful regardless.

I think we might want use defaulting logic to set eksoptimized in the API when the provisioner is created. This has minor repercussions to existing provisioners, so perhaps we can do in memory defaulting (something we've talked about it the past) to maintain backwards compatibility. Happy to chat more on this on slack.

@@ -25,6 +25,7 @@ var (
ArchitectureAmd64 = "amd64"
ArchitectureArm64 = "arm64"
OperatingSystemLinux = "linux"
BottleRocket = "bottlerocket"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We try to keep AWS specific concepts isolated to the cloudprovider/aws package, check out the APIs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Should this be OperatingSystemBottleRocket to match the linux var

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved this to cloudprovider/aws/apis/v1alpha1/register.go, renamed to OperatingSystemBottleRocket.

pkg/cloudprovider/aws/ami.go Outdated Show resolved Hide resolved
version, err := p.kubeServerVersion(ctx)
if err != nil {
return nil, fmt.Errorf("kube server version, %w", err)
}
// Separate instance types by unique queries
amiQueries := map[string][]cloudprovider.InstanceType{}
for _, instanceType := range instanceTypes {
query := p.getSSMQuery(instanceType, version)
query := p.getSSMQuery(ctx, constraints.AMIFamily, instanceType, version)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consider passing the full constraints object through.

pkg/cloudprovider/aws/ami.go Outdated Show resolved Hide resolved
arch = "arm64"
}

if amiFamily == v1alpha5.BottleRocket {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thoughts on doing a switch statement here for all ami families? We may need this for flatcar, ubuntu, etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ellistarn we actually talked about this today at work.

Is there a concept of "release channels" for these OS versions (bottlerocket and amazon linux)? I see AL2's is called "recommended" - does that imply there's a latest that's newer/less tested? Similarly for bottlerocket - does latest imply there's a recommended?

We were also wondering if it might be beneficial to allow users to specify their own SSM query, so they can control their own "release cadence" outside of AWS' as needed (we might want to do something like let our DEV stuff pull latest, but "pin" PROD to something like "stable" that's more battle tested.

pkg/cloudprovider/aws/instancetype.go Outdated Show resolved Hide resolved
pkg/cloudprovider/types.go Outdated Show resolved Hide resolved
pkg/cloudprovider/aws/launchtemplate.go Outdated Show resolved Hide resolved
pkg/cloudprovider/aws/launchtemplate.go Outdated Show resolved Hide resolved
pkg/cloudprovider/aws/launchtemplate.go Outdated Show resolved Hide resolved
@rayterrill
Copy link
Contributor Author

I'll push another commit today with these changes.

@rayterrill
Copy link
Contributor Author

Just pushed a big commit that hopefully addresses all the feedback. Please let me know what else I can do - really enjoying working on this! :) Cheers!

@ellistarn
Copy link
Contributor

Really nice work so far @rayterrill . A couple of comments and optional suggestions.

@rayterrill rayterrill requested a review from ellistarn January 20, 2022 16:52
Copy link
Contributor

@bwagner5 bwagner5 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome work! Just a few comments and hopefully we can get this merged very soon!! :)

pkg/cloudprovider/aws/ami.go Outdated Show resolved Hide resolved
if *constraints.AMIFamily == v1alpha1.OperatingSystemBottleRocket {
return fmt.Sprintf("/aws/service/bottlerocket/aws-k8s-%s/%s/latest/image_id", version, arch)
} else if *constraints.AMIFamily == v1alpha1.OperatingSystemEKSOptimized {
return fmt.Sprintf("/aws/service/eks/optimized-ami/%s/amazon-linux-2%s/recommended/image_id", version, amiSuffix)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could put this format string into a const since it's duplicated 3 times. Wouldn't want to accidentally typo one of those :D

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I'm following @bwagner5 - can you give me an example? Sorry mate still learning Go so maybe I'm just missing something simple - not sure how a const would work with interpolated values.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries!

I had meant something like this:

const (
  AL2AliasFormat = "/aws/service/eks/optimized-ami/%s/amazon-linux-2%s/recommended/image_id"
)

...

fmt.Sprintf(AL2AliasFormat, version, amiSuffix)

However, after looking at this again, a better idea is to use a function. The constant alias format string may be confusing since it takes two parameters. Breaking them into distinct functions will hide the complexity of building up the alias, make it easier to test separately, and there's potentially more validation we could do in these funcs in the future.

Can you do something like this?

func (p *AMIProvider) getAL2Alias(version string, instanceType cloudprovider.InstanceType) string {
	amiSuffix := ""
	if !instanceType.NvidiaGPUs().IsZero() || !instanceType.AWSNeurons().IsZero() {
		amiSuffix = "-gpu"
	} else if instanceType.Architecture() == v1alpha5.ArchitectureArm64 {
		amiSuffix = fmt.Sprintf("-%s", instanceType.Architecture())
	}
	return fmt.Sprintf("/aws/service/eks/optimized-ami/%s/amazon-linux-2%s/recommended/image_id", version, amiSuffix)
}

func (p *AMIProvider) getBottlerocketAlias(version string, instanceType cloudprovider.InstanceType) string {
	arch := "x86_64"
	if instanceType.Architecture() == v1alpha5.ArchitectureArm64 {
		arch = instanceType.Architecture()
	}
	return fmt.Sprintf("/aws/service/bottlerocket/aws-k8s-%s/%s/latest/image_id", version, arch)
}

This will encapsulate the aliasing logic (like knowing the arch determines the amiSuffix) into a function that returns the properly formatted alias.

pkg/cloudprovider/aws/apis/v1alpha1/register.go Outdated Show resolved Hide resolved
pkg/cloudprovider/aws/launchtemplate.go Outdated Show resolved Hide resolved
@rayterrill
Copy link
Contributor Author

Just pushed another commit with I believe the requested changes, minus the one I still have a question about.

@rayterrill
Copy link
Contributor Author

Just pushed another commit with additional requested changes.

@bwagner5
Copy link
Contributor

bwagner5 commented Feb 3, 2022

i think this is ready for merge if you can rebase! Thanks!

Copy link
Contributor

@suket22 suket22 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of it looks good 👍

}

func (p *AMIProvider) getSSMQuery(ctx context.Context, constraints *v1alpha1.Constraints, instanceType cloudprovider.InstanceType, version string) string {
ssmQuery := p.getAL2Alias(version, instanceType)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add documentation changes as part of this PR? We should call out that you will be defaulted to the EKS-optimized AMI.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is optional - we can have this be a separate PR too. Just want to make sure we include it before we do another release :)

@@ -220,10 +232,52 @@ func sortedKeys(m map[string]string) []string {
return keys
}

// getUserData returns the exact same string for equivalent input,
func (p *LaunchTemplateProvider) getBottleRocketUserData(ctx context.Context, constraints *v1alpha1.Constraints, additionalLabels map[string]string) (string, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you also support maxPods as that was functionality added in this PR.

For BR, it should be settings.kubernetes.max-pods

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be a follow-up change too :)

@rayterrill
Copy link
Contributor Author

Rebased.

@bwagner5
Copy link
Contributor

bwagner5 commented Feb 8, 2022

Ah looks like CI failed due to codegen. Should be fixed if you run make codegen and commit the aws cloudprovider's zz_generated_deepcopy.go which will include AMIFamily now.

@rayterrill
Copy link
Contributor Author

Just pushed another commit with that - sorry about that.

Copy link
Contributor

@bwagner5 bwagner5 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm! Thanks for working on this! 🚀

@bwagner5 bwagner5 merged commit cd1f88d into aws:main Feb 8, 2022
@akestner
Copy link
Contributor

What's the plan to get this documented?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants