Update target group info using TargetGroupPolicy #357

Merged: 4 commits, Aug 27, 2023

@solmonk solmonk commented Aug 25, 2023

What type of PR is this?

feature

Which issue does this PR fix:

#304 #315 (partial) #69

What does this PR do / Why do we need it:

This PR adds support for configuring target group protocols and health checks by attaching a TargetGroupPolicy CRD to a k8s Service.

This PR includes a breaking change to the Lattice target group naming scheme. TG names on VPC Lattice now include the protocol and protocol version; this is necessary to avoid conflicts and to support a few service export scenarios.

I have also increased the name size to allow longer names.

Whenever a TargetGroupPolicy is applied to an existing target group and protocol / protocolVersion is changed:

  • A new target group is created by the TG synthesizer, since the TG name is now different.
  • Once the new target group is active, it is picked up by the Rule synthesizer.
  • The TG synthesizer cleans up the old TG.

Because the steps run in this order, the switch should happen quickly and without downtime.

If only the health check (HC) is changed, none of this happens; the controller just calls the UpdateTargetGroup API.
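
A rough sketch of the decision described above, not the controller's actual code: the `tgSpec` type, the `latticeName` and `planChange` helpers, and the lowercase dash-joined name layout are all assumptions made for illustration (the layout is inferred from names like tg-https-http1 mentioned later in this thread).

```go
package main

import (
	"fmt"
	"strings"
)

// tgSpec is a hypothetical, trimmed-down target group spec used only for this sketch.
type tgSpec struct {
	Name            string
	Protocol        string
	ProtocolVersion string
	HealthCheckPath string
}

// latticeName derives the VPC Lattice target group name. The lowercase,
// dash-joined layout is an assumption, not the confirmed format.
func latticeName(s tgSpec) string {
	return strings.ToLower(s.Name + "-" + s.Protocol + "-" + s.ProtocolVersion)
}

// planChange reports which path a policy change would take:
// "replace" when the derived name changes (new TG created, old one cleaned up),
// "update" when only the health check differs (UpdateTargetGroup call),
// and "none" otherwise.
func planChange(oldSpec, newSpec tgSpec) string {
	switch {
	case latticeName(oldSpec) != latticeName(newSpec):
		return "replace"
	case oldSpec.HealthCheckPath != newSpec.HealthCheckPath:
		return "update"
	default:
		return "none"
	}
}

func main() {
	oldSpec := tgSpec{Name: "tg", Protocol: "HTTP", ProtocolVersion: "HTTP1", HealthCheckPath: "/"}
	newSpec := oldSpec
	newSpec.Protocol = "HTTPS"
	fmt.Println(latticeName(oldSpec), "->", latticeName(newSpec), planChange(oldSpec, newSpec))
}
```

Because the protocol is part of the name, a protocol change shows up as a different target group rather than as a mutation of the existing one, which is what lets the old and new groups coexist during the switch.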

WIP things:

  • unit tests
  • GRPC ServiceExport support (will revisit after GRPC PR merge)

If an issue # is not available please add repro steps and logs from aws-gateway-controller showing the issue:

Testing done on this change:
Manual tests done; unit test changes pending.

Automation added to e2e:

Will this PR introduce any new dependencies?:

Will this break upgrades or downgrades. Has updating a running cluster been tested?:

Does this PR introduce any user-facing change?:


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

coveralls commented Aug 25, 2023

Pull Request Test Coverage Report for Build 5989893375

  • 64 of 158 (40.51%) changed or added relevant lines in 4 files are covered.
  • 2 unchanged lines in 2 files lost coverage.
  • Overall coverage decreased (-0.08%) to 37.506%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
|---|---|---|---|
| pkg/latticestore/latticestore.go | 6 | 12 | 50.0% |
| pkg/deploy/lattice/target_group_manager.go | 28 | 60 | 46.67% |
| pkg/gateway/model_build_targetgroup.go | 25 | 81 | 30.86% |

| Files with Coverage Reduction | New Missed Lines | % |
|---|---|---|
| pkg/deploy/lattice/target_group_manager.go | 1 | 80.85% |
| pkg/latticestore/latticestore.go | 1 | 91.88% |

| Totals | Coverage Status |
|---|---|
| Change from base Build 5981835075 | -0.08% |
| Covered Lines | 3949 |
| Relevant Lines | 10529 |

💛 - Coveralls

Contributor

@erikfuller erikfuller left a comment

Mostly just questions from me. Biggest one is around whether or not we should be changing default target group names. Maybe there's already been discussion on one of the issues that I missed?

@@ -51,14 +51,14 @@ spec:
healthyThresholdCount:
description: The number of consecutive successful health checks
required before considering an unhealthy target healthy.
format: int32
Contributor

What's the motivation behind changing these? Just curious.

targetGroup.Spec.Config.K8SHTTPRouteName, config.VpcID)
} else {
tgName = targetGroup.Spec.Name
Contributor

Are we certain we want to change the tg name even for the short version? If I understand it right, before it was targetGroup.Spec.Name, now it's <targetGroup.Spec.Name>-<protocol>-<protocolVersion>

Contributor Author

Yes, this is intended; it's supposed to be a key change to unblock the GRPC ServiceExport use case.

Comment on lines +311 to +313
utils.Truncate(defaultName, 70),
utils.Truncate(routeName, 20),
utils.Truncate(vpcId, 21),
Contributor

How did we decide on these numbers? I get VPC id length, but not the other two. Can we add constants for these to give them some description?

Contributor Author

Will add descriptions to this, but this is mostly just for fitting within the 128b limit. I wanted to give most of the space to the name while preserving some minimum length for routes.

This should eventually be changed to some configurable format, as discussed in #315.

Comment on lines 320 to 321
matcher = func(s string) bool {
    match := strings.HasPrefix(s, targetGroup.Spec.Name)
    if targetGroup.Spec.Config.ProtocolVersion == vpclattice.TargetGroupProtocolVersionGrpc {
        return match && strings.HasSuffix(s, vpclattice.TargetGroupProtocolVersionGrpc)
    }
    return match
}
Contributor

Should we be looking for dashes as well since they're kind of our delimiter?

Can what's missing in the middle be used to specify two different services that are both imported? Will we ever match more than one target group with this?

Contributor Author

Looking at this comment now, I think we should match smarter than this. Let me change this part.
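
One possible shape for a delimiter-aware matcher, as a sketch only. The `matchesTargetGroup` helper, the case-insensitive comparison, and the example names are assumptions, not what was eventually merged.

```go
package main

import (
	"fmt"
	"strings"
)

// matchesTargetGroup reports whether candidate looks like a target group
// derived from baseName. Unlike a bare HasPrefix/HasSuffix check, the dash
// delimiter is part of the comparison, so "svc-nsx-..." does not match "svc-ns".
// An empty protocolVersion means "any version".
func matchesTargetGroup(candidate, baseName, protocolVersion string) bool {
	c := strings.ToLower(candidate)
	if !strings.HasPrefix(c, strings.ToLower(baseName)+"-") {
		return false
	}
	if protocolVersion == "" {
		return true
	}
	return strings.HasSuffix(c, "-"+strings.ToLower(protocolVersion))
}

func main() {
	fmt.Println(matchesTargetGroup("svc-ns-http-grpc", "svc-ns", "GRPC"))  // delimiter and suffix both line up
	fmt.Println(matchesTargetGroup("svc-nsx-http-grpc", "svc-ns", "GRPC")) // prefix only matches without the dash
}
```

This still cannot tell a base name apart from a longer base name that extends it with a dash (for example "svc-ns" and "svc-ns-extra"), so a uniqueness check on the result remains worthwhile.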

for _, r := range resp {
if aws.StringValue(r.Name) == targetGroup {
glog.V(6).Info("targetgroup ", targetGroup, " already exists with arn ", *r.Arn, "\n")
if matcher(*r.Name) {
Contributor

Further to the above, should we be validating that we have at most one match?
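
A small sketch of such a check, assuming the listed target group names are available as a slice of strings. `findUnique` is a hypothetical helper, not an existing function in the controller.

```go
package main

import (
	"fmt"
	"strings"
)

// findUnique returns the single name accepted by matcher.
// ok is false when nothing matches; an error is returned when more than
// one name matches, so the caller never silently picks the first hit.
func findUnique(names []string, matcher func(string) bool) (string, bool, error) {
	match := ""
	found := false
	for _, n := range names {
		if !matcher(n) {
			continue
		}
		if found {
			return "", false, fmt.Errorf("ambiguous target group match: %q and %q", match, n)
		}
		match, found = n, true
	}
	return match, found, nil
}

func main() {
	names := []string{"svc-ns-http-http1", "svc-ns-http-grpc", "other-ns-http-http1"}
	matcher := func(s string) bool { return strings.HasPrefix(s, "svc-ns-") }
	_, _, err := findUnique(names, matcher)
	fmt.Println(err)
}
```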

// We have finished rule reconciliation at this point.
// If a target group under HTTPRoute does not have any service, it is stale.
isUsed := t.isTargetGroupUsedByaHTTPRoute(ctx, tgName, httpRoute) &&
len(sdkTG.getTargetGroupOutput.ServiceArns) > 0
Contributor

I'm having a hard time understanding why lines 270-288 matter, looks like we're just using it for logging or is there some side effect happening I can't see?

Maybe don't worry about it, this super long function could really use some refactoring.

Contributor Author

This is the function that removes dangling/stale target groups.

@@ -371,14 +374,29 @@ func Test_SynthesizeSDKTargetGroups(t *testing.T) {
wantDataStoreError: nil,
wantDataStoreStatus: "",
},
{
name: "Delete SDK TargetGroup since it is dangling with no service associated",
Contributor

When would this case happen under normal circumstances? Want to make sure we're not being over eager with our deletes.

Contributor Author

This happens when changing the target group protocol: it should create tg-https-http1 and remove tg-http-http1, for example. Because of how we detected in-use target groups before, I need to change the behavior here (this also relates to the comment above).
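
To make the in-use condition from the diff concrete, here is a sketch with a hypothetical `sdkTargetGroup` struct and placeholder ARN string. It only restates the `isUsed` expression quoted above and is not the synthesizer's real data model.

```go
package main

import "fmt"

// sdkTargetGroup is a hypothetical view of what the synthesizer knows
// about one target group after rule reconciliation has finished.
type sdkTargetGroup struct {
	Name        string
	UsedByRoute bool     // the HTTPRoute still references this target group
	ServiceArns []string // Lattice services currently attached to it
}

// isStale mirrors the condition in the diff: a target group counts as in use
// only if the route references it AND at least one service is attached.
func isStale(tg sdkTargetGroup) bool {
	inUse := tg.UsedByRoute && len(tg.ServiceArns) > 0
	return !inUse
}

func main() {
	groups := []sdkTargetGroup{
		// Old group after a protocol change: rules no longer point at it.
		{Name: "tg-http-http1", UsedByRoute: true, ServiceArns: nil},
		// New group, already picked up by the rule synthesizer.
		{Name: "tg-https-http1", UsedByRoute: true, ServiceArns: []string{"placeholder-service-arn"}},
	}
	for _, tg := range groups {
		fmt.Println(tg.Name, "stale:", isStale(tg))
	}
}
```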

for _, policy := range policyList.Items {
    targetRef := policy.Spec.TargetRef
    if targetRef == nil {
        continue
Contributor

Should we log or is this expected?

Contributor Author

I think we can ignore this; my ground rule here is to only care about a TargetGroupPolicy that is relevant.
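
The "only care about relevant policies" rule can be sketched as a plain filter. The `policy` and `targetRef` types below are simplified stand-ins for the v1alpha1 types, and the matching criteria (kind, name, namespace) are assumptions about what "relevant" means.

```go
package main

import "fmt"

// targetRef and policy are simplified, hypothetical stand-ins for the CRD types.
type targetRef struct {
	Kind string
	Name string
}

type policy struct {
	Name      string
	Namespace string
	TargetRef *targetRef
}

// attachedPolicy returns the first policy that targets the given Service,
// silently skipping policies with no targetRef or with an unrelated target.
func attachedPolicy(policies []policy, svcName, svcNamespace string) *policy {
	for i := range policies {
		p := &policies[i]
		ref := p.TargetRef
		if ref == nil {
			continue // not attached to anything, so not relevant here
		}
		if ref.Kind != "Service" || ref.Name != svcName || p.Namespace != svcNamespace {
			continue
		}
		return p
	}
	return nil
}

func main() {
	policies := []policy{
		{Name: "no-target", Namespace: "default"},
		{Name: "other", Namespace: "default", TargetRef: &targetRef{Kind: "Service", Name: "other-svc"}},
		{Name: "mine", Namespace: "default", TargetRef: &targetRef{Kind: "Service", Name: "my-svc"}},
	}
	if p := attachedPolicy(policies, "my-svc", "default"); p != nil {
		fmt.Println("attached policy:", p.Name)
	}
}
```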

return latticemodel.TargetGroupSpec{}, err
}
protocol := "HTTP"
protocolVersion := vpclattice.TargetGroupProtocolVersionHttp1
Contributor

Is this from the spec?

Contributor Author

Yes, those are the default values.
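
The default-then-override pattern in this part of the diff can be sketched on its own. `policySpec` and `resolveProtocol` are hypothetical names, and the literal "HTTP1" is assumed to be the value behind vpclattice.TargetGroupProtocolVersionHttp1.

```go
package main

import "fmt"

// policySpec is a hypothetical stand-in for the relevant TargetGroupPolicy fields.
// Pointer fields distinguish "not set" from an explicit value.
type policySpec struct {
	Protocol        *string
	ProtocolVersion *string
}

// resolveProtocol starts from the defaults (HTTP / HTTP1) and lets an
// attached policy override each field independently. A nil policy keeps both defaults.
func resolveProtocol(p *policySpec) (protocol, protocolVersion string) {
	protocol, protocolVersion = "HTTP", "HTTP1"
	if p == nil {
		return protocol, protocolVersion
	}
	if p.Protocol != nil {
		protocol = *p.Protocol
	}
	if p.ProtocolVersion != nil {
		protocolVersion = *p.ProtocolVersion
	}
	return protocol, protocolVersion
}

func main() {
	https := "HTTPS"
	fmt.Println(resolveProtocol(nil))
	fmt.Println(resolveProtocol(&policySpec{Protocol: &https}))
}
```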

protocol = *tgp.Spec.Protocol
}
if tgp.Spec.ProtocolVersion != nil {
protocolVersion = *tgp.Spec.ProtocolVersion
Contributor

do we need to do any translation on these for the API, like ToUpper() or anything?

Contributor Author

I can double-check, but I don't think so. We already have validations on the YAML file.

@@ -372,6 +391,23 @@ func (t *latticeServiceModelBuildTask) buildTargetGroupSpec(ctx context.Context,

tgName := latticestore.TargetGroupName(string(httpBackendRef.Name()), namespace)

tgp, err := getAttachedTargetGroupPolicy(ctx, client, string(httpBackendRef.Name()), namespace)
Contributor

The following code block looks identical to lines 198-213.

Contributor Author

It is right now, but it will be quite different after the GRPC ServiceExport follow-up change.

pkg/model/lattice/targetgroup.go (resolved)
@@ -420,6 +458,57 @@ func (t *latticeServiceModelBuildTask) buildTargetGroupName(_ context.Context, b
}
}

func getAttachedTargetGroupPolicy(ctx context.Context, k8sClient client.Client, svcName, svcNamespace string) (*v1alpha1.TargetGroupPolicy, error) {
policyList := &v1alpha1.TargetGroupPolicyList{}
err := k8sClient.List(ctx, policyList)
Contributor

@zijun726911 zijun726911 Aug 26, 2023

Any chance of using a field selector for List() here to avoid the for loop?

@solmonk solmonk merged commit 242241f into aws:main Aug 27, 2023
5 participants