-
Notifications
You must be signed in to change notification settings - Fork 52
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support Creating Access Log Policies #430
Conversation
} | ||
|
||
func (t *accessLogSubscriptionModelBuildTask) run(ctx context.Context) error { | ||
sourceType := model.ServiceSourceType |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One general comment, since we are doing a brand new e2e code path for accessLogPolicy
, can we don't follow the old side-effect style code? I think old style is a little bit hard to trace the data flow (tho we already got used to it)
Can we try to use stateless functions as much as we can?
for example:
accessLogSubscriptionModelBuildTask
could don't have member variables stack
, accessLogPolicy
, accessLogSubscription
. Instead, use function input params and return vaule to represent them:
func (t *accessLogSubscriptionModelBuildTask) run(ctx context.Context, accessLogPolicy *anv1alpha1.AccessLogPolicy ) *model.AccessLogSubscription, error
(And I think either stack
or accessLogSubscription
just need one, don't need both)
Another example is (s *accessLogSubscriptionSynthesizer) Synthesize()
, can we removed member variable stack
of accessLogSubscriptionSynthesizer and put it in the input parameter?
func (s *accessLogSubscriptionSynthesizer) Synthesize(ctx context.Context, accessLogSubscription *model.AccessLogSubscription) error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While I agree a functional code path is ideal, I think we are at a point now where deviation from the existing style would require a much more substantial refactor of the codebase. I would rather keep things consistent and have predictable side-effects, than have multiple different controllers with wildly different styles of code
Perhaps we should create a refactoring ticket to migrate over if the existing code style is causing churn?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, make sense to do code style refactoring in a seperate PR, I can create a issue for it
}, nil | ||
} | ||
|
||
func getServiceNetworkWithName(ctx context.Context, vpcLatticeSess services.Lattice, name string) (*vpclattice.ServiceNetworkSummary, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does FindServiceNetwork() function do the same thing with this function? can you reuse it?
func (d *defaultLattice) FindServiceNetwork(ctx context.Context, name string, optionalAccountId string) (*ServiceNetworkInfo, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pointing this out, wasn't aware it was an option! Will fix next commit
} | ||
|
||
func getServiceWithName(ctx context.Context, vpcLatticeSess services.Lattice, name string) (*vpclattice.ServiceSummary, error) { | ||
serviceSummaries, err := vpcLatticeSess.ListServicesAsList(ctx, &vpclattice.ListServicesInput{}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have a FindService() function, same with that one?
func (d *defaultLattice) FindService(ctx context.Context, nameProvider LatticeServiceNameProvider) (*vpclattice.ServiceSummary, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above!
vpcLatticeSess := m.cloud.Lattice() | ||
|
||
var resourceIdentifier string | ||
if accessLogSubscription.Spec.SourceType == lattice.ServiceNetworkSourceType { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Can change to a switch case here?
switch accessLogSubscription.Spec.SourceType {
case lattice.ServiceNetworkSourceType:
.......
case lattice.ServiceSourceType:
.......
default:
return nil, services.NewInvalidError("Invalid source type")
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, done next commit!
Note that I'll be using fmt.Errorf rather than NewInvalidError. This would be a case where the controller is improperly handling something, rather than the user input being invalid.
@@ -133,16 +140,24 @@ func (r *accessLogPolicyReconciler) reconcileUpsert(ctx context.Context, alp *an | |||
alp.Spec.TargetRef.Kind, alp.Spec.TargetRef.Name) | |||
err := r.updateAccessLogPolicyStatus(ctx, alp, gwv1alpha2.PolicyReasonTargetNotFound) | |||
if err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems we have this behavior: If the targetRef does not exist, the controller will enter a reconciliation loop until the targetRef exists or the Policy is updated/deleted
One thing I am not sure is we return retryErr==nil
with policy status Accepted==false
with reason TargetNotFound
, does the controller can keep the reconciliation loop?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My bad, I worded this poorly. Corrected it to
- If the VPC Lattice resource (Service or Service Network) corresponding to the targetRef does not exist, the controller will enter a reconciliation loop until the resource exists or the Policy is updated/deleted
- If the targetRef itself does not exist, the status of the Access Log Policy is set to TargetNotFound
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great, thanks for clarifying!
For the behavior: If the targetRef itself does not exist, the status of the Access Log Policy is set to TargetNotFound
If the AccessLogPolicy k8s resource is created first, then later, corresponding targetRefObj (Gateway or HTTPRoute or GRPCRoute) is created, do we have a sort of gwEventHandler.MapToAccessLogPolicy()
(enqueueImpactedAccessLogPolicy()
) to make the controller do the (r *accessLogPolicyReconciler) reconcile()
again? Assume you will do it in a separate task [EKS][Access Logs] Handle Gateway and Route ?
(I actually think that behavior: If the targetRef k8s obj does not exist, the controller will enter a reconciliation loop until the targetRef exists or the Policy is updated/deleted
is good and can keep thing simple...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the case of reverse order creation, we will need an event handler for the other resources to deal with it. The use of TargetRefNotFound is standard practice for Policy CRDs according to Gateway API docs https://gateway-api.sigs.k8s.io/geps/gep-713/#on-policy-objects
The recon loop will handle the case of when the targetRef was created first but the controller didn't finish creating VPC Lattice resources
Do you have a validation logic to reject (fast fail) AccessLogPolicy that it's targetRef.Kind != Gateway or HTTPRoute or GRPCRoute (also need to check targetRef.Group)? Similar to that one:
(also, any chance to re-use |
Good callout, I'll add logic to mark the status of the resource as Invalid if the Kind or Group is unsupported next commit. And I haven't seen a need for the policyToTargetRefObj function as of yet |
@@ -26,14 +26,18 @@ import ( | |||
"k8s.io/apimachinery/pkg/types" | |||
"k8s.io/client-go/tools/record" | |||
ctrl "sigs.k8s.io/controller-runtime" | |||
builder2 "sigs.k8s.io/controller-runtime/pkg/builder" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be like pkg_builder considering recent naming scheme guidance
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will fix next commit!
gateway := &gwv1beta1.Gateway{} | ||
err = r.client.Get(ctx, targetRefNamespacedName, gateway) | ||
gw := &gwv1beta1.Gateway{} | ||
err = r.client.Get(ctx, targetRefNamespacedName, gw) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We assume here that error is "not found" only, is that true?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, there are other errors that could come up here. Fixed next commit
r.log.Debugw("successfully deployed model", | ||
"stack", stack.StackID().Name+":"+stack.StackID().Namespace, | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
misuse of "w" logger, should be "f"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed next commit
|
||
func IsConflictError(err error) bool { | ||
conflictErr := &ConflictError{} | ||
return errors.As(err, &conflictErr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
errors.Is
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From my understanding, this should be errors.As, since we care about the type of the error, rather than the specific contents of the error. If I simply change this function to
func IsNotFoundError(err error) bool {
nfErr := NotFoundError{}
return errors.Is(err, &nfErr)
}
then tests start failing because the contents of the error don't match.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are 2 approaches with std errors package.
- Create a sentinel error, that can be asserted by "Is" and can be used when you dont need to access details for error handling. For example oserrors: https://cs.opensource.google/go/go/+/refs/tags/go1.21.3:src/internal/oserror/errors.go
var NotFoundError = errors.New("resource not found")
func NewNotFoundError(name string) error {
return fmt.Errorf("%s: %w", name, NotFoundError)
}
// error check now as simple as
err := NewNotFoundError("lattice")
if errors.Is(err, NotFoundError) {
println(err) // will print name and error
}
- You need error details when handling error. Then you create struct and implement error interface, but then usage changes too.
type NotFoundError sturct {
name string
}
func (e *NotFoundError) Error() string {
return fmt.Sprintf("%s: resource not found", e.name)
}
err := &NotFoundError{"lattice"}
// now you handle error with passing reference
notFoundErr := &NotFoundError
if errors.As(err, notFoundErr) {
println("can use name now: " + notFoundErr.name)
}
I see that you use second approach but intention is first approach.
pkg/deploy/stack_deployer.go
Outdated
log gwlog.Logger | ||
k8sClient client.Client | ||
stack core.Stack | ||
accessLogSubscriptionManager lattice.AccessLogSubscriptionManager |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As long as there is single manager name can be simplified to manager, I dont see there will be multiple managers here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough, fixed next commit
/* | ||
* For Service Network, the name is just the Gateway's name. | ||
* For Service, the name is Route's name, followed by hyphen (-), then the Route's namespace. | ||
*/ | ||
sourceName := string(t.accessLogPolicy.Spec.TargetRef.Name) | ||
if sourceType == model.ServiceSourceType { | ||
namespace := t.accessLogPolicy.Spec.TargetRef.Namespace | ||
if namespace != nil { | ||
sourceName = fmt.Sprintf("%s-%s", sourceName, string(*namespace)) | ||
} else { | ||
sourceName = fmt.Sprintf("%s-default", sourceName) | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Few questions:
- When namespace is nil? why it's should be possible
- Naming rules for Gateway and non-Gateway seems common problem, I dont think we should handle it here. There should be single place for this logic in the code base. I have a sense we have something like this already in other places.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Namespace can be nil because it's not required as part of the targetRef spec. If not provided, it should default to the Policy's namespace, rather than "default". Fixed this next commit.
- I found utils.LatticeServiceName, and will move this logic over to the utils package as well, since it will come in handy for other Policy CRDs
"github.com/aws/aws-sdk-go/aws" | ||
"github.com/golang/mock/gomock" | ||
"github.com/stretchr/testify/assert" | ||
v1 "k8s.io/apimachinery/pkg/apis/meta/v1" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
apimachineryv1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed next commit, thank you!
"k8s.io/apimachinery/pkg/runtime" | ||
clientgoscheme "k8s.io/client-go/kubernetes/scheme" | ||
testclient "sigs.k8s.io/controller-runtime/pkg/client/fake" | ||
"sigs.k8s.io/gateway-api/apis/v1alpha2" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
gwv1alpha2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed next commit, thank you!
Status: nil, | ||
} | ||
|
||
stack.AddResource(als) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should not be here, please move stack mutation above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed next commit
Will also update the message field in the Policy's Status to include useful information during the error cases next commit |
…lidation for AccessLogPolicy targetRef, changed default namespace to Policy namespace in targetRer, general code improvements
Reran e2e tests with new changes
|
pkg/aws/services/vpclattice.go
Outdated
name string | ||
} | ||
|
||
func NewLatticeServiceNameProvider(name string) LatticeServiceNameProvider { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change name to NewDefaultLatticeServiceNameProvider
? since LatticeServiceNameProvider is a function name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed next commit
Pull Request Test Coverage Report for Build 6512529140
💛 - Coveralls |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall LGTM, looking forward to the rest of Access Log code.
Do you need to send e2etest at last in a separate PR?
Make sure to reply @mikhail-aws 's comments or make code change
httpK8sService *corev1.Service | ||
grpcK8sService *corev1.Service | ||
httpRoute *gwv1beta1.HTTPRoute | ||
grpcRoute *gwv1alpha2.GRPCRoute |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new e2e test suite consists of 8 tests and adds about 10% to the total e2e test run time
If you just create one of httpRoute/grpcRoute and create 2 AccessLogSubscriptions on it can it save some time? Or any other way to save e2etest time? (e.g., just test CloudWatchLogGroup type accessLog for gateway)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need 1 of each source type and 1 of each destination type to confirm all the use-cases work, otherwise there is risk that our TargetRef identification or DestinationArn validation is broken without us knowing. I'll be adding Delete and Update e2e tests here in upcoming PRs, I think we will be able to squish some use-cases together into single tests to save time once that happens.
Thank you, I have one more commit on the way with the DefaultTags logic from #428 e2e tests are in this PR @mikhail-aws comments are all addressed in the above commits |
…ging each error + test cleanup
What type of PR is this?
Feature
Which issue does this PR fix:
#200 partially
What does this PR do / Why do we need it:
gateway.networking.k8s.io
, the Policy status is set to InvalidGateway
,HTTPRoute
, orGRPCRoute
, the Policy status is set to InvalidNote: There was discussion about adding annotations to Access Log Policies in this thread. I intend to add this logic in the upcoming Delete logic, as this PR is large enough as is and the Delete logic will be able to make use of the annotations.
If an issue # is not available please add repro steps and logs from aws-gateway-controller showing the issue:
Testing done on this change:
Automation added to e2e:
Will this PR introduce any new dependencies?:
E2E tests now rely on:
Will this break upgrades or downgrades. Has updating a running cluster been tested?:
No
Does this PR introduce any user-facing change?:
Yes, but this isn't ready for release yet since you can't update or delete these policies. Release note will be added as part of the final PR of this feature.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.