Readiness gate not being skipped when skipAnalysis is set to true #711

Open
manuel-sanchez opened this issue Oct 14, 2020 · 12 comments


@manuel-sanchez

manuel-sanchez commented Oct 14, 2020

Flagger version
1.2.0
Kubernetes version
v1.16.13-eks-2ba888
Configuration
Canary
Mesh
App Mesh
App Mesh version
0.5.0
What happened
I upgraded to version 1.2.0 and my canaries now fail the ALB readiness gates I have defined for them. However, I have skipAnalysis set to true, so I would expect the readiness gates to be ignored. That was the behavior in version 1.0.0, but something seems to have changed between 1.0.0 and 1.2.0, and the canaries are now marked as failed. Is this a known issue?
On the other hand, if I set skipAnalysis to false, how can I make sure that the readiness gates work during the canary phase?
I have an ingressRef in my canary definition pointing to the main Ingress:

ingressRef:
    apiVersion: extensions/v1beta1
    kind: Ingress
    name: {{ .Chart.Name }}-app
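
For context, a minimal sketch of where skipAnalysis and ingressRef sit in a Canary resource. This is not the reporter's actual manifest: the resource name, namespace, port, and analysis values are hypothetical placeholders, and only the ingressRef block and the skipAnalysis flag come from the description above.

```yaml
# Sketch only: name, namespace, port, and analysis values are hypothetical.
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: my-app              # hypothetical
  namespace: default        # hypothetical
spec:
  provider: appmesh
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app            # hypothetical
  ingressRef:
    apiVersion: extensions/v1beta1
    kind: Ingress
    name: my-app-app        # stands in for {{ .Chart.Name }}-app
  service:
    port: 80                # hypothetical
  skipAnalysis: true        # the flag discussed in this issue
  analysis:
    interval: 1m
    threshold: 5
    maxWeight: 50
    stepWeight: 10
```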
@stefanprodan
Member

This changed in #695

@manuel-sanchez
Author

manuel-sanchez commented Oct 15, 2020

Hi Stefan, thanks for the quick response. My next question is: to which of the services that Flagger creates would I have to point the readiness gate for the canary to pass? In my configuration it points to the main service (not the -canary or -primary one) and the port configured for that service. This configuration works once the app is in primary, but it doesn't work on canary, so with the mentioned change the canary never gets promoted and is rolled back.
Just to add a bit more info, I also tried pointing it to the primary and canary services; both fail.
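
To make the question concrete, here is a sketch of the kind of readiness gate being discussed. It assumes the ALB ingress controller v1.1.x convention, where the condition type encodes the Ingress name, Service name, and Service port; the names and port below are hypothetical. The thread does not settle which of the three Services (my-app, my-app-primary, my-app-canary in this sketch) the gate should reference.

```yaml
# Pod template fragment (sketch). Ingress name, Service name, and port are hypothetical.
spec:
  template:
    spec:
      readinessGates:
        # Assumed format (ALB ingress controller v1.1.x):
        # target-health.alb.ingress.k8s.aws/<ingress name>_<service name>_<service port>
        - conditionType: target-health.alb.ingress.k8s.aws/my-app-app_my-app_80
```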

@manuel-sanchez
Author

manuel-sanchez commented Oct 18, 2020

One thing I also noticed when using Flagger 1.2.0 with App Mesh is that Flagger is failing to create the pods, and I see this error on the ReplicaSet:
Error creating: admission webhook "mpod.appmesh.k8s.aws" denied the request: sidecarInject enabled but no matching VirtualNode or VirtualGateway found
But it works correctly with 1.1.0.

@stefanprodan
Member

Hmm we create the Virtual Nodes before the pods https://github.com/weaveworks/flagger/blob/master/pkg/controller/scheduler.go#L128-L135

Does it fail like that with skip analysis enabled?

@manuel-sanchez
Author

manuel-sanchez commented Oct 19, 2020

It's not even creating the pods, so it doesn't get the chance to do the canary analysis, @stefanprodan. I even tried a clean helm install; the pods still do not get created, and that's the error on the ReplicaSet.
On the other hand, is there a way to get around the readiness gate issue, i.e. the canary being reverted when the ALB readiness gate is enabled?

@stefanprodan
Member

stefanprodan commented Oct 21, 2020

@manuel-sanchez I'll look into the AppMesh issue next week. As for the skip analysis, I think we need a different flag e.g. skipReadiness to restore the old behaviour.

@manuel-sanchez
Author

Thanks @stefanprodan, please let me know if more info is needed for the AppMesh issue. On the other hand, if something like skipReadiness is implemented, will it route traffic to the canary correctly and analyze it when skipAnalysis is disabled?

@praseedasathaye

I just stumbled upon this issue as well. According to the tutorial at https://docs.flagger.app/tutorials/appmesh-progressive-delivery, we first deploy the application (create the pods) and then create the virtual nodes. Do we need to reverse the order and create the virtual nodes before the pod deployment?

ReplicaFailure True admission webhook "mpod.appmesh.k8s.aws" denied the request: sidecarInject enabled but no matching VirtualNode or VirtualGateway found
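
For readers hitting the same webhook error, a sketch of a VirtualNode as the App Mesh controller's v1beta2 API defines it. It assumes the webhook admits a pod only when some VirtualNode's podSelector matches the pod's labels in a namespace that belongs to the mesh. All names, labels, the port, and the hostname are hypothetical, and this is not necessarily the exact object Flagger generates.

```yaml
# Sketch only: names, labels, port, and hostname are hypothetical.
apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualNode
metadata:
  name: my-app
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: my-app          # assumed: must match the labels on the pods being admitted
  listeners:
    - portMapping:
        port: 80
        protocol: http
  serviceDiscovery:
    dns:
      hostname: my-app.default.svc.cluster.local
```

If no VirtualNode selects the pod at admission time, the pod is rejected, which would be consistent with the ReplicaFailure condition quoted above.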

@stefanprodan
Member

stefanprodan commented Dec 17, 2020

AppMesh works fine with Flagger, here is the repo I used at KubeCon Container Day https://github.com/stefanprodan/gitops-appmesh

@praseedasathaye

Thanks, will look into this. For now I created the virtual nodes, then enabled the sidecar, and then restarted my deployment.

@manuel-sanchez
Author

Hi @stefanprodan, any news on the readiness gate? I completely forgot about this, but sooner or later I'm going to have to upgrade.

@manuel-sanchez
Author

manuel-sanchez commented Jan 23, 2021

Hi, for anyone encountering this issue:
#711 (comment)
Make sure to name EVERY resource the same. It might partially work, but it won't fully work if there's a mismatch between names.
I also removed the release label from the Deployment object:

    metadata:
      labels:
        release: {{ .Release.Name }}
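
As an illustration of that advice, a sketch of a Deployment whose name, app label, selector, and container name all agree, with the release label left out. The name my-app and the image are hypothetical; the assumption is that the same name is also used for the Canary's targetRef and the other related resources.

```yaml
# Sketch only: "my-app" and the image are hypothetical placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  labels:
    app: my-app            # no "release" label here
spec:
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: example.com/my-app:1.0.0   # hypothetical image
```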
