Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Recognize when a resource has been deleted while the operator waits #672

Conversation

jpkrohling
Copy link
Contributor

@jpkrohling jpkrohling commented Sep 24, 2019

Fixes #670 by making the operator recognize when a resource existed before but ceases to exist while waiting for said resource to become available.

Signed-off-by: Juraci Paixão Kröhling [email protected]

@jpkrohling
Copy link
Contributor Author

@kevinearls we had a similar problem in e2e tests before. Unfortunately, there are no unit tests that can be made to check this new code, but perhaps you could adapt the test that was failing before? It's actually easy to reproduce. This is a CR that will trigger the condition where the operator will wait indefinitely:

apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: fails
spec:
  allInOne:
    options:
      invalidOption: "invalid option"

Apply this, wait a few milliseconds and delete the CR. Before this PR, the reconciliation would fail after 5 minutes with:

ERRO[0317] failed to apply the changes                   error="timed out waiting for the condition" execution="2019-09-24 09:16:52.72172503 +0000 UTC" instance=fails namespace=default

With this PR, it fails far earlier with:

WARN[0012] Deployment has been removed.                  name=fails namespace=default
ERRO[0012] failed to apply the changes                   error="deployment has been removed" execution="2019-09-24 09:22:28.740323219 +0000 UTC" instance=fails namespace=default

@jkandasa, I published an image that can be used for manual testing this change: quay.io/jpkroehling/jaeger-operator:670-Recognize-when-deployment-has-been-deleted.

@kevinearls
Copy link
Contributor

@jpkrohling sorry for the delay, I will be working on this today.

@kevinearls
Copy link
Contributor

@jpkrohling Bizarrely, I cannot reproduce this now. I may be overlooking something, or it may just be that whatever timing issue I was hitting before is not occurring now. I will try again later when I have time, but it might be best to wait till @jkandasa is back.

@kevinearls
Copy link
Contributor

@jpkrohling I was able to confirm that your CR triggers the condition, and that this PR fixes it. So LGTM.

Copy link
Contributor

@kevinearls kevinearls left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As noted in my previous comment, LGTM

@jpkrohling
Copy link
Contributor Author

Thanks! Do you think it makes sense to create an e2e test for this? If so, could you create an issue for that? I'm merging this in the meantime.

@jpkrohling jpkrohling merged commit 7b6c393 into jaegertracing:master Sep 26, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

jaeger operator service is not available for a couple of minutes
2 participants