Extended.[k8s.io] Probing container should *not* be restarted with a /healthz http liveness probe [Conformance] #12072

Closed
bparees opened this issue Nov 30, 2016 · 18 comments
Assignees
Labels
component/kubernetes kind/test-flake Categorizes issue or PR as related to test flakes. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. priority/P1

Comments

@bparees
Contributor

bparees commented Nov 30, 2016

• Failure [138.845 seconds]
[k8s.io] Probing container
/data/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/test/e2e/framework/framework.go:793
  should *not* be restarted with a /healthz http liveness probe [Conformance] [It]
  /data/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/test/e2e/common/container_probe.go:233

  Nov 29 23:16:31.438: pod e2e-tests-container-probe-dk0xc/liveness-http - expected number of restarts: 0, found restarts: 1

  /data/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/test/e2e/common/container_probe.go:373

https://ci.openshift.redhat.com/jenkins/job/test_pull_requests_origin_conformance/8991/

@derekwaynecarr
Member

Same as kubernetes/kubernetes#28084

@smarterclayton
Contributor

I disabled this on origin_gce due to flaking.

@gabemontero
Contributor

It got noted again in upstream via kubernetes/kubernetes#30714

Saw it on our end again in #13577

@bparees
Contributor Author

bparees commented Aug 31, 2017

The referenced issue above has been closed, so I am raising the priority of this issue.

@smarterclayton
Contributor

smarterclayton commented Oct 10, 2017 via email

@sjenning
Contributor

@ravisantoshgudimetla PTAL

@sjenning sjenning removed their assignment Jan 22, 2018
@ravisantoshgudimetla
Contributor

I have been trying to reproduce this locally, but without luck. The main idea of this test is to check that a good container won't restart. Looking at the code and the history of this issue, the following are my observations.

Since the goal of this test is to check that a good container won't restart based on its liveness probe, would it make sense to add a liveness command that checks for the existence of a particular directory, or something similar, instead of an HTTP GET?
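
To illustrate the suggestion, here is a minimal Go sketch of the kind of check an exec-style liveness command could perform: report healthy when a directory exists, with no network involved. The function name `livenessExitCode` and the path used in `main` are hypothetical and are not taken from the e2e test. The sketch assumes only the usual exec-probe convention that exit code 0 means healthy and any other code means unhealthy.

```go
package main

import (
	"fmt"
	"os"
)

// livenessExitCode mimics an exec-style liveness command. It returns 0
// (healthy) when path exists and is a directory, and 1 (unhealthy) otherwise.
// The name is hypothetical; it is not part of the Kubernetes e2e framework.
func livenessExitCode(path string) int {
	info, err := os.Stat(path)
	if err != nil || !info.IsDir() {
		return 1
	}
	return 0
}

func main() {
	// "." stands in for whatever directory a real probe would check.
	// A real probe binary would pass this value to os.Exit.
	fmt.Println("exit code:", livenessExitCode("."))
}
```

Because the check only touches the local filesystem, it cannot fail because of a slow or dropped HTTP request, which is the failure mode suspected in this flake.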

@stevekuznetsov
Contributor

@ravisantoshgudimetla do you feel like you can deliver the change to the liveness probe to move away from HTTP?

@ravisantoshgudimetla
Contributor

ravisantoshgudimetla commented Feb 1, 2018

Flaked again: https://ci.openshift.redhat.com/jenkins/job/test_pull_request_origin_extended_conformance_gce/15413/console

@stevekuznetsov - I created an upstream issue yesterday. I think it would be better to delete this test. I have not received any feedback on it yet, but I will create a PR and see if upstream is OK with it.

k8s-github-robot pushed a commit to kubernetes/kubernetes that referenced this issue Feb 28, 2018
Automatic merge from submit-queue (batch tested with PRs 60342, 60505, 59218, 52900, 60486). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md

Increase failureThresholds for failing HTTP liveness test

**What this PR does / why we need it**:
Removes the e2e test that relies on an HTTP liveness probe to tell whether the container is good or bad. While this is not a bad idea in itself, we cannot rely on this test, because HTTP liveness depends on network and infrastructure conditions over which we sometimes have no control. Increasing the timeout may be an option, but it may not be ideal for all cloud providers or types of hardware.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #59150 

**Special notes for your reviewer**:
I have stated the reasons in issue #59150. We have recently seen this test flaking in openshift/origin#12072.

**Release note**:

```release-note
NONE
```
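
The commit title above mentions increasing failure thresholds. As a rough illustration of why that can absorb a flake, the Go sketch below models a probe that triggers a restart only after `failureThreshold` consecutive failures. It is a simplified model written for this thread, not kubelet code; the function name `restartsFor` and the sample probe results are hypothetical.

```go
package main

import "fmt"

// restartsFor counts how many restarts a container would see for a sequence
// of probe results (true = probe succeeded), assuming a restart is triggered
// after failureThreshold consecutive failures. Simplified model only.
func restartsFor(results []bool, failureThreshold int) int {
	if failureThreshold < 1 {
		failureThreshold = 1 // treat invalid thresholds as the minimum of 1
	}
	restarts, consecutive := 0, 0
	for _, ok := range results {
		if ok {
			consecutive = 0 // any success resets the failure streak
			continue
		}
		consecutive++
		if consecutive >= failureThreshold {
			restarts++
			consecutive = 0 // the restarted container starts with a clean slate
		}
	}
	return restarts
}

func main() {
	// One transient failure in an otherwise healthy run (hypothetical data).
	results := []bool{true, true, false, true, true}
	fmt.Println("threshold 1:", restartsFor(results, 1)) // threshold 1: 1
	fmt.Println("threshold 3:", restartsFor(results, 3)) // threshold 3: 0
}
```

With a threshold of 1, a single dropped request is enough to produce the "expected number of restarts: 0, found restarts: 1" failure reported at the top of this issue. A higher threshold tolerates isolated failures, at the cost of reacting more slowly to a container that is really unhealthy.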
@openshift-bot
Contributor

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci-robot openshift-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 2, 2018
@openshift-bot
Contributor

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

@openshift-ci-robot openshift-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jun 2, 2018
@openshift-bot
Contributor

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close
