Helm tests failing with v0.71.0 operator #1574
I'm unsure what's going on here; it's strange that this didn't fail for the operator tests in this repo. From your second test (cloning main) it seems that a step is skipped (https://github.com/open-telemetry/opentelemetry-helm-charts/actions/runs/4402557919/jobs/7710082951#step:13:761). This test should be running this chart which should have the desired It's strange that the 0.71.0 test, which doesn't include my recent changes, fails as well; it seems scaledown doesn't happen in time despite the stabilization window being set to 15 seconds. I wonder if there's something obvious I'm missing here when I added to these e2e tests. (cc @iblancasa @pavolloffay) |
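For readers unfamiliar with the setting discussed above, here is a minimal sketch of where a 15-second scaledown stabilization window sits in a plain Kubernetes `autoscaling/v2` HorizontalPodAutoscaler. This is not the manifest used by the operator's e2e tests: the resource names, replica counts, and CPU target are hypothetical, and the script only writes the file and prints the relevant line. When `stabilizationWindowSeconds` is not set for `scaleDown`, Kubernetes uses a default of 300 seconds.

```shell
#!/bin/sh
# Illustrative only: writes a hypothetical HPA manifest to the current directory.
set -eu

cat > hpa-example.yaml <<'EOF'
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: example-collector          # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example-collector        # hypothetical target
  minReplicas: 1                   # hypothetical value
  maxReplicas: 2                   # hypothetical value
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50   # hypothetical value
  behavior:
    scaleDown:
      # Window the autoscaler looks back over before scaling down.
      stabilizationWindowSeconds: 15
EOF

# Show the setting that the thread is discussing.
grep 'stabilizationWindowSeconds' hpa-example.yaml
```

With a cluster available, `kubectl get hpa example-collector -o yaml` would show whether the `behavior` block was actually applied to the generated autoscaler.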
When cloning the 0.71 release tag
When cloning
|
@Allex1 I created a PR which adds nil checks and reduces the stabilization window. This should at least remove the segfault for your test based off of main. I also reduced the stabilization window in case for some reason it is too high. Can you run the helm tests again based off my PR's branch? Hopefully the reason for failure will become more obvious |
@moh-osman3 currently fails https://github.com/open-telemetry/opentelemetry-helm-charts/actions/runs/4414191149/jobs/7735581591
I'm also running locally |
Hmm, since it's failing on this step, it must have already validated the metrics correctly. This means metrics have been updated properly, but behavior is not being updated properly, which is a bug. I'm wondering what's different in how the helm repo sets up the environment for the operator e2e tests, i.e. why is this bug found in this test but not in the e2e tests the operator runs? Behavior also seems to update properly in my own remote cluster testing |
I reproduced this locally in the helm tests and now realize that there is not a bug with updates of behavior (failure for behavior updates is because helm test is running operator I am unsure why the scaledown setting is being ignored but I noticed that the helm repo is using |
To fix this I removed the behavior update from the e2e tests and I added a timeout of @pavolloffay @iblancasa I'd appreciate some guidance on the best way forward for this issue, is a timeout okay? Do you have any idea for why this scaledown window is not being used in |
@moh-osman3 Thanks, started the CI again. We can probably test against kubernetes 1.25 as well in the helm repo. |
@Allex1 Still seems to fail, which is strange because the autoscale test passes locally for me. I'm really unsure what's going on |
yeah, not sure either, fails locally as well with the helm tests |
Since the problem is in the Helm project, it would make more sense to adjust the timeout in their repository:
$ kuttl test --help | grep -a timeout
--timeout int The timeout to use as default for TestSuite configuration. (default 30)
Regarding the problem: I'll try to reproduce it locally. |
@iblancasa the timeout arg can be passed from the TestSuite config as well, which we already did. Strangely enough, my PR that failed CI passes just fine locally when run via act after some tweaks
|
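As a rough illustration of the two options mentioned above (the `--timeout` flag versus the TestSuite config), the sketch below writes a kuttl TestSuite file with a raised default timeout. The test directory and the 300-second value are hypothetical and are not the values used in either repository; the script only creates the file and prints the timeout line, so it runs without kuttl installed.

```shell
#!/bin/sh
# Illustrative only: writes a hypothetical kuttl TestSuite config.
set -eu

cat > kuttl-test.yaml <<'EOF'
apiVersion: kuttl.dev/v1beta1
kind: TestSuite
testDirs:
  - ./tests/e2e        # hypothetical test directory
timeout: 300           # seconds; hypothetical value, the CLI default is 30
EOF

# Show the suite-wide default timeout that was configured.
grep 'timeout' kuttl-test.yaml

# With kuttl installed, the suite would then be run with something like:
#   kubectl kuttl test --config kuttl-test.yaml
```

Setting the timeout in the config keeps the change in the repository that owns the test suite, which matches the preference stated earlier in the thread for adjusting it on the Helm side.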
I know. What I wanted to avoid is making the modifications in this repository. I don't understand why it takes so long to pass the test in your repository. I just ran them locally and all the tests passed:
$ git rev-parse HEAD
10918c501c0f6886701058afee719b29e5bdb276
$ cd opentelemetry-operator/ && git rev-parse HEAD
0e39ee77693146e0924da3ca474a0fe14dc30b3a
$ k version --short
Flag --short has been deprecated, and will be removed in the future. The --short output will become the default.
Client Version: v1.26.1
Kustomize Version: v4.5.7
Server Version: v1.25.3 |
Locally it works for me as well with the opentelemetry-operator v0.72.0 tag:
|
Hi all,
I'm trying to upgrade the helm stack to use operator v0.71.0 and the tests are failing in a non-obvious way.
Cloning the 0.71 release tag, I get https://github.com/open-telemetry/opentelemetry-helm-charts/actions/runs/4396632296/jobs/7699210715#step:13:991
Cloning main (as I saw #1553 is not included in the release), I get https://github.com/open-telemetry/opentelemetry-helm-charts/actions/runs/4402557919/jobs/7710082951
I'm not exactly familiar with the operator tests, so I could use some help. @moh-osman3 can you have a look?
Thanks