
operators opentelemetry-operator (0.37.1) #402

Conversation

rkukura
Contributor

@rkukura commented Nov 1, 2021

Signed-off-by: Robert Kukura [email protected]

Thanks for submitting your Operator. Please check the list below before you create your Pull Request.

New Submissions

Updates to existing Operators

  • Did you create a ci.yaml file according to the update instructions?
  • Is your new CSV pointing to the previous version with the replaces property if you chose replaces-mode via the updateGraph property in ci.yaml?
  • Is your new CSV referenced in the appropriate channel defined in the package.yaml or annotations.yaml?
  • Have you tested an update to your Operator when deployed via OLM?
  • Is your submission signed?
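As an illustration of the replaces-mode items above — the file names follow the repository's conventions, but the reviewer name and versions here are examples, not taken from this PR:

```yaml
# ci.yaml for the operator -- illustrative values only
updateGraph: replaces-mode
reviewers:
  - some-maintainer        # additional contributors allowed to approve updates
---
# Fragment of the new CSV (clusterserviceversion.yaml): in replaces-mode,
# spec.replaces names the previous version so OLM can build the update graph.
spec:
  replaces: opentelemetry-operator.v0.33.0
```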

Your submission should not

  • Modify more than one operator
  • Modify an Operator you don't own
  • Rename an operator - please remove and add with a different name instead
  • Modify any files outside the above mentioned folders
  • Contain more than one commit. Please squash your commits.

Operator Description must contain (in order)

  1. A description of the managed application and where to find more information
  2. Features and capabilities of your Operator and how to use it
  3. Any manual steps covering potential prerequisites for using your Operator

Operator Metadata should contain

  • A human-readable name and a one-line description of your Operator
  • A valid category name [1]
  • One of the pre-defined capability levels [2]
  • Links to the maintainer, source code and documentation
  • Example templates for all Custom Resource Definitions intended to be used
  • A square (1:1) logo

Remember that you can preview your CSV here.

--

[1] If you feel your Operator does not fit any of the pre-defined categories, file an issue against this repo and explain your need

[2] For more information see here

@openshift-ci bot added the "needs-ok-to-test" label (indicates a PR that requires an org member to verify it is safe to test) on Nov 1, 2021
@openshift-ci

openshift-ci bot commented Nov 1, 2021

Hi @rkukura. Thanks for your PR.

I'm waiting for a redhat-openshift-ecosystem member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@github-actions
Contributor

github-actions bot commented Nov 1, 2021

@jpkrohling, please approve as you are the original reviewer.
Please note that you can add more contributors in ci.yaml. More info here.

@github-actions bot added the "ok-to-test" (non-member PR verified by an org member as safe to test) and "package-validated" labels, and removed the "needs-ok-to-test" label, on Nov 1, 2021
@framework-automation
Collaborator

/merge possible

@rkukura
Contributor Author

rkukura commented Nov 1, 2021

/retest

@mvalarh
Contributor

mvalarh commented Nov 1, 2021

/test 4.9-deploy-operator-on-openshift

@openshift-ci

openshift-ci bot commented Nov 1, 2021

@rkukura: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name: ci/prow/4.9-deploy-operator-on-openshift
Commit: 39e22ab (link)
Required: true
Rerun command: /test 4.9-deploy-operator-on-openshift

Full PR test history. Your PR dashboard.


@rkukura
Contributor Author

rkukura commented Nov 1, 2021

@mvalarh - I see you reran the 4.9 test and it failed again with something like:

time="2021-11-01T14:32:49Z" level=error msg="UpdateStatus - error while setting CatalogSource status" error="Operation cannot be fulfilled on catalogsources.operators.coreos.com \"test-operators-ocs\": the object has been modified; please apply your changes to the latest version and try again" id=ggvmG source=test-operators-ocs

This doesn't mean much to me. Do you have any idea what the issue might be?

@J0zi
Contributor

J0zi commented Nov 2, 2021

> @mvalarh - I see you reran the 4.9 test and it failed again with something like:
>
> time="2021-11-01T14:32:49Z" level=error msg="UpdateStatus - error while setting CatalogSource status" error="Operation cannot be fulfilled on catalogsources.operators.coreos.com \"test-operators-ocs\": the object has been modified; please apply your changes to the latest version and try again" id=ggvmG source=test-operators-ocs
>
> This doesn't mean much to me. Do you have any idea what the issue might be?

This is normal behavior; it was applied later.
@rkukura Did you test it on 4.9 on your end?

@J0zi
Contributor

J0zi commented Nov 2, 2021

I see the operator ready and healthy:

    - Catalog source is up and READY            [OK]
    - Operatorgroup is present                  [OK]
    - Subscribed                                [OK]
    - Operator is in packagemanifests           [OK]
    - Operator startup                          [OK]
    - Operator stayed healthy after startup     [OK]

It failed during deletion of resources. So folks can install your operator, but are unable to uninstall it on 4.9.
https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/redhat-openshift-ecosystem_community-operators-prod/402/pull-ci-redhat-openshift-ecosystem-community-operators-prod-main-4.9-deploy-operator-on-openshift/1455200540676853760#1:build-log.txt%3A5764

@rkukura
Contributor Author

rkukura commented Nov 3, 2021

@J0zi - It looks like deleting this operator fails on OpenShift 4.9 since at least 0.33.0 (the version prior to this PR). With either 0.33.0 or 0.37.1, the operator appears to delete, but the opentelemetry-operator-controller-manager Pod, Deployment, and ReplicaSet are left in place and cannot be manually deleted. Deleting either version of the operator works fine on OpenShift 4.8. Has there been some OLM change in 4.9 that could explain this? Any other ideas?

@rkukura
Contributor Author

rkukura commented Nov 4, 2021

It seems both this 0.37.1 PR and the existing 0.33.0 bundle have issues on OpenShift 4.9. The opentelemetry-operator-controller-manager pod keeps restarting after logging errors like:

E1104 16:01:14.088765       1 reflector.go:138] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167: Failed to watch *v1alpha1.OpenTelemetryCollector: failed to list *v1alpha1.OpenTelemetryCollector: Internal error occurred: error resolving resource
E1104 16:01:20.847304       1 reflector.go:138] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167: Failed to watch *v1alpha1.OpenTelemetryCollector: failed to list *v1alpha1.OpenTelemetryCollector: Internal error occurred: error resolving resource
E1104 16:01:44.043345       1 reflector.go:138] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167: Failed to watch *v1alpha1.OpenTelemetryCollector: failed to list *v1alpha1.OpenTelemetryCollector: Internal error occurred: error resolving resource
E1104 16:02:25.370013       1 reflector.go:138] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167: Failed to watch *v1alpha1.OpenTelemetryCollector: failed to list *v1alpha1.OpenTelemetryCollector: Internal error occurred: error resolving resource
{"level":"error","ts":1636041784.8509786,"logger":"controller-runtime.manager.controller.opentelemetrycollector","msg":"Could not wait for Cache to sync","reconciler group":"opentelemetry.io","reconciler kind":"OpenTelemetryCollector","error":"failed to wait for opentelemetrycollector caches to sync: timed out waiting for cache to be synced","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:195\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:221\nsigs.k8s.io/controller-runtime/pkg/manager.(*controllerManager).startRunnable.func1\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/manager/internal.go:696"}
{"level":"info","ts":1636041784.8514926,"logger":"controller-runtime.webhook","msg":"shutting down webhook server"}
{"level":"error","ts":1636041784.8515916,"logger":"controller-runtime.manager","msg":"error received after stop sequence was engaged","error":"failed to list: Timeout: failed waiting for *v1alpha1.OpenTelemetryCollector Informer to sync"}
{"level":"error","ts":1636041784.8528888,"logger":"setup","msg":"problem running manager","error":"failed to wait for opentelemetrycollector caches to sync: timed out waiting for cache to be synced","stacktrace":"main.main\n\t/workspace/main.go:173\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:255"}

@rkukura
Contributor Author

rkukura commented Nov 10, 2021

The issue causing ci/prow/4.9-deploy-operator-on-openshift to fail is resolved by open-telemetry/opentelemetry-operator#521. I am closing this PR, and will create a new PR updating to either 0.38.0 with the fixed CRD, or to a subsequent release that includes the fix.

@rkukura rkukura closed this Nov 10, 2021
@rkukura rkukura deleted the opentelemetry-operator-0.37.1 branch November 10, 2021 22:27
4 participants