failed calling webhook "mopentelemetrycollector.kb.io" #652

Closed
rbaumgar opened this issue Dec 22, 2021 · 6 comments · Fixed by #697
Labels: area:collector, bug, good first issue

Comments

@rbaumgar

Full error message:

Error from server (InternalError): error when creating "abc": Internal error occurred: failed calling webhook "mopentelemetrycollector.kb.io": failed to call webhook: Post "https://opentelemetry-operator-controller-manager-service.openshift-operators.svc:443/mutate-opentelemetry-io-v1alpha1-opentelemetrycollector?timeout=10s": dial tcp 10.131.0.154:9443: connect: connection refused

Cause:

  • The operator is installed in the openshift-operators namespace.
  • Other operators, such as kogito and gitops, are installed in the same namespace.
  • The service "opentelemetry-operator-controller-manager-service" uses the pod selector "control-plane=controller-manager",
    so it also selects the controller-manager pods of the other operators.
    The webhook call fails intermittently: whether it works depends on which pod the service routes the request to.
$ oc get pod -l control-plane=controller-manager
NAME                                                         READY   STATUS    RESTARTS      AGE
gitops-operator-controller-manager-54d4756897-7gczv          1/1     Running   0             31h
kogito-operator-controller-manager-7d5fc8f765-kntnr          2/2     Running   4 (28h ago)   28h
opentelemetry-operator-controller-manager-69f7f56598-z8dck   2/2     Running   0             56m
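
To make the failure mode concrete, here is a minimal Python sketch of equality-based label selector matching, not code from the operator. The pod names come from the listing above; the label sets are an assumption, since the `oc` query only confirms that each pod carries control-plane=controller-manager.

```python
def selects(selector: dict, labels: dict) -> bool:
    """Equality-based selector: every selector key/value must appear in the pod labels."""
    return all(labels.get(key) == value for key, value in selector.items())


# Label sets are assumed; only the control-plane label is confirmed by the listing.
pods = {
    "gitops-operator-controller-manager-54d4756897-7gczv": {"control-plane": "controller-manager"},
    "kogito-operator-controller-manager-7d5fc8f765-kntnr": {"control-plane": "controller-manager"},
    "opentelemetry-operator-controller-manager-69f7f56598-z8dck": {"control-plane": "controller-manager"},
}

# Selector of opentelemetry-operator-controller-manager-service, as reported above.
service_selector = {"control-plane": "controller-manager"}

backends = sorted(name for name, labels in pods.items() if selects(service_selector, labels))
for name in backends:
    print(name)
```

Under these assumptions all three pods become backends of the service, so a webhook request can land on a pod that does not listen on port 9443, which would be consistent with the intermittent "connection refused".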
@jpkrohling
Member

This might be related to #521.

cc @rkukura, @VineethReddy02, @pavolloffay

@jpkrohling added the "bug" label on Dec 27, 2021
@jpkrohling
Member

jpkrohling commented Dec 27, 2021

Or not: perhaps we just need to better qualify the selector?

selector:
  control-plane: controller-manager

metadata:
  name: controller-manager
  namespace: system
  labels:
    control-plane: controller-manager
spec:
  selector:
    matchLabels:
      control-plane: controller-manager
  replicas: 1
  template:
    metadata:
      labels:
        control-plane: controller-manager

@rbaumgar
Author

It may be a good idea to add a label such as "app.kubernetes.io/name=simplest-collector" to the selector.
This should be a recommendation in the Operator SDK.
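
As a rough illustration of that suggestion, the sketch below adds a second label to both the service selector and the operator pod and checks which pods still match. The key `app.kubernetes.io/name` is one of the Kubernetes recommended labels; the value `opentelemetry-operator` and the label sets of the other pods are hypothetical, chosen only for this example.

```python
def selects(selector: dict, labels: dict) -> bool:
    """Equality-based selector: every selector key/value must appear in the pod labels."""
    return all(labels.get(key) == value for key, value in selector.items())


# Hypothetical label sets: only the control-plane label is known from the thread.
pods = {
    "gitops-operator-controller-manager-54d4756897-7gczv": {
        "control-plane": "controller-manager",
    },
    "kogito-operator-controller-manager-7d5fc8f765-kntnr": {
        "control-plane": "controller-manager",
    },
    "opentelemetry-operator-controller-manager-69f7f56598-z8dck": {
        "control-plane": "controller-manager",
        "app.kubernetes.io/name": "opentelemetry-operator",  # assumed extra label
    },
}

# Selector qualified with the same assumed label.
qualified_selector = {
    "control-plane": "controller-manager",
    "app.kubernetes.io/name": "opentelemetry-operator",
}

backends = sorted(name for name, labels in pods.items() if selects(qualified_selector, labels))
for name in backends:
    print(name)
```

With the extra label in place on both sides, only the operator's own pod matches, regardless of which other operators share the namespace.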

@jpkrohling
Member

This sounds like a good first issue. Would you like to try it out, @rbaumgar?

@jpkrohling added the "good first issue" label on Dec 27, 2021
@rbaumgar
Author

@jpkrohling Sorry, I looked at the wrong pod.
The only other label on the controller-manager pod is "pod-template-hash: 69f7f56598".

So I added it as the last line of the service selector:

selector:
    control-plane: controller-manager
    pod-template-hash: 69f7f56598

Works perfectly!

@pavolloffay added the "area:collector" label on Jan 31, 2022
@pavolloffay
Member

Adding some debug info

$ k get all -n opentelemetry-operator-system
NAME                                                             READY   STATUS    RESTARTS   AGE
pod/opentelemetry-operator-controller-manager-79b77945bf-bw5lq   2/2     Running   0          15m

NAME                                                                TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
service/opentelemetry-operator-controller-manager-metrics-service   ClusterIP   10.111.133.184   <none>        8443/TCP   15m
service/opentelemetry-operator-webhook-service                      ClusterIP   10.97.165.245    <none>        443/TCP    15m

NAME                                                        READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/opentelemetry-operator-controller-manager   1/1     1            1           15m

NAME                                                                   DESIRED   CURRENT   READY   AGE
replicaset.apps/opentelemetry-operator-controller-manager-79b77945bf   1         1         1       15m
$ k describe service/opentelemetry-operator-webhook-service -n opentelemetry-operator-system
Name:              opentelemetry-operator-webhook-service
Namespace:         opentelemetry-operator-system
Labels:            <none>
Annotations:       <none>
Selector:          control-plane=controller-manager
Type:              ClusterIP
IP Family Policy:  SingleStack
IP Families:       IPv4
IP:                10.97.165.245
IPs:               10.97.165.245
Port:              <unset>  443/TCP
TargetPort:        9443/TCP
Endpoints:         172.17.0.6:9443
Session Affinity:  None
Events:            <none>
$ k describe deployment.apps/opentelemetry-operator-controller-manager -n opentelemetry-operator-system
Name:                   opentelemetry-operator-controller-manager
Namespace:              opentelemetry-operator-system
CreationTimestamp:      Wed, 09 Feb 2022 10:00:26 +0100
Labels:                 control-plane=controller-manager
Annotations:            deployment.kubernetes.io/revision: 1
Selector:               control-plane=controller-manager
Replicas:               1 desired | 1 updated | 1 total | 1 available | 0 unavailable
StrategyType:           RollingUpdate
MinReadySeconds:        0
RollingUpdateStrategy:  25% max unavailable, 25% max surge
Pod Template:
  Labels:           control-plane=controller-manager
  Service Account:  opentelemetry-operator-controller-manager
  Containers:
   kube-rbac-proxy:
    Image:      gcr.io/kubebuilder/kube-rbac-proxy:v0.8.0
    Port:       8443/TCP
    Host Port:  0/TCP
    Args:
      --secure-listen-address=0.0.0.0:8443
      --upstream=http://127.0.0.1:8080/
      --logtostderr=true
      --v=0
    Limits:
      cpu:     500m
      memory:  128Mi
    Requests:
      cpu:        5m
      memory:     64Mi
    Environment:  <none>
    Mounts:       <none>
   manager:
    Image:      docker.io/pavolloffay/opentelemetry-operator:810
    Port:       9443/TCP
    Host Port:  0/TCP
    Args:
      --metrics-addr=127.0.0.1:8080
      --enable-leader-election
    Limits:
      cpu:     200m
      memory:  256Mi
    Requests:
      cpu:        100m
      memory:     64Mi
    Liveness:     http-get http://:8081/healthz delay=15s timeout=1s period=20s #success=1 #failure=3
    Readiness:    http-get http://:8081/readyz delay=5s timeout=1s period=10s #success=1 #failure=3
    Environment:  <none>
    Mounts:
      /tmp/k8s-webhook-server/serving-certs from cert (ro)
  Volumes:
   cert:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  opentelemetry-operator-controller-manager-service-cert
    Optional:    false
Conditions:
  Type           Status  Reason
  ----           ------  ------
  Available      True    MinimumReplicasAvailable
  Progressing    True    NewReplicaSetAvailable
OldReplicaSets:  <none>
NewReplicaSet:   opentelemetry-operator-controller-manager-79b77945bf (1/1 replicas created)
Events:
  Type    Reason             Age   From                   Message
  ----    ------             ----  ----                   -------
  Normal  ScalingReplicaSet  15m   deployment-controller  Scaled up replica set opentelemetry-operator-controller-manager-79b77945bf to 1
