This repository has been archived by the owner on Feb 22, 2022. It is now read-only.

[stable/rabbitmq] When using minikube and PVC, Liveness and Readiness probes fail #14757

Closed
docktermj opened this issue Jun 12, 2019 · 26 comments
Labels
lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale.

Comments

@docktermj

Which chart:

stable/rabbitmq

Description

When using minikube, Persistent Volumes (PV) backed by hostPath fail when used with the stable/rabbitmq chart. The liveness and readiness probe failures prevent the pod from reaching a READY state of "1/1".

$ kubectl describe pods --namespace my-namespace my-rabbitmq-0

:
Events:
  Type     Reason     Age                From               Message
  ----     ------     ----               ----               -------
  :
  Warning  Unhealthy  12s                kubelet, minikube  Liveness probe failed: {"status":"failed","reason":"resource alarm(s) in effect:[disk]"}
  Warning  Unhealthy  6s (x5 over 2m6s)  kubelet, minikube  Readiness probe failed: {"status":"failed","reason":"resource alarm(s) in effect:[disk]"}

The pod launched by the stable/rabbitmq chart never reaches a READY state of "1/1".
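
The failing check can also be run by hand against the same management API endpoint the probe script queries (a sketch; the username/password come from the rabbitmq-values.yaml below, and curl is available because the probe script itself uses it):

$ kubectl exec --namespace my-namespace my-rabbitmq-0 -- \
    curl --silent http://user:passw0rd@127.0.0.1:15672/api/healthchecks/node
{"status":"failed","reason":"resource alarm(s) in effect:[disk]"}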

Steps to reproduce the issue:

  1. Start minikube:
minikube start --cpus 4 --memory 8192 --vm-driver kvm2
  2. Install Tiller:
kubectl create serviceaccount -n kube-system tiller
kubectl create clusterrolebinding tiller-cluster-rule \
  --clusterrole=cluster-admin \
  --serviceaccount=kube-system:tiller
helm init --service-account tiller
  3. Create namespace.yaml file:
cat <<EOT > namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: my-namespace
  labels:
    name: my-namespace
EOT
  4. Create namespace:
kubectl create -f namespace.yaml
  5. Create persistent-volume-rabbitmq.yaml file:
cat <<EOT > persistent-volume-rabbitmq.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: rabbitmq-volume
  labels:
    type: local
  namespace: my-namespace
spec:
  storageClassName: manual
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: "/var/rabbitmq"
EOT
  6. Create persistent volume:
kubectl create -f persistent-volume-rabbitmq.yaml
  7. Create persistent-volume-claim-rabbitmq.yaml file:
cat <<EOT > persistent-volume-claim-rabbitmq.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  finalizers:
  - kubernetes.io/pvc-protection
  labels:
    cattle.io/creator: norman
  name: rabbitmq-claim
  namespace: my-namespace
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: "manual"
  volumeName: rabbitmq-volume
EOT
  8. Create persistent volume claim:
kubectl create -f persistent-volume-claim-rabbitmq.yaml
  9. Create rabbitmq-values.yaml file:
cat <<EOT > rabbitmq-values.yaml
persistence:
  enabled: true
  existingClaim: rabbitmq-claim

rabbitmq:
  password: passw0rd
  username: user

volumePermissions:
  enabled: true
EOT
  10. Install the stable/rabbitmq Helm chart:
helm install \
  --name my-rabbitmq \
  --namespace my-namespace \
  --values rabbitmq-values.yaml \
  stable/rabbitmq
  11. Watch for the error:
$ kubectl get pods --namespace my-namespace --watch

NAME            READY   STATUS            RESTARTS   AGE
my-rabbitmq-0   0/1     Init:0/1          0          7s
my-rabbitmq-0   0/1     PodInitializing   0          27s
my-rabbitmq-0   0/1     Running           0          2m

READY state never reaches "1/1".

Describe the results you received:

  1. View the error:
$ kubectl describe pods --namespace my-namespace my-rabbitmq-0

Name:               my-rabbitmq-0
Namespace:          my-namespace
Priority:           0
PriorityClassName:  <none>
Node:               minikube/192.168.122.139
Start Time:         Wed, 12 Jun 2019 11:04:35 -0400
Labels:             app=rabbitmq
                    chart=rabbitmq-6.0.0
                    controller-revision-hash=my-rabbitmq-5df8f57bbc
                    release=my-rabbitmq
                    statefulset.kubernetes.io/pod-name=my-rabbitmq-0
Annotations:        <none>
Status:             Running
IP:                 172.17.0.5
Controlled By:      StatefulSet/my-rabbitmq
Init Containers:
  volume-permissions:
    Container ID:  docker://f4fdb228c233429466e2c7129d29db56b68ecff9f4ef470e5e13b487bf76af45
    Image:         docker.io/bitnami/minideb:latest
    Image ID:      docker-pullable://bitnami/minideb@sha256:86262eb759137d3a4df9be476cd8f00bf7712678faf9ada8757257b277833d1e
    Port:          <none>
    Host Port:     <none>
    Command:
      /bin/chown
      -R
      1001:1001
      /opt/bitnami/rabbitmq/var/lib/rabbitmq
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Wed, 12 Jun 2019 11:05:01 -0400
      Finished:     Wed, 12 Jun 2019 11:05:01 -0400
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /opt/bitnami/rabbitmq/var/lib/rabbitmq from data (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from my-rabbitmq-token-f2krk (ro)
Containers:
  rabbitmq:
    Container ID:  docker://9425370201deeffb03884228831153fee86acc858ad498dcf663fa162da3d7b3
    Image:         docker.io/bitnami/rabbitmq:3.7.15-debian-9-r18
    Image ID:      docker-pullable://bitnami/rabbitmq@sha256:bd8db9774e573c0583660ed1cfcc3c67e3be6148f7474bc805cee7ac4464c683
    Ports:         4369/TCP, 5672/TCP, 25672/TCP, 15672/TCP
    Host Ports:    0/TCP, 0/TCP, 0/TCP, 0/TCP
    Command:
      bash
      -ec
      mkdir -p /opt/bitnami/rabbitmq/.rabbitmq/
      mkdir -p /opt/bitnami/rabbitmq/etc/rabbitmq/
      #persist the erlang cookie in both places for server and cli tools
      echo $RABBITMQ_ERL_COOKIE > /opt/bitnami/rabbitmq/var/lib/rabbitmq/.erlang.cookie
      cp /opt/bitnami/rabbitmq/var/lib/rabbitmq/.erlang.cookie /opt/bitnami/rabbitmq/.rabbitmq/
      #change permission so only the user has access to the cookie file
      chmod 600 /opt/bitnami/rabbitmq/.rabbitmq/.erlang.cookie /opt/bitnami/rabbitmq/var/lib/rabbitmq/.erlang.cookie
      #copy the mounted configuration to both places
      cp  /opt/bitnami/rabbitmq/conf/* /opt/bitnami/rabbitmq/etc/rabbitmq
      # Apply resources limits
      ulimit -n "${RABBITMQ_ULIMIT_NOFILES}"
      #replace the default password that is generated
      sed -i "s/CHANGEME/$RABBITMQ_PASSWORD/g" /opt/bitnami/rabbitmq/etc/rabbitmq/rabbitmq.conf
      #api check for probes
      cat > /opt/bitnami/rabbitmq/sbin/rabbitmq-api-check <<EOF
      #!/bin/sh
      set -e
      URL=\$1
      EXPECTED=\$2
      ACTUAL=\$(curl --silent --show-error --fail "\${URL}")
      echo "\${ACTUAL}"
      test "\${EXPECTED}" = "\${ACTUAL}"
      EOF
      chmod a+x /opt/bitnami/rabbitmq/sbin/rabbitmq-api-check
      exec rabbitmq-server
      
    State:          Running
      Started:      Wed, 12 Jun 2019 11:06:35 -0400
    Ready:          False
    Restart Count:  0
    Liveness:       exec [sh -c rabbitmq-api-check "http://user:$RABBITMQ_PASSWORD@127.0.0.1:15672/api/healthchecks/node" '{"status":"ok"}'] delay=120s timeout=20s period=30s #success=1 #failure=6
    Readiness:      exec [sh -c rabbitmq-api-check "http://user:$RABBITMQ_PASSWORD@127.0.0.1:15672/api/healthchecks/node" '{"status":"ok"}'] delay=10s timeout=20s period=30s #success=1 #failure=3
    Environment:
      BITNAMI_DEBUG:                        false
      MY_POD_IP:                             (v1:status.podIP)
      MY_POD_NAME:                          my-rabbitmq-0 (v1:metadata.name)
      MY_POD_NAMESPACE:                     my-namespace (v1:metadata.namespace)
      K8S_SERVICE_NAME:                     my-rabbitmq-headless
      K8S_ADDRESS_TYPE:                     hostname
      RABBITMQ_NODENAME:                    rabbit@$(MY_POD_NAME).$(K8S_SERVICE_NAME).$(MY_POD_NAMESPACE).svc.cluster.local
      K8S_HOSTNAME_SUFFIX:                  .$(K8S_SERVICE_NAME).$(MY_POD_NAMESPACE).svc.cluster.local
      RABBITMQ_LOGS:                        -
      RABBITMQ_ULIMIT_NOFILES:              65536
      RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS:  +S 2:1
      RABBITMQ_USE_LONGNAME:                true
      RABBITMQ_ERL_COOKIE:                  <set to the key 'rabbitmq-erlang-cookie' in secret 'my-rabbitmq'>  Optional: false
      RABBITMQ_PASSWORD:                    <set to the key 'rabbitmq-password' in secret 'my-rabbitmq'>       Optional: false
    Mounts:
      /opt/bitnami/rabbitmq/conf from config-volume (rw)
      /opt/bitnami/rabbitmq/var/lib/rabbitmq from data (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from my-rabbitmq-token-f2krk (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  config-volume:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      my-rabbitmq-config
    Optional:  false
  data:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  rabbitmq-claim
    ReadOnly:   false
  my-rabbitmq-token-f2krk:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  my-rabbitmq-token-f2krk
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason     Age                From               Message
  ----     ------     ----               ----               -------
  Normal   Scheduled  4m41s              default-scheduler  Successfully assigned my-namespace/my-rabbitmq-0 to minikube
  Normal   Pulling    4m39s              kubelet, minikube  Pulling image "docker.io/bitnami/minideb:latest"
  Normal   Pulled     4m15s              kubelet, minikube  Successfully pulled image "docker.io/bitnami/minideb:latest"
  Normal   Created    4m15s              kubelet, minikube  Created container volume-permissions
  Normal   Started    4m15s              kubelet, minikube  Started container volume-permissions
  Normal   Pulling    4m14s              kubelet, minikube  Pulling image "docker.io/bitnami/rabbitmq:3.7.15-debian-9-r18"
  Normal   Pulled     2m42s              kubelet, minikube  Successfully pulled image "docker.io/bitnami/rabbitmq:3.7.15-debian-9-r18"
  Normal   Created    2m41s              kubelet, minikube  Created container rabbitmq
  Normal   Started    2m41s              kubelet, minikube  Started container rabbitmq
  Warning  Unhealthy  12s                kubelet, minikube  Liveness probe failed: {"status":"failed","reason":"resource alarm(s) in effect:[disk]"}
  Warning  Unhealthy  6s (x5 over 2m6s)  kubelet, minikube  Readiness probe failed: {"status":"failed","reason":"resource alarm(s) in effect:[disk]"}

Describe the results you expected:

A chart that comes up. :-)

Additional information you deem important (e.g. issue happens only occasionally):

When running without PV/PVC, the chart comes up properly.
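
For comparison, this is the default install that comes up cleanly (the release name here is illustrative; with no values file the chart relies on the cluster's default dynamic provisioning):

helm install \
  --name my-rabbitmq-default \
  --namespace my-namespace \
  stable/rabbitmq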

Version of Helm and Kubernetes:

  • Output of helm version:
$ helm version
Client: &version.Version{SemVer:"v2.13.1", GitCommit:"618447cbf203d147601b4b9bd7f8c37a5d39fbb4", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.13.1", GitCommit:"618447cbf203d147601b4b9bd7f8c37a5d39fbb4", GitTreeState:"clean"}
  • Output of kubectl version:
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.0", GitCommit:"641856db18352033a0d96dbc99153fa3b27298e5", GitTreeState:"clean", BuildDate:"2019-03-25T15:53:57Z", GoVersion:"go1.12.1", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.0", GitCommit:"641856db18352033a0d96dbc99153fa3b27298e5", GitTreeState:"clean", BuildDate:"2019-03-25T15:45:25Z", GoVersion:"go1.12.1", Compiler:"gc", Platform:"linux/amd64"}
  • Output of minikube version:
$ minikube version
minikube version: v1.0.0

Cleanup

helm delete --purge my-rabbitmq
kubectl delete -f persistent-volume-claim-rabbitmq.yaml
kubectl delete -f persistent-volume-rabbitmq.yaml
kubectl delete -f namespace.yaml
minikube stop
minikube delete
@docktermj
Author

Log:

$ kubectl logs --namespace my-namespace my-rabbitmq-0

  ##  ##
  ##  ##      RabbitMQ 3.7.15. Copyright (C) 2007-2019 Pivotal Software, Inc.
  ##########  Licensed under the MPL.  See https://www.rabbitmq.com/
  ######  ##
  ##########  Logs: <stdout>

              Starting broker...
2019-06-12 15:40:26.000 [info] <0.226.0> 
 Starting RabbitMQ 3.7.15 on Erlang 22.0
 Copyright (C) 2007-2019 Pivotal Software, Inc.
 Licensed under the MPL.  See https://www.rabbitmq.com/
2019-06-12 15:40:26.012 [info] <0.226.0> 
 node           : rabbit@my-rabbitmq-0.my-rabbitmq-headless.my-namespace.svc.cluster.local
 home dir       : /opt/bitnami/rabbitmq/.rabbitmq
 config file(s) : /opt/bitnami/rabbitmq/etc/rabbitmq/rabbitmq.conf
 cookie hash    : tt8gTT+g7Hkm87a2Rcg39Q==
 log(s)         : <stdout>
 database dir   : /opt/bitnami/rabbitmq/var/lib/rabbitmq/mnesia/rabbit@my-rabbitmq-0.my-rabbitmq-headless.my-namespace.svc.cluster.local
2019-06-12 15:40:29.131 [info] <0.226.0> Running boot step pre_boot defined by app rabbit
2019-06-12 15:40:29.131 [info] <0.226.0> Running boot step rabbit_core_metrics defined by app rabbit
2019-06-12 15:40:29.131 [info] <0.226.0> Running boot step rabbit_alarm defined by app rabbit
2019-06-12 15:40:29.137 [info] <0.234.0> Memory high watermark set to 3039 MiB (3187254886 bytes) of 7599 MiB (7968137216 bytes) total
2019-06-12 15:40:29.144 [info] <0.236.0> Enabling free disk space monitoring
2019-06-12 15:40:29.144 [info] <0.236.0> Disk free limit set to 50MB
2019-06-12 15:40:29.149 [info] <0.236.0> Free disk space is insufficient. Free bytes: 0. Limit: 50000000
2019-06-12 15:40:29.150 [warning] <0.232.0> disk resource limit alarm set on node 'rabbit@my-rabbitmq-0.my-rabbitmq-headless.my-namespace.svc.cluster.local'.

**********************************************************
*** Publishers will be blocked until this alarm clears ***
**********************************************************
2019-06-12 15:40:29.150 [info] <0.226.0> Running boot step code_server_cache defined by app rabbit
2019-06-12 15:40:29.150 [info] <0.226.0> Running boot step file_handle_cache defined by app rabbit
2019-06-12 15:40:29.150 [info] <0.239.0> Limiting to approx 65436 file handles (58890 sockets)
2019-06-12 15:40:29.151 [info] <0.240.0> FHC read buffering:  OFF
2019-06-12 15:40:29.151 [info] <0.240.0> FHC write buffering: ON
2019-06-12 15:40:29.151 [info] <0.226.0> Running boot step worker_pool defined by app rabbit
2019-06-12 15:40:29.152 [info] <0.226.0> Running boot step database defined by app rabbit
2019-06-12 15:40:29.153 [info] <0.226.0> Node database directory at /opt/bitnami/rabbitmq/var/lib/rabbitmq/mnesia/rabbit@my-rabbitmq-0.my-rabbitmq-headless.my-namespace.svc.cluster.local is empty. Assuming we need to join an existing cluster or initialise from scratch...
2019-06-12 15:40:29.153 [info] <0.226.0> Configured peer discovery backend: rabbit_peer_discovery_k8s
2019-06-12 15:40:29.153 [info] <0.226.0> Will try to lock with peer discovery backend rabbit_peer_discovery_k8s
2019-06-12 15:40:29.153 [info] <0.226.0> Peer discovery backend does not support locking, falling back to randomized delay
2019-06-12 15:40:29.154 [info] <0.226.0> Peer discovery backend rabbit_peer_discovery_k8s does not support registration, skipping randomized startup delay.
2019-06-12 15:40:29.207 [info] <0.226.0> k8s endpoint listing returned nodes not yet ready: my-rabbitmq-0
2019-06-12 15:40:29.207 [info] <0.226.0> All discovered existing cluster peers: 
2019-06-12 15:40:29.208 [info] <0.226.0> Discovered no peer nodes to cluster with
2019-06-12 15:40:29.213 [info] <0.43.0> Application mnesia exited with reason: stopped
2019-06-12 15:40:29.286 [info] <0.226.0> Waiting for Mnesia tables for 30000 ms, 9 retries left
2019-06-12 15:40:29.322 [info] <0.226.0> Waiting for Mnesia tables for 30000 ms, 9 retries left
2019-06-12 15:40:29.356 [info] <0.226.0> Waiting for Mnesia tables for 30000 ms, 9 retries left
2019-06-12 15:40:29.356 [info] <0.226.0> Peer discovery backend rabbit_peer_discovery_k8s does not support registration, skipping registration.
2019-06-12 15:40:29.356 [info] <0.226.0> Running boot step database_sync defined by app rabbit
2019-06-12 15:40:29.356 [info] <0.226.0> Running boot step codec_correctness_check defined by app rabbit
2019-06-12 15:40:29.356 [info] <0.226.0> Running boot step external_infrastructure defined by app rabbit
2019-06-12 15:40:29.356 [info] <0.226.0> Running boot step rabbit_registry defined by app rabbit
2019-06-12 15:40:29.356 [info] <0.226.0> Running boot step rabbit_auth_mechanism_cr_demo defined by app rabbit
2019-06-12 15:40:29.357 [info] <0.226.0> Running boot step rabbit_queue_location_random defined by app rabbit
2019-06-12 15:40:29.357 [info] <0.226.0> Running boot step rabbit_event defined by app rabbit
2019-06-12 15:40:29.357 [info] <0.226.0> Running boot step rabbit_auth_mechanism_amqplain defined by app rabbit
2019-06-12 15:40:29.357 [info] <0.226.0> Running boot step rabbit_auth_mechanism_plain defined by app rabbit
2019-06-12 15:40:29.357 [info] <0.226.0> Running boot step rabbit_exchange_type_direct defined by app rabbit
2019-06-12 15:40:29.358 [info] <0.226.0> Running boot step rabbit_exchange_type_fanout defined by app rabbit
2019-06-12 15:40:29.358 [info] <0.226.0> Running boot step rabbit_exchange_type_headers defined by app rabbit
2019-06-12 15:40:29.358 [info] <0.226.0> Running boot step rabbit_exchange_type_topic defined by app rabbit
2019-06-12 15:40:29.358 [info] <0.226.0> Running boot step rabbit_mirror_queue_mode_all defined by app rabbit
2019-06-12 15:40:29.358 [info] <0.226.0> Running boot step rabbit_mirror_queue_mode_exactly defined by app rabbit
2019-06-12 15:40:29.359 [info] <0.226.0> Running boot step rabbit_mirror_queue_mode_nodes defined by app rabbit
2019-06-12 15:40:29.359 [info] <0.226.0> Running boot step rabbit_priority_queue defined by app rabbit
2019-06-12 15:40:29.359 [info] <0.226.0> Priority queues enabled, real BQ is rabbit_variable_queue
2019-06-12 15:40:29.359 [info] <0.226.0> Running boot step rabbit_queue_location_client_local defined by app rabbit
2019-06-12 15:40:29.359 [info] <0.226.0> Running boot step rabbit_queue_location_min_masters defined by app rabbit
2019-06-12 15:40:29.360 [info] <0.226.0> Running boot step kernel_ready defined by app rabbit
2019-06-12 15:40:29.360 [info] <0.226.0> Running boot step rabbit_sysmon_minder defined by app rabbit
2019-06-12 15:40:29.360 [info] <0.226.0> Running boot step rabbit_epmd_monitor defined by app rabbit
2019-06-12 15:40:29.361 [info] <0.226.0> Running boot step guid_generator defined by app rabbit
2019-06-12 15:40:29.362 [info] <0.226.0> Running boot step rabbit_node_monitor defined by app rabbit
2019-06-12 15:40:29.362 [info] <0.413.0> Starting rabbit_node_monitor
2019-06-12 15:40:29.362 [info] <0.226.0> Running boot step delegate_sup defined by app rabbit
2019-06-12 15:40:29.364 [info] <0.226.0> Running boot step rabbit_memory_monitor defined by app rabbit
2019-06-12 15:40:29.364 [info] <0.226.0> Running boot step core_initialized defined by app rabbit
2019-06-12 15:40:29.364 [info] <0.226.0> Running boot step upgrade_queues defined by app rabbit
2019-06-12 15:40:29.398 [info] <0.226.0> message_store upgrades: 1 to apply
2019-06-12 15:40:29.398 [info] <0.226.0> message_store upgrades: Applying rabbit_variable_queue:move_messages_to_vhost_store
2019-06-12 15:40:29.399 [info] <0.226.0> message_store upgrades: No durable queues found. Skipping message store migration
2019-06-12 15:40:29.399 [info] <0.226.0> message_store upgrades: Removing the old message store data
2019-06-12 15:40:29.400 [info] <0.226.0> message_store upgrades: All upgrades applied successfully
2019-06-12 15:40:29.434 [info] <0.226.0> Running boot step rabbit_connection_tracking defined by app rabbit
2019-06-12 15:40:29.435 [info] <0.226.0> Running boot step rabbit_connection_tracking_handler defined by app rabbit
2019-06-12 15:40:29.435 [info] <0.226.0> Running boot step rabbit_exchange_parameters defined by app rabbit
2019-06-12 15:40:29.435 [info] <0.226.0> Running boot step rabbit_mirror_queue_misc defined by app rabbit
2019-06-12 15:40:29.436 [info] <0.226.0> Running boot step rabbit_policies defined by app rabbit
2019-06-12 15:40:29.437 [info] <0.226.0> Running boot step rabbit_policy defined by app rabbit
2019-06-12 15:40:29.437 [info] <0.226.0> Running boot step rabbit_queue_location_validator defined by app rabbit
2019-06-12 15:40:29.437 [info] <0.226.0> Running boot step rabbit_vhost_limit defined by app rabbit
2019-06-12 15:40:29.438 [info] <0.226.0> Running boot step rabbit_mgmt_reset_handler defined by app rabbitmq_management
2019-06-12 15:40:29.438 [info] <0.226.0> Running boot step rabbit_mgmt_db_handler defined by app rabbitmq_management_agent
2019-06-12 15:40:29.438 [info] <0.226.0> Management plugin: using rates mode 'basic'
2019-06-12 15:40:29.438 [info] <0.226.0> Running boot step recovery defined by app rabbit
2019-06-12 15:40:29.439 [info] <0.226.0> Running boot step load_definitions defined by app rabbitmq_management
2019-06-12 15:40:29.439 [info] <0.226.0> Running boot step empty_db_check defined by app rabbit
2019-06-12 15:40:29.440 [info] <0.226.0> Adding vhost '/'
2019-06-12 15:40:29.445 [info] <0.454.0> Making sure data directory '/opt/bitnami/rabbitmq/var/lib/rabbitmq/mnesia/rabbit@my-rabbitmq-0.my-rabbitmq-headless.my-namespace.svc.cluster.local/msg_stores/vhosts/628WB79CIFDYO9LJI6DKMI09L' for vhost '/' exists
2019-06-12 15:40:29.448 [info] <0.454.0> Starting message stores for vhost '/'
2019-06-12 15:40:29.448 [info] <0.458.0> Message store "628WB79CIFDYO9LJI6DKMI09L/msg_store_transient": using rabbit_msg_store_ets_index to provide index
2019-06-12 15:40:29.450 [info] <0.454.0> Started message store of type transient for vhost '/'
2019-06-12 15:40:29.451 [info] <0.461.0> Message store "628WB79CIFDYO9LJI6DKMI09L/msg_store_persistent": using rabbit_msg_store_ets_index to provide index
2019-06-12 15:40:29.452 [warning] <0.461.0> Message store "628WB79CIFDYO9LJI6DKMI09L/msg_store_persistent": rebuilding indices from scratch
2019-06-12 15:40:29.453 [info] <0.454.0> Started message store of type persistent for vhost '/'
2019-06-12 15:40:29.456 [info] <0.226.0> Creating user 'user'
2019-06-12 15:40:29.457 [info] <0.226.0> Setting user tags for user 'user' to [administrator]
2019-06-12 15:40:29.457 [info] <0.226.0> Setting permissions for 'user' in '/' to '.*', '.*', '.*'
2019-06-12 15:40:29.458 [info] <0.226.0> Running boot step rabbit_looking_glass defined by app rabbit
2019-06-12 15:40:29.458 [info] <0.226.0> Running boot step rabbit_core_metrics_gc defined by app rabbit
2019-06-12 15:40:29.458 [info] <0.226.0> Running boot step background_gc defined by app rabbit
2019-06-12 15:40:29.459 [info] <0.226.0> Running boot step connection_tracking defined by app rabbit
2019-06-12 15:40:29.461 [info] <0.226.0> Setting up a table for connection tracking on this node: 'tracked_connection_on_node_rabbit@my-rabbitmq-0.my-rabbitmq-headless.my-namespace.svc.cluster.local'
2019-06-12 15:40:29.464 [info] <0.226.0> Setting up a table for per-vhost connection counting on this node: 'tracked_connection_per_vhost_on_node_rabbit@my-rabbitmq-0.my-rabbitmq-headless.my-namespace.svc.cluster.local'
2019-06-12 15:40:29.464 [info] <0.226.0> Running boot step routing_ready defined by app rabbit
2019-06-12 15:40:29.464 [info] <0.226.0> Running boot step pre_flight defined by app rabbit
2019-06-12 15:40:29.465 [info] <0.226.0> Running boot step notify_cluster defined by app rabbit
2019-06-12 15:40:29.465 [info] <0.226.0> Running boot step networking defined by app rabbit
2019-06-12 15:40:29.468 [warning] <0.493.0> Setting Ranch options together with socket options is deprecated. Please use the new map syntax that allows specifying socket options separately from other options.
2019-06-12 15:40:29.469 [info] <0.507.0> started TCP listener on [::]:5672
2019-06-12 15:40:29.470 [info] <0.226.0> Running boot step direct_client defined by app rabbit
2019-06-12 15:40:29.476 [info] <0.559.0> Peer discovery: enabling node cleanup (will only log warnings). Check interval: 10 seconds.
2019-06-12 15:40:29.518 [info] <0.563.0> Management plugin: HTTP (non-TLS) listener started on port 15672
2019-06-12 15:40:29.518 [info] <0.669.0> Statistics database started.
2019-06-12 15:40:29.959 [info] <0.8.0> Server startup complete; 5 plugins started.
 * rabbitmq_management
 * rabbitmq_peer_discovery_k8s
 * rabbitmq_peer_discovery_common
 * rabbitmq_web_dispatch
 * rabbitmq_management_agent
 completed with 5 plugins.

@docktermj
Author

I see this line in the log:

2019-06-12 15:40:29.149 [info] <0.236.0> Free disk space is insufficient. Free bytes: 0. Limit: 50000000

I don't understand. The PV is sized at storage: 10Gi
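
A quick check of what actually backs that PV (a hedged diagnostic; /var/rabbitmq is the hostPath from the manifest above):

minikube ssh "df -h /var/rabbitmq"

This shows which filesystem inside the minikube VM the hostPath lands on and how much free space it reports.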

@docktermj
Author

In another test, we tried the stable/rabbitmq chart on the IBM Cloud Pak for Data Kubernetes system. The chart deployed, but no pods came up.

We did a

kubectl get pods --namespace zen --watch

Didn't see anything for the pod (i.e. no ContainerCreating, Init:0/1, PodInitializing, or Running status).

@docktermj
Author

Just to make sure I'm not missing something, I just performed

helm repo update

But it did not solve the issue.

@tompizmor
Collaborator

Hi @docktermj!

Thanks for all the detailed instructions! I was able to reproduce the issue in Minikube.

Can I ask why you create the PV and PVC manually? I tried deploying the chart with the default configuration and it worked fine, so I suspect the problem is the hostPath volume.

Regarding the issue on IBM Cloud Pak, could you run the following command to see why the StatefulSet is not creating the pods?

kubectl describe sts STATEFULSET-NAME

@docktermj
Author

Hi @tompizmor!

On the question of PV and PVC... I'm creating a series of "reference implementations" at https://github.com/Senzing/kubernetes-demo to help users understand how to bring up a Docker formation using minikube. The best teaching mechanism I know of so far is to manually create the PV/PVC. Do you have a suggestion for improvement?

I, too, was able to get the default to work, but I didn't think it was sufficient for a "teaching exercise".

I ran into a similar issue in #14390 and it has been fixed thanks to @juan131.

@docktermj
Author

docktermj commented Jun 18, 2019

@rgahockey As described 2 posts up, can you run the following command:

kubectl describe sts --namespace zen senzing-rabbitmq

...at least I think that's the STATEFULSET-NAME @tompizmor is asking for.

@tompizmor , As background for you, the instructions we are crafting are at
https://github.com/Senzing/ibm-icp4d-guide/blob/master/docs/helm-rabbitmq-db2/README.md

@rgahockey

[root@rga-icp4d01-icpmaster-0 ~]# kubectl describe sts --namespace zen senzing-rabbitmq
Name:               senzing-rabbitmq
Namespace:          zen
CreationTimestamp:  Wed, 12 Jun 2019 09:35:32 -0500
Selector:           app=rabbitmq,release=senzing-rabbitmq
Labels:             app=rabbitmq
                    chart=rabbitmq-6.0.2
                    heritage=Tiller
                    release=senzing-rabbitmq
Annotations:        <none>
Replicas:           1 desired | 0 total
Update Strategy:    RollingUpdate
Pods Status:        0 Running / 0 Waiting / 0 Succeeded / 0 Failed
Pod Template:
  Labels:           app=rabbitmq
                    chart=rabbitmq-6.0.2
                    release=senzing-rabbitmq
  Service Account:  senzing-rabbitmq
  Init Containers:
   volume-permissions:
    Image:      docker.io/bitnami/minideb:latest
    Port:       <none>
    Host Port:  <none>
    Command:
      /bin/chown
      -R
      1001:1001
      /opt/bitnami/rabbitmq/var/lib/rabbitmq
    Environment:  <none>
    Mounts:
      /opt/bitnami/rabbitmq/var/lib/rabbitmq from data (rw)
  Containers:
   rabbitmq:
    Image:       docker.io/bitnami/rabbitmq:3.7.15-debian-9-r25
    Ports:       4369/TCP, 5672/TCP, 25672/TCP, 15672/TCP
    Host Ports:  0/TCP, 0/TCP, 0/TCP, 0/TCP
    Command:
      bash
      -ec
      mkdir -p /opt/bitnami/rabbitmq/.rabbitmq/
      mkdir -p /opt/bitnami/rabbitmq/etc/rabbitmq/
      #persist the erlang cookie in both places for server and cli tools
      echo $RABBITMQ_ERL_COOKIE > /opt/bitnami/rabbitmq/var/lib/rabbitmq/.erlang.cookie
      cp /opt/bitnami/rabbitmq/var/lib/rabbitmq/.erlang.cookie /opt/bitnami/rabbitmq/.rabbitmq/
      #change permission so only the user has access to the cookie file
      chmod 600 /opt/bitnami/rabbitmq/.rabbitmq/.erlang.cookie /opt/bitnami/rabbitmq/var/lib/rabbitmq/.erlang.cookie
      #copy the mounted configuration to both places
      cp  /opt/bitnami/rabbitmq/conf/* /opt/bitnami/rabbitmq/etc/rabbitmq
      # Apply resources limits
      ulimit -n "${RABBITMQ_ULIMIT_NOFILES}"
      #replace the default password that is generated
      sed -i "s/CHANGEME/$RABBITMQ_PASSWORD/g" /opt/bitnami/rabbitmq/etc/rabbitmq/rabbitmq.conf
      #api check for probes
      cat > /opt/bitnami/rabbitmq/sbin/rabbitmq-api-check <<EOF
      #!/bin/sh
      set -e
      URL=\$1
      EXPECTED=\$2
      ACTUAL=\$(curl --silent --show-error --fail "\${URL}")
      echo "\${ACTUAL}"
      test "\${EXPECTED}" = "\${ACTUAL}"
      EOF
      chmod a+x /opt/bitnami/rabbitmq/sbin/rabbitmq-api-check
      exec rabbitmq-server
    Environment:
      BITNAMI_DEBUG:                        false
      MY_POD_IP:                             (v1:status.podIP)
      MY_POD_NAME:                           (v1:metadata.name)
      MY_POD_NAMESPACE:                      (v1:metadata.namespace)
      K8S_SERVICE_NAME:                     senzing-rabbitmq-headless
      K8S_ADDRESS_TYPE:                     hostname
      RABBITMQ_NODENAME:                    rabbit@$(MY_POD_NAME).$(K8S_SERVICE_NAME).$(MY_POD_NAMESPACE).svc.cluster.local
      K8S_HOSTNAME_SUFFIX:                  .$(K8S_SERVICE_NAME).$(MY_POD_NAMESPACE).svc.cluster.local
      RABBITMQ_LOGS:                        -
      RABBITMQ_ULIMIT_NOFILES:              65536
      RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS:  +S 2:1
      RABBITMQ_USE_LONGNAME:                true
      RABBITMQ_ERL_COOKIE:                  <set to the key 'rabbitmq-erlang-cookie' in secret 'senzing-rabbitmq'>  Optional: false
      RABBITMQ_PASSWORD:                    <set to the key 'rabbitmq-password' in secret 'senzing-rabbitmq'>       Optional: false
    Mounts:
      /opt/bitnami/rabbitmq/conf from config-volume (rw)
      /opt/bitnami/rabbitmq/var/lib/rabbitmq from data (rw)
  Volumes:
   config-volume:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      senzing-rabbitmq-config
    Optional:  false
   data:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  user-home-pvc
    ReadOnly:   false
Volume Claims:  <none>
Events:
  Type     Reason        Age                       From                    Message
  ----     ------        ----                      ----                    -------
  Warning  FailedCreate  3m29s (x18059 over 6d2h)  statefulset-controller  create Pod senzing-rabbitmq-0 in StatefulSet senzing-rabbitmq failed error: pods "senzing-rabbitmq-0" is forbidden: unable to validate against any pod security policy: [spec.initContainers[0].securityContext.runAsUser: Invalid value: 0: running with the root UID is forbidden]

@docktermj
Author

So for IBM Cloud Pak, I think this is the story:

The RabbitMQ chart needs to change ownership of the files in the "persistence path".
It does that with an initContainer in statefulset.yaml.

However, the pod security policies in IBM Cloud Pak prevent that: the init container runs as root (runAsUser: 0), which the policy forbids, so the pod is never created.
So @rgahockey, you and I can continue down that path (which isn't really part of this GitHub issue).
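
If it helps, here is a hedged sketch of values that might satisfy such a policy: disable the root init container and lean on the chart's securityContext instead. volumePermissions.enabled is the key already used above; the securityContext keys are my reading of the chart's defaults, so double-check values.yaml:

volumePermissions:
  enabled: false

securityContext:
  enabled: true
  fsGroup: 1001
  runAsUser: 1001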

@docktermj
Author

@tompizmor, is there anything else we can do or provide to help get the original issue resolved?

@rgahockey

rgahockey commented Jun 18, 2019

It appears that back on Dec 19, 2018 there was already supposed to be a way to set the SecurityContext ...

Hi @obeyler @desaintmartin

The initContainer was added to avoid issues on some K8s distributions where SecurityContext wasn't respected. Therefore, we needed a way to ensure the volumes had the proper permissions.

What do you think about adding a parameter such as initContainer.enabled that can be set to true/false so you can decide whether to install the chart with it or not? cc/ @tompizmor
@tompizmor
Collaborator
tompizmor commented on Dec 19, 2018

Agree. We should have enough flexibility to enable/disable both the securityContext and the initContainer.

@obeyler
Contributor

obeyler commented Jun 18, 2019

@rgahockey I think it's mandatory to have a way to disable the initContainer: if you run it on a secured K8s cluster with a runAsNonRoot PodSecurityPolicy, an init container that runs as root won't be accepted.
So I agree that an initContainer.enabled parameter is a good idea.
I'm less sure about the need for such a flag for the SecurityContext. In the end, the SecurityContext is always present in the YAML file (even as an empty object).

@docktermj
Author

docktermj commented Jun 18, 2019

Good conversation. Can it be moved to a new issue?

This issue is for Liveness and Readiness probes on minikube.

@desaintmartin
Collaborator

Isn't this already the case?
.Values.volumePermissions.enabled already allows you to enable (or disable) the initContainer.
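
For reference, that is the same key used in the reproduction values above, just flipped off:

volumePermissions:
  enabled: false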

@tompizmor
Collaborator

@docktermj

The best teaching mechanism I know of so far is to manually create PV/PVC. Do you have a suggestion for improvement?

To me it seems odd to create the PV manually, since you would usually have a storage provisioner that does that automatically for you.

The original issue was that liveness and readiness probes fail when you use an already existing hostPath volume. In that case, you can see the following error:

2019-06-12 15:40:29.149 [info] <0.236.0> Free disk space is insufficient. Free bytes: 0. Limit: 50000000

My take on this is that RabbitMQ is not able to measure the available space when you mount a hostPath volume. However, right now I have no idea how to debug this.
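
One hedged way to test that theory is to compare what the broker reports with what the filesystem reports from inside the same container (rabbitmqctl status prints the disk_free and disk_free_limit values the alarm is based on):

kubectl exec --namespace my-namespace my-rabbitmq-0 -- rabbitmqctl status
kubectl exec --namespace my-namespace my-rabbitmq-0 -- \
  df -B1 /opt/bitnami/rabbitmq/var/lib/rabbitmq

If df shows plenty of free bytes while the broker reports a disk_free of 0, that points at the measurement rather than the actual space.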

@docktermj
Author

@tompizmor I agree that a storage provider would set up the PV, but in my case I'm educating folks using minikube, so I need to demonstrate the PV/PVC concept.

Yikes! Are we dead in the water? You mentioned that you could reproduce it. Is there anything I can do to move the ball forward?

@miguelaeh
Collaborator

miguelaeh commented Jun 20, 2019

Hi @docktermj ,
Why not use a single Pod mounting volumes to demonstrate the PV/PVC concept?
Most Helm charts are prepared for dynamic provisioning. Did you try another chart to find out whether this only happens with RabbitMQ or is a general issue?
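
A minimal sketch of that single-Pod approach, reusing the PVC created earlier (the pod name and image are illustrative):

cat <<EOT > pvc-demo-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: pvc-demo
  namespace: my-namespace
spec:
  containers:
  - name: demo
    image: busybox
    # write a file onto the PV, then idle so the volume can be inspected
    command: ["sh", "-c", "echo hello > /data/hello.txt && sleep 3600"]
    volumeMounts:
    - name: data
      mountPath: /data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: rabbitmq-claim
EOT
kubectl create -f pvc-demo-pod.yaml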

@docktermj
Author

Hi @tompizmor!

Is there any plan to fix this issue on minikube?

In the meantime, I'm using:

livenessProbe:
  enabled: false

readinessProbe:
  enabled: false

Which is probably not an illustration of "best practice".

@stale

stale bot commented Aug 7, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.

@stale stale bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 7, 2019
@docktermj
Author

Any update on

livenessProbe:
  enabled: true

readinessProbe:
  enabled: true

issues on minikube?

@stale stale bot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 9, 2019
@stale

stale bot commented Sep 8, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.

@stale stale bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 8, 2019
@stale

stale bot commented Sep 22, 2019

This issue is being automatically closed due to inactivity.

@stale stale bot closed this as completed Sep 22, 2019
@docktermj
Author

If this issue is not fixed, please reopen.

@docktermj
Author

Hi @tompizmor!

Why is this a stale issue? Has it been fixed?

@juan131
Collaborator

juan131 commented Oct 4, 2019

/reopen
