Clean deployment on K3s with 2.14.0 fails with postgres 15 #1806

Closed
3 tasks done
apiening opened this issue Apr 3, 2024 · 4 comments

Comments

@apiening

apiening commented Apr 3, 2024

Please confirm the following

  • I agree to follow this project's code of conduct.
  • I have checked the current issues for duplicates.
  • I understand that the AWX Operator is open source software provided for free and that I might not receive a timely response.

Bug Summary

The deployment of awx-postgres-15-0 fails with a CrashLoopBackOff.
This is a fresh K3s deployment with nothing else running; it is not an upgrade.

AWX Operator version

2.14.0

AWX version

v24.1.0

Kubernetes platform

kubernetes

Kubernetes/Platform version

K3S Client Version: v1.28.7+k3s1, Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3, Server Version: v1.28.7+k3s1

Modifications

no

Steps to reproduce

Deploy using kustomization.yaml:

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  # Find the latest tag here: https://github.com/ansible/awx-operator/releases
  - github.com/ansible/awx-operator/config/default?ref=2.14.0
  - awx.yaml

# Set the image tags to match the git version from above
images:
  - name: quay.io/ansible/awx-operator
    newTag: 2.14.0

# Specify a custom namespace in which to install AWX
namespace: awx

and awx.yaml:

---
apiVersion: awx.ansible.com/v1beta1
kind: AWX
metadata:
  name: awx
spec:
  service_type: nodeport
  nodeport_port: 3000

with

kubectl kustomize | kubectl apply -f -

Expected results

Successful deployment.

Actual results

The Postgres pod is stuck in CrashLoopBackOff:

# kubectl get pods -n awx
NAME                                               READY   STATUS             RESTARTS         AGE
awx-operator-controller-manager-6458cd4798-w6vwx   2/2     Running            0                91m
awx-postgres-15-0                                  0/1     CrashLoopBackOff   20 (4m40s ago)   82m

Additional information

This is a clean deployment with nothing else running.

Operator Logs

No response

@fosterseth
Member

@apiening can you provide the output of kubectl describe awx-postgres-15-0 so we can see the reason for the crashloopbackoff?

@apiening
Author

apiening commented Apr 3, 2024

Hi @fosterseth,
sure, but there is no useful output:

# kubectl describe awx-postgres-15-0
error: the server doesn't have a resource type "awx-postgres-15-0"

@d-rupp

d-rupp commented Apr 9, 2024

I believe the correct query is kubectl describe pods/awx-postgres-15-0

Here is what happens for me. I am also trying to update to 2.14.0, with K3s and Longhorn.

% kubectl describe pods/awx-postgres-15-0
Name:             awx-postgres-15-0
Namespace:        awx
Priority:         0
Service Account:  default
Node:             ks2.dev/10.22.33.224
Start Time:       Tue, 09 Apr 2024 14:44:38 +0200
Labels:           app.kubernetes.io/component=database
                  app.kubernetes.io/instance=postgres-15-awx
                  app.kubernetes.io/managed-by=awx-operator
                  app.kubernetes.io/name=postgres-15
                  app.kubernetes.io/part-of=awx
                  apps.kubernetes.io/pod-index=0
                  controller-revision-hash=awx-postgres-15-867686c85b
                  statefulset.kubernetes.io/pod-name=awx-postgres-15-0
Annotations:      <none>
Status:           Running
IP:               10.42.1.64
IPs:
  IP:           10.42.1.64
Controlled By:  StatefulSet/awx-postgres-15
Containers:
  postgres:
    Container ID:  containerd://32d74bbdd5affb31d729fac4f21968c0f163b6c4dcfea6fcc9ad391f5142e110
    Image:         quay.io/sclorg/postgresql-15-c9s:latest
    Image ID:      quay.io/sclorg/postgresql-15-c9s@sha256:e94a89dfb3414a6eb21a716cc1f999c576e6c966351798c5b86dce06cf429272
    Port:          5432/TCP
    Host Port:     0/TCP
    Command:
      run-postgresql
    Args:
      -c
      max_locks_per_transaction=1024
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Tue, 09 Apr 2024 14:50:29 +0200
      Finished:     Tue, 09 Apr 2024 14:50:29 +0200
    Ready:          False
    Restart Count:  6
    Limits:
      cpu:     4
      memory:  8Gi
    Requests:
      cpu:     2
      memory:  4Gi
    Environment:
      POSTGRESQL_DATABASE:        <set to the key 'database' in secret 'awx-postgres-configuration'>  Optional: false
      POSTGRESQL_USER:            <set to the key 'username' in secret 'awx-postgres-configuration'>  Optional: false
      POSTGRESQL_PASSWORD:        <set to the key 'password' in secret 'awx-postgres-configuration'>  Optional: false
      POSTGRES_DB:                <set to the key 'database' in secret 'awx-postgres-configuration'>  Optional: false
      POSTGRES_USER:              <set to the key 'username' in secret 'awx-postgres-configuration'>  Optional: false
      POSTGRES_PASSWORD:          <set to the key 'password' in secret 'awx-postgres-configuration'>  Optional: false
      PGDATA:                     /var/lib/pgsql/data/pgdata
      POSTGRES_INITDB_ARGS:       --auth-host=scram-sha-256
      POSTGRES_HOST_AUTH_METHOD:  scram-sha-256
    Mounts:
      /var/lib/pgsql/data from postgres-15 (rw,path="data")
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-jfhm2 (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  postgres-15:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  postgres-15-awx-postgres-15-0
    ReadOnly:   false
  kube-api-access-jfhm2:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason                  Age                    From                     Message
  ----     ------                  ----                   ----                     -------
  Warning  FailedScheduling        8m28s                  default-scheduler        0/6 nodes are available: pod has unbound immediate PersistentVolumeClaims. preemption: 0/6 nodes are available: 6 Preemption is not helpful for scheduling..
  Warning  FailedScheduling        8m27s                  default-scheduler        0/6 nodes are available: pod has unbound immediate PersistentVolumeClaims. preemption: 0/6 nodes are available: 6 Preemption is not helpful for scheduling..
  Normal   Scheduled               8m25s                  default-scheduler        Successfully assigned awx/awx-postgres-15-0 to ks2.dev
  Normal   SuccessfulAttachVolume  8m15s                  attachdetach-controller  AttachVolume.Attach succeeded for volume "pvc-55bbd673-1b09-4df8-bad8-6e2fadc4c56d"
  Normal   Pulled                  6m50s (x5 over 8m12s)  kubelet                  Container image "quay.io/sclorg/postgresql-15-c9s:latest" already present on machine
  Normal   Created                 6m50s (x5 over 8m12s)  kubelet                  Created container postgres
  Normal   Started                 6m49s (x5 over 8m12s)  kubelet                  Started container postgres
  Warning  BackOff                 3m2s (x24 over 8m10s)  kubelet                  Back-off restarting failed container postgres in pod awx-postgres-15-0_awx(c85635e9-1125-45eb-aec8-38ffd742a948)

% kubectl logs pods/awx-postgres-15-0
mkdir: cannot create directory '/var/lib/pgsql/data/userdata': Permission denied

Is this maybe the same issue as #1813, and related to #1790?
I do not have a custom postgres_data_dir configured.

@TheRealHaoLiu
Member

mkdir: cannot create directory '/var/lib/pgsql/data/userdata': Permission denied

Duplicate of #1770; see #1770 (comment) for a workaround.

In the next release you can use the postgres init container to initialize the PGDATA directory; see #1805.
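
For readers landing here, a minimal sketch of what the postgres init container approach mentioned above might look like in awx.yaml. It makes several assumptions that are not confirmed anywhere in this thread: that an operator release newer than 2.14.0 is installed, that it exposes the `postgres_data_volume_init` and `postgres_init_container_commands` spec fields, and that the postgres process in the `quay.io/sclorg/postgresql-15-c9s` image runs as UID 26. Check the operator documentation for your version before applying it.

```yaml
---
apiVersion: awx.ansible.com/v1beta1
kind: AWX
metadata:
  name: awx
spec:
  # Assumption: both fields below are only available in operator
  # releases after 2.14.0; they are not part of the original report.
  postgres_data_volume_init: true
  # Assumption: UID 26 is the postgres user in the sclorg image.
  # The path matches the mount point shown in the describe output above.
  postgres_init_container_commands: |
    chown 26:0 /var/lib/pgsql/data
    chmod 700 /var/lib/pgsql/data
```

The idea is that the init container runs with enough privileges to fix ownership of the mounted volume before the postgres container starts, which is what the `mkdir: ... Permission denied` message points at.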
