play kube fails while started by systemd #2752

Closed
ikke-t opened this issue Mar 23, 2019 · 3 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature. locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments.

ikke-t commented Mar 23, 2019

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind feature

Description

I'm not sure whether this is user ignorance (likely) or an error somewhere; I can't pinpoint where the error is. Perhaps you can. If I start AWX from the command line with podman play kube awx.yaml, it works perfectly. But if I put the same command into a systemd unit, all containers die in less than two seconds. It is exactly the same command, and it happens every time.

Steps to reproduce the issue:

  1. Create the systemd service file:
cat >/etc/systemd/system/awx-container-pod.service<<EOF 
[Unit]
Description=AWX Podman Container
After=network.target

[Service]
Type=simple
TimeoutStartSec=15
ExecStartPre=-/usr/bin/podman pod rm -f awx
User=root

ExecStart=/usr/bin/podman play kube /etc/containers/awx.yaml

ExecReload=-/usr/bin/podman pod stop awx
ExecReload=-/usr/bin/podman pod rm -f awx
ExecStop=-/usr/bin/podman pod stop awx
Restart=on-failure
RestartSec=30

[Install]
WantedBy=multi-user.target
EOF

Note: I've tried this with both Type=simple and Type=forking.

  2. Reload systemd units:
systemctl daemon-reload
  3. Create awx.yaml:
cat >/etc/containers/awx.yaml<<EOF 
apiVersion: v1
kind: Pod
metadata:
  labels:
    app: awx
  name: awx
spec:
  #
  # define exported volumes for permanent data
  #
  volumes:
  - name: awx-data-volume
    hostPath:
      path: /tmp/awx_data
      type: Directory
  - name: db-volume
    hostPath:
      path: /tmp/awx_db
      type: Directory
  containers:
  #
  # postgres container
  #
  - command:
    - docker-entrypoint.sh
    - postgres
    env:
    - name: PATH
      value: /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/lib/postgresql/9.6/bin
    - name: POSTGRES_USER
      value: awx
    - name: POSTGRES_DB
      value: awx
    - name: PGDATA
      value: /var/lib/postgresql/data/pgdata
    - name: POSTGRES_PASSWORD
      value: awxpass
    image: docker.io/library/postgres:9.6
    name: postgres
    volumeMounts:
    - mountPath: /var/lib/postgresql/data/pgdata:z
      name: db-volume
  #
  # memcached container
  #
  - command:
    - docker-entrypoint.sh
    - memcached
    env:
    image: docker.io/library/memcached:alpine
    name: memcached
  #
  # awx-web container
  #
  - command:
    - /tini
    - --
    - /bin/sh
    - -c
    - /usr/bin/launch_awx.sh
    env:
    - name: AWX_ADMIN_USER
      value: admin
    - name: AWX_ADMIN_PASSWORD
      value: foobar
    - name: HOSTNAME
      value: awxweb
    - name: DATABASE_NAME
      value: awx
    - name: DATABASE_USER
      value: awx
    - name: DATABASE_PASSWORD
      value: awxpass
    - name: DATABASE_PORT
      value: "5432"
    - name: DATABASE_HOST
      value: 127.0.0.1
    - name: RABBITMQ_HOST
      value: 127.0.0.1
    - name: RABBITMQ_VHOST
      value: awx
    - name: RABBITMQ_USER
      value: guest
    - name: RABBITMQ_PASSWORD
      value: guest
    - name: SECRET_KEY
      value: awxsecret
    - name: RABBITMQ_PORT
      value: "5672"
    - name: MEMCACHED_HOST
      value: 127.0.0.1
    - name: MEMCACHED_PORT
      value: "11211"
    image: docker.io/ansible/awx_web:latest
    name: awxweb
    workingDir: /var/lib/awx
    volumeMounts:
    - mountPath: /var/lib/awx/projects:z
      name: awx-data-volume
    ports:
      - containerPort: 8052
        hostPort: 8052
        protocol: TCP
  #
  # awx-task container
  #
  - command:
    - /tini
    - --
    - /bin/sh
    - -c
    - /usr/bin/launch_awx_task.sh
    env:
    - name: AWX_ADMIN_USER
      value: admin
    - name: AWX_ADMIN_PASSWORD
      value: foobar
    - name: HOSTNAME
      value: awx
    - name: DATABASE_NAME
      value: awx
    - name: DATABASE_USER
      value: awx
    - name: DATABASE_PASSWORD
      value: awxpass
    - name: DATABASE_PORT
      value: "5432"
    - name: DATABASE_HOST
      value: 127.0.0.1
    - name: RABBITMQ_HOST
      value: 127.0.0.1
    - name: RABBITMQ_VHOST
      value: awx
    - name: RABBITMQ_USER
      value: guest
    - name: RABBITMQ_PASSWORD
      value: awxrabbit
    - name: SECRET_KEY
      value: awxsecret
    - name: RABBITMQ_PORT
      value: "5672"
    - name: MEMCACHED_HOST
      value: 127.0.0.1
    - name: MEMCACHED_PORT
      value: "11211"
    image: docker.io/ansible/awx_task:latest
    name: awxtask
    workingDir: /var/lib/awx
    volumeMounts:
    - mountPath: /var/lib/awx/projects:z
      name: awx-data-volume
  #
  # rabbitmq container
  #
  - command:
    - docker-entrypoint.sh
    - /bin/sh
    - -c
    - /launch.sh
    env:
    - name: PATH
      value: /opt/rabbitmq/sbin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
    - name: RABBITMQ_DEFAULT_VHOST
      value: awx
    - name: RABBITMQ_ERLANG_COOKIE
      value: cookiemonster
    - name: RABBITMQ_DEFAULT_USER
      value: guest
    - name: RABBITMQ_DEFAULT_PASS
      value: guest
    image: docker.io/ansible/awx_rabbitmq:3.7.4
    name: rabbitmq
EOF
  4. Create the temporary directories for the volumes:
mkdir /tmp/{awx_db,awx_data}
  5. Start AWX:
systemctl start awx-container-pod

Describe the results you received:

All containers die within a couple of seconds, every time.

CONTAINER ID  IMAGE                                 COMMAND               CREATED        STATUS                      PORTS                   NAMES
5313f206b6e9  docker.io/ansible/awx_rabbitmq:3.7.4  docker-entrypoint...  7 seconds ago  Exited (137) 5 seconds ago  0.0.0.0:8052->8052/tcp  rabbitmq
8031263a6309  docker.io/ansible/awx_task:latest     /tini -- /bin/sh ...  7 seconds ago  Exited (137) 5 seconds ago  0.0.0.0:8052->8052/tcp  awxtask
d55d157845bb  docker.io/ansible/awx_web:latest      /tini -- /bin/sh ...  7 seconds ago  Exited (137) 4 seconds ago  0.0.0.0:8052->8052/tcp  awxweb
fbc04f717cc6  docker.io/library/memcached:alpine    docker-entrypoint...  7 seconds ago  Exited (137) 4 seconds ago  0.0.0.0:8052->8052/tcp  memcached
c5b9667ec1ff  docker.io/library/postgres:9.6        docker-entrypoint...  7 seconds ago  Exited (137) 4 seconds ago  0.0.0.0:8052->8052/tcp  postgres
2d23c1b06e6e  k8s.gcr.io/pause:3.1                                        7 seconds ago  Exited (0) 5 seconds ago    0.0.0.0:8052->8052/tcp  1bbff0a8011d-infra
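
For reference, an exit status of 137 conventionally means the process was terminated by signal 9 (SIGKILL), because signal deaths are reported as 128 plus the signal number. That suggests the containers were killed from outside rather than crashing on their own. A minimal sketch of the arithmetic follows; the helper name signal_from_status is made up for illustration and is not part of podman:

```shell
# Hypothetical helper: map an exit status to the signal number it encodes.
# Statuses above 128 mean "terminated by signal (status - 128)";
# anything else is an ordinary exit, reported here as 0 (no signal).
signal_from_status() {
    status=$1
    if [ "$status" -gt 128 ]; then
        echo $((status - 128))
    else
        echo 0
    fi
}

signal_from_status 137   # prints 9, i.e. SIGKILL
```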

Describe the results you expected:

If I do the same command without systemd, it all works fine:

[root@ikke-fedora ~]# podman play kube /etc/containers/awx.yaml 
8c316e584aa2d6ee02bcec66d17fb56d9c839ab46603388211c4c747dcd6feb4
awx-data-volume
db-volume
3daa3648ff7255906fc397a5eaf6dc3502ed493d3c3c005501f6a273c03f2afa
e2877939e404999a94e23181c0eb8dfdf0f52bddecfcc2f1620008360c1bdcf5
4d74b767d0a9886e27cb86653d2165abf82bbbefc8c4f6074f7b9a4e47e3c6a2
aa6a50be8854b4616919a4ec8508e89441ed7d0e75e7b2691cbaf478c322522b
b94847ad40e64a728de54c2d6c8f4781978057e0c4441fe952e444a508b83188
[root@ikke-fedora ~]# podman ps -a
CONTAINER ID  IMAGE                                 COMMAND               CREATED         STATUS            PORTS                   NAMES
b94847ad40e6  docker.io/ansible/awx_rabbitmq:3.7.4  docker-entrypoint...  10 seconds ago  Up 8 seconds ago  0.0.0.0:8052->8052/tcp  rabbitmq
aa6a50be8854  docker.io/ansible/awx_task:latest     /tini -- /bin/sh ...  10 seconds ago  Up 8 seconds ago  0.0.0.0:8052->8052/tcp  awxtask
4d74b767d0a9  docker.io/ansible/awx_web:latest      /tini -- /bin/sh ...  10 seconds ago  Up 9 seconds ago  0.0.0.0:8052->8052/tcp  awxweb
e2877939e404  docker.io/library/memcached:alpine    docker-entrypoint...  10 seconds ago  Up 9 seconds ago  0.0.0.0:8052->8052/tcp  memcached
3daa3648ff72  docker.io/library/postgres:9.6        docker-entrypoint...  10 seconds ago  Up 9 seconds ago  0.0.0.0:8052->8052/tcp  postgres
58299d287206  k8s.gcr.io/pause:3.1                                        10 seconds ago  Up 9 seconds ago  0.0.0.0:8052->8052/tcp  8c316e584aa2-infra

Additional information you deem important (e.g. issue happens only occasionally):

Happens every time. I even disabled SELinux as a test, with no difference. I don't spot anything unusual in the journal either, except that the pods stop.

Output of podman version:

Name        : podman
Epoch       : 2
Version     : 1.2.0
Release     : 24.dev.git0458daf.fc31

Output of podman info --debug:

[root@ikke-fedora ~]# podman info --debug
debug:
  compiler: gc
  git commit: ""
  go version: go1.12.1
  podman version: 1.2.0-dev
host:
  BuildahVersion: 1.8-dev
  Conmon:
    package: podman-1.2.0-24.dev.git0458daf.fc31.x86_64
    path: /usr/libexec/podman/conmon
    version: 'conmon version 1.12.0-dev, commit: 96168e342b58efb8f50a050aa22700ac59854f0e'
  Distribution:
    distribution: fedora
    version: "29"
  MemFree: 361078784
  MemTotal: 4068405248
  OCIRuntime:
    package: runc-1.0.0-68.dev.git6635b4f.fc29.x86_64
    path: /usr/bin/runc
    version: |-
      runc version 1.0.0-rc6+dev
      commit: ef9132178ccc3d2775d4fb51f1e431f30cac1398-dirty
      spec: 1.0.1-dev
  SwapFree: 0
  SwapTotal: 0
  arch: amd64
  cpus: 1
  hostname: ikke-fedora
  kernel: 4.18.16-300.fc29.x86_64
  os: linux
  rootless: false
  uptime: 12h 43m 32.82s (Approximately 0.50 days)
insecure registries:
  registries: []
registries:
  registries:
  - docker.io
  - registry.fedoraproject.org
  - quay.io
  - registry.access.redhat.com
  - registry.centos.org
store:
  ConfigFile: /etc/containers/storage.conf
  ContainerStore:
    number: 6
  GraphDriverName: overlay
  GraphOptions:
  - overlay.mountopt=nodev
  GraphRoot: /var/lib/containers/storage
  GraphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Using metacopy: "false"
  ImageStore:
    number: 6
  RunRoot: /var/run/containers/storage
  VolumePath: /var/lib/containers/storage/volumes

Additional environment details (AWS, VirtualBox, physical, etc.):

KVM guest, fedora 29 on RHEL7.6.

@openshift-ci-robot openshift-ci-robot added the kind/feature Categorizes issue or PR as related to a new feature. label Mar 23, 2019

ikke-t commented Mar 23, 2019

Note that normally I do run single containers fine with systemd.


ikke-t commented Mar 23, 2019

Another funny issue here: if you move the rabbitmq container anywhere earlier than last in the container list, it dies due to some hostname problem. When it is listed last, it always starts successfully.


ikke-t commented Mar 23, 2019

Damn, it always helps to explain an issue to someone, and then you find the reason yourself. I solved this by RTFM. Adding this to the [Service] section of the systemd service file fixes it: RemainAfterExit=yes
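
For anyone landing here later, the fix can be sketched as the unit from the report with that one line added. The likely explanation, which is an assumption and was not confirmed in this thread, is that podman play kube returns as soon as the pod is created, so without RemainAfterExit=yes systemd considers the service finished and runs the ExecStop= command, which stops the pod. The wrapper function write_awx_unit is a hypothetical helper used only to keep the example side-effect free until it is called:

```shell
# Hypothetical helper: write the unit from the report, plus RemainAfterExit=yes,
# to the path given as $1. Nothing is written until the function is called.
write_awx_unit() {
    cat >"$1" <<'EOF'
[Unit]
Description=AWX Podman Container
After=network.target

[Service]
Type=simple
# Keep the unit "active" after podman play kube has returned, so that
# systemd does not treat the service as stopped.
RemainAfterExit=yes
TimeoutStartSec=15
ExecStartPre=-/usr/bin/podman pod rm -f awx
User=root

ExecStart=/usr/bin/podman play kube /etc/containers/awx.yaml

ExecReload=-/usr/bin/podman pod stop awx
ExecReload=-/usr/bin/podman pod rm -f awx
ExecStop=-/usr/bin/podman pod stop awx
Restart=on-failure
RestartSec=30

[Install]
WantedBy=multi-user.target
EOF
}

# Usage (as root):
#   write_awx_unit /etc/systemd/system/awx-container-pod.service
#   systemctl daemon-reload
#   systemctl restart awx-container-pod
```

Only the RemainAfterExit=yes line differs from the original unit; everything else is kept as reported.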

@ikke-t ikke-t closed this as completed Mar 23, 2019
@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 24, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 24, 2023