
Molecule leaves zombie agetty with 100% CPU load #1104

Closed
t2d opened this issue Jan 25, 2018 · 12 comments

Comments

@t2d
Contributor

t2d commented Jan 25, 2018

Issue Type

  • Bug report

Molecule and Ansible details

ansible 2.4.2.0
molecule, version 2.7.0
  • Molecule installation method: pip
  • Ansible installation method: pip

Desired Behaviour

All processes started during testing should be killed afterwards.

Actual Behaviour (Bug report only)

An agetty process remains on the host system, pegged at 100% CPU. I can only solve this by manually changing every Dockerfile.j2 as proposed in moby/moby#4040 (comment)
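As a stopgap on the affected host, the runaway process can be found and killed by hand. A minimal sketch — the sample line below is canned (the PID is illustrative, mirroring the `top` output quoted later in the thread); on a real host you would pipe `ps -eo pid,pcpu,comm` into the same filter:

```shell
# Stopgap: find runaway agetty PIDs from ps-style output.
# The sample is canned for illustration; on a real host use:
#   ps -eo pid,pcpu,comm | awk '$3 == "agetty" && $2 > 90 {print $1}'
sample='13553 99.7 agetty
13600  0.1 sshd'
pids=$(printf '%s\n' "$sample" | awk '$3 == "agetty" && $2 > 90 {print $1}')
echo "$pids"   # the PID(s) to kill, e.g. with: sudo kill -9 $pids
```

This only removes the symptom; the Dockerfile workaround discussed below addresses the cause.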

@retr0h
Contributor

retr0h commented Jan 25, 2018

The agetty process is running on your host OS?

@t2d
Contributor Author

t2d commented Jan 25, 2018

Yes. I'm starting a Vagrant box like this:

# -*- mode: ruby -*-
# vi: set ft=ruby :

VAGRANTFILE_API_VERSION = "2"

$molecule_prep_script = <<SCRIPT
sudo apt-get update
sudo apt-get install -y apt-transport-https ca-certificates curl
curl -fsSL https://download.docker.com/linux/$(. /etc/os-release; echo "$ID")/gpg | sudo apt-key add -
echo "deb [arch=amd64] https://download.docker.com/linux/debian $(lsb_release -cs) stable" | sudo tee -a /etc/apt/sources.list.d/docker.list
sudo apt-get update
sudo apt-get install -y python-pip docker-ce
sudo pip install molecule docker-py
SCRIPT

Vagrant.configure(VAGRANTFILE_API_VERSION) do |config|
  config.ssh.insert_key = false
  
  # VirtualBox.
  config.vm.provider :virtualbox do |v|
    v.memory = 1024
    v.cpus = 3
    v.customize ["modifyvm", :id, "--natdnshostresolver1", "on"]
    v.customize ["modifyvm", :id, "--ioapic", "on"]
  end

  # Debian Stretch
  config.vm.define "stretch" do |stretch|
    stretch.vm.hostname = "stretch"
    stretch.vm.box = "debian/stretch64"
    stretch.vm.network "private_network", ip: "192.168.33.25"
  end

  # prepare for molecule
  config.vm.provision "shell", inline: $molecule_prep_script
end

and inside it I run molecule test on https://github.com/systemli/ansible-sshd.
As soon as the Docker container runs, top shows something like:

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                          
 13553 root      20   0   14536   1728   1596 R  99.7  0.2   1:04.05 agetty 

The process also stays there once Molecule has finished.
Compare to moby/moby#4040

@retr0h
Contributor

retr0h commented Jan 25, 2018

Since Molecule ships a Dockerfile that the user can control for their own purposes, I suggest modifying the Dockerfile and adding the workaround from the issue you referenced.

@retr0h
Contributor

retr0h commented Jan 29, 2018

Affected users can implement the workaround in the Dockerfile template provided by molecule init.

@percygrunwald

FWIW, I resolved this by applying the suggestion from this comment. The problem affects @geerlingguy's docker-...-ansible images; I fixed it like this:

# molecule/default/Dockerfile.j2

FROM {{ item.image }}

RUN rm -f /lib/systemd/system/systemd*udev* \
  && rm -f /lib/systemd/system/getty.target
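The effect of those two `rm -f` globs can be sanity-checked outside Docker. A small sketch against a scratch directory, where the file names are illustrative stand-ins for the real units under /lib/systemd/system:

```shell
# Simulate the Dockerfile's RUN step against a scratch directory to show
# which unit files the globs remove. The directory and file names are
# stand-ins for /lib/systemd/system inside the image.
dir=$(mktemp -d)
touch "$dir/getty.target" "$dir/systemd-udevd.service" "$dir/sshd.service"
rm -f "$dir"/systemd*udev* "$dir/getty.target"
ls "$dir"   # only sshd.service should remain
```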

Then just set pre_build_image: false for any images that you'd like to treat:

# molecule/default/molecule.yml
...
platforms:
  - name: debian9
    image: "geerlingguy/docker-debian9-ansible:latest"
    command: ""
    volumes:
      - /sys/fs/cgroup:/sys/fs/cgroup:ro
    privileged: true
    pre_build_image: false
  - name: debian8
    image: "geerlingguy/docker-debian8-ansible:latest"
    command: ""
    volumes:
      - /sys/fs/cgroup:/sys/fs/cgroup:ro
    privileged: true
    pre_build_image: false
  - name: ubuntu1804
    image: "geerlingguy/docker-ubuntu1804-ansible:latest"
    command: ""
    volumes:
      - /sys/fs/cgroup:/sys/fs/cgroup:ro
    privileged: true
    pre_build_image: false
  - name: ubuntu1604
    image: "geerlingguy/docker-ubuntu1604-ansible:latest"
    command: ""
    volumes:
      - /sys/fs/cgroup:/sys/fs/cgroup:ro
    privileged: true
    pre_build_image: false
  - name: centos7
    image: "geerlingguy/docker-centos7-ansible:latest"
    command: ""
    volumes:
      - /sys/fs/cgroup:/sys/fs/cgroup:ro
    privileged: true
    pre_build_image: false
  - name: centos6
    image: "geerlingguy/docker-centos6-ansible:latest"
    command: ""
    volumes:
      - /sys/fs/cgroup:/sys/fs/cgroup:ro
    privileged: true
    pre_build_image: false
...

@t2d
Contributor Author

t2d commented Feb 6, 2019

@percygrunwald Thanks for investigating this further. Have you considered opening a pull request against the @geerlingguy repos? (e.g. https://github.com/geerlingguy/docker-debian9-ansible)

@geerlingguy
Contributor

(Just an aside, I haven't had this issue in any of my role or playbook testing using molecule...?)

@t2d
Contributor Author

t2d commented Feb 6, 2019

I still do, and have switched to testing locally with Vagrant.

@percygrunwald

percygrunwald commented Feb 7, 2019

@t2d, regarding the PR in Jeff's repos, I have created an issue there to discuss it before making a PR. I'm not sure Jeff would consider a PR to fix this issue if he can't replicate it himself. The issue is here: geerlingguy/docker-ubuntu1804-ansible#9. I'm able to consistently replicate/resolve the issue with the steps I've outlined there. @t2d, maybe you can check those steps yourself to see if you're able to replicate it in the same way.

@geerlingguy, not sure if I'm being overly presumptuous, but it seems that in your public role repos you only test against one platform at a time. I only hit this issue when launching more than two platforms at once, including Debian-based instances, so if your role development workflow never launches three or more instances with Molecule, the issue may simply never have presented itself on your system. Curious whether you can replicate it with the steps I gave in geerlingguy/docker-ubuntu1804-ansible#9 (I believe you're on Mac OS X), but I can totally understand that this might not be a good use of your time.

I'm happy to maintain some Docker images based on Jeff's with the getty services removed and see how things go.

@geerlingguy
Contributor

geerlingguy commented Feb 7, 2019

@percygrunwald - I control which OSes I run tests with using an environment variable, and test almost all my roles on at least Ubuntu 18 and 16, Debian 9, and CentOS 7, but test some on Debian 8, CentOS 6, and Fedora 29 as well (see the .travis.yml files in those repos).

When testing locally I just run one test like MOLECULE_DISTRO=debian9 molecule test — it’s just a style thing.
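The env-matrix style Jeff describes might look roughly like this in .travis.yml — a sketch only, assuming MOLECULE_DISTRO is consumed by the scenario's molecule.yml; the real files in his repos may differ:

```yaml
# .travis.yml (sketch; see the actual .travis.yml in geerlingguy's role repos)
language: python
services:
  - docker
env:
  # One Travis job per distro, each launching a single Molecule instance
  - MOLECULE_DISTRO=ubuntu1804
  - MOLECULE_DISTRO=ubuntu1604
  - MOLECULE_DISTRO=debian9
  - MOLECULE_DISTRO=centos7
install:
  - pip install molecule docker
script:
  - molecule test
```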

@percygrunwald

Yeah, I took an extensive look through your repos and assumed that was your workflow. It makes total sense that if you're testing one platform per run, you would never encounter this issue.

I was trying to create a workflow where I can develop roles against all 6 OSs at the same time, since with Docker there's not much overhead to running them. I'm looking at ways to combine the "6 at once" style for local development with the "one platform per Travis runner" model under the same Molecule config. Initially --base-config seemed perfect, but it doesn't actually work with platforms (see #1423 (comment)).

I'm able to achieve the desired outcome using Molecule scenarios. I created a second scenario called travis that just references everything in the default scenario, but only runs against a single platform:

# molecule/travis/molecule.yml
---
dependency:
  name: galaxy
driver:
  name: docker
lint:
  name: yamllint
  options:
    config-file: molecule/default/yaml-lint.yml
platforms:
  - name: instance
    image: "geerlingguy/docker-${PLATFORM_DISTRO:-centos7}-ansible:latest"
    command: ${PLATFORM_COMMAND:-""}
    volumes:
      - /sys/fs/cgroup:/sys/fs/cgroup:ro
    privileged: true
    pre_build_image: true
provisioner:
  name: ansible
  lint:
    name: ansible-lint
  playbooks:
    converge: ../default/playbook.yml
scenario:
  name: travis
verifier:
  name: testinfra
  directory: ../default/tests/
  lint:
    name: flake8

Then for local development against all 6 OSs I run molecule test, and for Travis, or when I want to isolate a single platform, I run PLATFORM_DISTRO=ubuntu1804 molecule test -s travis.

percygrunwald added a commit to percygrunwald/docker-ubuntu1804-ansible that referenced this issue Feb 7, 2019
percygrunwald added a commit to percygrunwald/docker-ubuntu1604-ansible that referenced this issue Feb 7, 2019
percygrunwald added a commit to percygrunwald/docker-debian8-ansible that referenced this issue Feb 7, 2019
percygrunwald added a commit to percygrunwald/docker-debian9-ansible that referenced this issue Feb 7, 2019
percygrunwald added a commit to percygrunwald/docker-centos6-ansible that referenced this issue Feb 7, 2019
percygrunwald added a commit to percygrunwald/docker-centos7-ansible that referenced this issue Feb 7, 2019
percygrunwald added a commit to percygrunwald/docker-fedora27-ansible that referenced this issue Feb 7, 2019
percygrunwald added a commit to percygrunwald/docker-fedora29-ansible that referenced this issue Feb 7, 2019
@percygrunwald

@t2d, I have created forks of Jeff's docker-...-ansible repos and have builds of the images with the fix applied. Please try them and see if they resolve your issue.

coglinev3 added a commit to coglinev3/ansible-role-ansible_container that referenced this issue Jun 7, 2020
Remove unnecessary getty and udev services that can result in high CPU
usage when using multiple containers with Molecule
(ansible/molecule#1104)