Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

podman exposes /dev/tty* to privileged containers when run as root #15878

Closed
dcermak opened this issue Sep 21, 2022 · 6 comments · Fixed by #15895
Closed

podman exposes /dev/tty* to privileged containers when run as root #15878

dcermak opened this issue Sep 21, 2022 · 6 comments · Fixed by #15895
Labels
kind/bug Categorizes issue or PR as related to a bug. locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments.

Comments

@dcermak
Copy link
Contributor

dcermak commented Sep 21, 2022

/kind bug

Description

Podman will expose the host's /dev/tty* to containers that were launched with --privileged and as the root user, which can lead to very nasty issues if the container is running systemd and happens to have getty enabled. Then the container will "take over" the host's /dev/tty1 and break it.

Steps to reproduce the issue:

Don't try this on your machine, it might die.

  1. sudo podman run -d --privileged --name bci-init registry.suse.com/bci/bci-init:15.4

  2. sudo podman exec bci-init systemctl start getty@tty1

  3. Depending on your exact config, getty@tty1 might fail, or it might actually start and kill your host machine.

Describe the results you received:

My Fedora 36 box crashed. My Fedora 36 VM did not crash, instead I got an error launching getty@tty1.

Describe the results you expected:

/dev/tty* must not be exposed to the container according to https://systemd.io/CONTAINER_INTERFACE/

Additional information you deem important (e.g. issue happens only occasionally):

Output of podman version:

Client:       Podman Engine
Version:      4.2.0
API Version:  4.2.0
Go Version:   go1.18.4
Built:        Thu Aug 11 16:42:17 2022
OS/Arch:      linux/amd64

Output of podman info:

host:
  arch: amd64
  buildahVersion: 1.27.0
  cgroupControllers:
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.4-2.fc36.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.4, commit: '
  cpuUtilization:
    idlePercent: 40.79
    systemPercent: 14.02
    userPercent: 45.18
  cpus: 12
  distribution:
    distribution: fedora
    version: "36"
  eventLogger: journald
  hostname: Boreas
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 10000
      size: 65536
    - container_id: 65537
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 10000
      size: 65536
    - container_id: 65537
      host_id: 100000
      size: 65536
  kernel: 5.19.8-200.fc36.x86_64
  linkmode: dynamic
  logDriver: journald
  memFree: 818167808
  memTotal: 33325137920
  networkBackend: netavark
  ociRuntime:
    name: crun
    package: crun-1.6-2.fc36.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.6
      commit: 18cf2efbb8feb2b2f20e316520e0fd0b6c41ef4d
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
  os: linux
  remoteSocket:
    path: /run/user/1000/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.2.0-0.2.beta.0.fc36.x86_64
    version: |-
      slirp4netns version 1.2.0-beta.0
      commit: 477db14a24ff1a3de3a705e51ca2c4c1fe3dda64
      libslirp: 4.6.1
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.5.3
  swapFree: 4413575168
  swapTotal: 8589930496
  uptime: 193h 16m 38.00s (Approximately 8.04 days)
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  volume:
  - local
registries:
  search:
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - docker.io
  - quay.io
  - registry.suse.com
store:
  configFile: /home/dan/.config/containers/storage.conf
  containerStore:
    number: 7
    paused: 0
    running: 2
    stopped: 5
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /home/dan/.local/share/containers/storage
  graphRootAllocated: 1022488477696
  graphRootUsed: 761035309056
  graphStatus:
    Backing Filesystem: btrfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 109
  runRoot: /run/user/1000/containers
  volumePath: /home/dan/.local/share/containers/storage/volumes
version:
  APIVersion: 4.2.0
  Built: 1660228937
  BuiltTime: Thu Aug 11 16:42:17 2022
  GitCommit: ""
  GoVersion: go1.18.4
  Os: linux
  OsArch: linux/amd64
  Version: 4.2.0

Package info (e.g. output of rpm -q podman or apt list podman):

podman-4.2.0-2.fc36.x86_64

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide? (https://github.com/containers/podman/blob/main/troubleshooting.md)

No

Additional environment details (AWS, VirtualBox, physical, etc.):

I have also tried this in the Fedora 36 cloud-base vagrant box using libvirt. The VM did not die, but /dev/tty* was exposed to the container as well.

@openshift-ci openshift-ci bot added the kind/bug Categorizes issue or PR as related to a bug. label Sep 21, 2022
@mheon
Copy link
Member

mheon commented Sep 21, 2022

I don't know if we can change this by default - --privileged must, for Docker compat, mount every device from the host into the container.

I suppose that a container in systemd mode that is also privileged can possibly mask the TTY devices (or decide not to mount them?) if it's in privileged mode, though.

@dcermak
Copy link
Contributor Author

dcermak commented Sep 21, 2022

I don't know if we can change this by default - --privileged must, for Docker compat, mount every device from the host into the container.

That's a bad idea according to Leonard Poettering: systemd/systemd#24743 (comment)

I suppose that a container in systemd mode that is also privileged can possibly mask the TTY devices (or decide not to mount them?) if it's in privileged mode, though.

Most init/systemd containers mask the getty service, but every container image has to do that. It would be better if podman would not pass the TTY into the container at all, as it will really just cause problems.

@mheon
Copy link
Member

mheon commented Sep 21, 2022

As bad of an idea as it may be, it's a 10 year old decision, and any change we make at this point could very well break existing applications.

Making containers that detect they are using systemd ad PID 1 alter this behavior is the most we can do without making a breaking change.

@dcermak
Copy link
Contributor Author

dcermak commented Sep 21, 2022

Making containers that detect they are using systemd ad PID 1 alter this behavior is the most we can do without making a breaking change.

That actually sounds like the most sensible decision to me. Can you point me to the code that would handle this case?

@mheon
Copy link
Member

mheon commented Sep 21, 2022

func AddPrivilegedDevices(g *generate.Generator) error {
handles the addition of devices to privileged containers. The container knows if it was started in systemd mode (https://github.com/containers/podman/blob/main/libpod/container_config.go#L392-L393), so adding a parameter to that function for systemd mode and then omitting tty devices in the case that it is true ought to fix it.

dcermak added a commit to dcermak/podman that referenced this issue Sep 22, 2022
According to https://systemd.io/CONTAINER_INTERFACE/, systemd will try take
control over /dev/ttyN if exported, which can cause conflicts with the host's tty
in privileged containers. Thus we will not expose these to privileged containers
in systemd mode, as this is a bad idea according to systemd's maintainers.

Additionally, this commit adds a bats regression test to check that no /dev/ttyN
are present in a privileged container in systemd mode

This fixes containers#15878

Signed-off-by: Dan Čermák <[email protected]>
@mupuf
Copy link
Contributor

mupuf commented Oct 25, 2022

While I agree with the general direction of this issue, this broke my use case... but it doesn't have to :)

The problem with this change is that on top of blocking access to virtual terminals (what Lennart Poettering called madness), you also prevent access to serial consoles: ttyACM/ttyUSB... And I need access to them even in systemd mode.

So, the solution would be to only block /dev/tty\d+.

@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 12, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 12, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
kind/bug Categorizes issue or PR as related to a bug. locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants