
Cannot use Kubernetes healthcheck probes without certain executables (tcpSocket/nc, httpGet/curl) inside the container #18318

Open
joelpurra opened this issue Apr 24, 2023 · 9 comments
Labels
kind/bug, kube, stale-issue

Comments

@joelpurra
Contributor

joelpurra commented Apr 24, 2023

Issue Description

I'm moving some servers/services to podman kube play and ran into a problem. Several (not all) servers died after a few minutes, seemingly consistent with the configured probe limits, even though the services were clearly reachable and usable from clients. Disabling the health checks also meant that the service would stay up. After some digging I found the issue.

Container health check probes (startupProbe, readinessProbe, livenessProbe) with checks of kind tcpSocket or httpGet are effectively equivalent to exec checks. This is because they get converted to exec commands by podman in kube.go.

The exec conversion means executing nc to check for open TCP ports or curl to GET an HTTP URL, from inside the container. Containers which only have the bare minimum of software installed (as is best practice) may not have these "external dependencies", in which case the probes will always fail.

It is my understanding that both tcpSocket and httpGet should probe from within the pod, but not from within the particular container being probed. This places the nc/curl (or equivalent) dependency requirements on the pod manager.

Should these TCP/HTTP probe connection attempts be implemented in podman instead?

Idea: probe dependencies do not have to be direct dependencies of podman. Podman may use minimal "probe images", and delegate checks to ephemeral health check containers. This may increase flexibility and potentially allow for broader probe kind support.

Steps to reproduce the issue

  1. Start a well-configured server/service in a pod using podman kube play, where at least one container has well-configured health checks of kinds tcpSocket or httpGet.
  2. Monitor the pod to see if the container gets to the healthy state.
  3. If it does not reach the healthy state, check whether the container/image has nc/curl (with sufficient feature support) installed in the $PATH.

Describe the results you received

Health check results depend not only on the containerized server/service itself, but also on other software included in the container/image.

Describe the results you expected

I was under the impression that "outside" health checks, such as tcpSocket and httpGet, should not rely on health check software (which is not usually a part of the actual server/service software) within the container itself.

podman info output

host:
  arch: amd64
  buildahVersion: 1.30.0
  cgroupControllers:
  - cpu
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon_2:2.1.7-0debian12+obs15.22_amd64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.7, commit: '
  cpuUtilization:
    idlePercent: 97.66
    systemPercent: 0.92
    userPercent: 1.43
  cpus: 1
  databaseBackend: boltdb
  distribution:
    codename: bookworm
    distribution: debian
    version: "12"
  eventLogger: journald
  hostname: server
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 6.1.0-7-amd64
  linkmode: dynamic
  logDriver: journald
  memFree: 97869824
  memTotal: 1004994560
  networkBackend: netavark
  ociRuntime:
    name: crun
    package: crun_101:1.8.4-0debian12+obs55.7_amd64
    path: /usr/bin/crun
    version: |-
      crun version 1.8.4
      commit: 5a8fa99a5e41facba2eda4af12fa26313918805b
      rundir: /run/user/1000/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +YAJL
  os: linux
  remoteSocket:
    path: /run/user/1000/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: false
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns_1.2.0-1_amd64
    version: |-
      slirp4netns version 1.2.0
      commit: 656041d45cfca7a4176f6b7eed9e4fe6c11e8383
      libslirp: 4.7.0
      SLIRP_CONFIG_VERSION_MAX: 4
      libseccomp: 2.5.4
  swapFree: 3581431808
  swapTotal: 3779063808
  uptime: 116h 22m 6.00s (Approximately 4.83 days)
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  search:
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - docker.io
  - quay.io
store:
  configFile: /home/username/.config/containers/storage.conf
  containerStore:
    number: 0
    paused: 0
    running: 0
    stopped: 0
  graphDriverName: btrfs
  graphOptions: {}
  graphRoot: /home/username/.local/share/containers/storage
  graphRootAllocated: 31138512896
  graphRootUsed: 6853885952
  graphStatus:
    Build Version: Btrfs v6.2
    Library Version: "102"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 0
  runRoot: /run/user/1000/containers
  transientStore: false
  volumePath: /home/username/.local/share/containers/storage/volumes
version:
  APIVersion: 4.5.0
  Built: 0
  BuiltTime: Thu Jan  1 00:00:00 1970
  GitCommit: ""
  GoVersion: go1.19.8
  Os: linux
  OsArch: linux/amd64
  Version: 4.5.0

Podman in a container

No

Privileged Or Rootless

Rootless

Upstream Latest Release

Yes

Additional environment details

Tested on:

  • Debian Testing (bookworm), running in a Proxmox VPS.
  • Ubuntu 22.10, on bare metal.

Additional information

Test cases

The Kubernetes documentation provides probe examples (referenced in kube.go) which can be executed directly with podman kube play. While monitoring the podman container statuses, kube play each YAML file for at least a minute before taking it --down.

exec-liveness.yaml

The exec probe works as expected, entering the healthy state immediately and later restarting when the health check deliberately fails. The failure command output is cat: can't open '/tmp/healthy': No such file or directory.

podman kube play 'https://k8s.io/examples/pods/probe/exec-liveness.yaml'
podman kube play --down 'https://k8s.io/examples/pods/probe/exec-liveness.yaml'

tcp-liveness-readiness.yaml

The tcpSocket probe never leaves the starting state, and gets restarted after several failures. There is no command output.

podman kube play 'https://k8s.io/examples/pods/probe/tcp-liveness-readiness.yaml'
podman kube play --down 'https://k8s.io/examples/pods/probe/tcp-liveness-readiness.yaml'

http-liveness.yaml

The httpGet probe never leaves the starting state, and gets restarted after several failures. There is no command output.

podman kube play 'https://k8s.io/examples/pods/probe/http-liveness.yaml'
podman kube play --down 'https://k8s.io/examples/pods/probe/http-liveness.yaml'

grpc-liveness.yaml

The grpc probe is not supported by podman, but is included here for completeness with the health check examples from Kubernetes.io.


It seems the grpc probe is ignored, and the container keeps running without a health state (starting, healthy, unhealthy, ...) in the podman ps output.

This may be used as an example of additional "outside" health check kinds, which may be separately containerized without imposing these dependencies on the podman binary itself. See gRPC health checks.

podman kube play 'https://k8s.io/examples/pods/probe/grpc-liveness.yaml'
podman kube play --down 'https://k8s.io/examples/pods/probe/grpc-liveness.yaml'
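
As an illustration of the "outside" health check kinds mentioned above, which could live in a separate probe image (or in podman itself) rather than in the probed container, a gRPC health check only needs a few lines of Go using the standard grpc.health.v1 protocol. This is a sketch under the assumption that the target service implements that protocol; it is not existing podman code, and the target address is hypothetical.

// grpc-probe: minimal standalone gRPC health check against the standard
// grpc.health.v1 service (illustrative sketch only, not podman code).
package main

import (
	"context"
	"fmt"
	"os"
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
	healthpb "google.golang.org/grpc/health/grpc_health_v1"
)

func main() {
	addr := os.Args[1] // e.g. "localhost:2379" (hypothetical target)

	ctx, cancel := context.WithTimeout(context.Background(), 3*time.Second)
	defer cancel()

	conn, err := grpc.DialContext(ctx, addr,
		grpc.WithTransportCredentials(insecure.NewCredentials()), grpc.WithBlock())
	if err != nil {
		fmt.Fprintln(os.Stderr, "connect failed:", err)
		os.Exit(1)
	}
	defer conn.Close()

	resp, err := healthpb.NewHealthClient(conn).Check(ctx, &healthpb.HealthCheckRequest{})
	if err != nil || resp.GetStatus() != healthpb.HealthCheckResponse_SERVING {
		fmt.Fprintln(os.Stderr, "not serving:", err)
		os.Exit(1)
	}
}

Built into a minimal image, a binary like this could act as the kind of ephemeral health check container described earlier, without adding gRPC dependencies to the podman binary itself.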

Test monitoring

Monitor the container states separately, for example either by watching podman ps "live" or by logging the podman inspect output.

# NOTE: watch status live.
watch --differences --interval 1 podman ps

# NOTE: keep a status log.
( while true; do date ; podman inspect --latest | jq '.[] | { Name, Health: .State.Health }' ; sleep 5 ; done ; )

Workarounds

  1. Install nc/curl directly in the container in an extra build step.
  2. Find a different (but equivalent) health check method which may exec directly in the container. One example may be to test for sockets created when the container/server has initialized fully: test -S /path/to/server/socket.
  3. Utilize existing container software for workarounds, perhaps script interpreters such as perl or python.

Here's an example of using bash redirections to simulate an nc -z check on localhost:8080 (TCP). Note that this workaround will send (empty) data to the server port, which may cause side effects if the server acts on the incoming connection.

On failure, the output is bash: connect: Connection refused\nbash: line 1: /dev/tcp/localhost/8080: Connection refused and the exit code is non-zero.

livenessProbe:
  # TODO: replace with tcpSocket healthcheck.
  exec:
    command:
      - bash
      - "-c"
      - ": > /dev/tcp/localhost/8080"
  failureThreshold: 3
  initialDelaySeconds: 1
  periodSeconds: 5

Executing nc in common base images

The same issue arises for "simplified" command versions, such as nc in busybox, which doesn't always support the -z or -v options (depending on compile flags and busybox version).

podman run --rm busybox nc
BusyBox v1.22.1 (2014-05-22 23:22:11 UTC) multi-call binary.

Usage: nc [-iN] [-wN] [-l] [-p PORT] [-f FILE|IPADDR PORT] [-e PROG]

Open a pipe to IP:PORT or FILE

	-l	Listen mode, for inbound connects
		(use -ll with -e for persistent server)
	-p PORT	Local port
	-w SEC	Connect timeout
	-i SEC	Delay interval for lines sent
	-f FILE	Use file (ala /dev/ttyS0) instead of network
	-e PROG	Run PROG after connect
podman run --rm alpine nc
BusyBox v1.35.0 (2022-11-19 10:13:10 UTC) multi-call binary.

Usage: nc [OPTIONS] HOST PORT  - connect
nc [OPTIONS] -l -p PORT [HOST] [PORT]  - listen

	-e PROG	Run PROG after connect (must be last)
	-l	Listen mode, for inbound connects
	-lk	With -e, provides persistent server
	-p PORT	Local port
	-s ADDR	Local address
	-w SEC	Timeout for connects and final net reads
	-i SEC	Delay interval for lines sent
	-n	Don't do DNS resolution
	-u	UDP mode
	-b	Allow broadcasts
	-v	Verbose
	-o FILE	Hex dump traffic
	-z	Zero-I/O mode (scanning)
podman run --rm centos nc

Could not find nc in $PATH.

podman run --rm fedora nc

Could not find nc in $PATH.

podman run --rm debian nc

Could not find nc in $PATH.

podman run --rm ubuntu nc

Could not find nc in $PATH.

Executing curl in common base images

It's less common to find curl installed.

podman run --rm busybox curl

Could not find curl in $PATH.

podman run --rm alpine curl

Could not find curl in $PATH.

podman run --rm centos curl
curl: try 'curl --help' or 'curl --manual' for more information
podman run --rm fedora curl
curl: try 'curl --help' or 'curl --manual' for more information
podman run --rm debian curl

Could not find curl in $PATH.

podman run --rm ubuntu curl

Could not find curl in $PATH.


I'm running a personal Open Build Service (OBS) branch of podman v4.5.0 (as suggested in another issue), with a build dependency fix and added BTRFS support. I'm just starting out with OBS, but it should not affect this issue.

apt show podman
Package: podman
Version: 4:4.5.0-debian12joelpurra1+obs82.1
Priority: optional
Maintainer: Podman Debbuild Maintainers <https://github.com/orgs/containers/teams/podman-debbuild-maintainers>
Installed-Size: 73.2 MB
Provides: podman-manpages (= 4:4.5.0-debian12joelpurra1+obs82.1)
Depends: catatonit,iptables,nftables,conmon (>= 2:2.0.30),containers-common (>= 4:1),uidmap,netavark (>= 1.0.3-1),libc6,libgpg-error0
Recommends: podman-gvproxy (= 4:4.5.0-debian12joelpurra1+obs82.1)
Suggests: qemu-user-static
Homepage: https://podman.io/
Download-Size: 29.3 MB
APT-Manual-Installed: yes
APT-Sources: https://download.opensuse.org/repositories/home:/joelpurra:/branches:/devel:/kubic:/libcontainers:/unstable/Debian_Testing  Packages
Description: Manage Pods, Containers and Container Images
 podman (Pod Manager) is a fully featured container engine that is a simple
 daemonless tool.  podman provides a Docker-CLI comparable command line that
 eases the transition from other container engines and allows the management of
 pods, containers and images.  Simply put: alias docker=podman.
 Most podman commands can be run as a regular user, without requiring
 additional privileges.
 .
 podman uses Buildah(1) internally to create container images.
 Both tools share image (not container) storage, hence each can use or
 manipulate images (but not containers) created by the other.
 .
 Manage Pods, Containers and Container Images
 podman Simple management tool for pods, containers and images

N: There are 2 additional records. Please use the '-a' switch to see them.
joelpurra added the kind/bug label Apr 24, 2023
Luap99 added the kube label Apr 24, 2023
@github-actions

A friendly reminder that this issue had no activity for 30 days.

@rhatdan
Member

rhatdan commented May 26, 2023

@ygalblum
Contributor

Not sure how good my idea is, but what about using nsenter -t <ContainerPID> -n -U curl/nc instead of execing into the pod? This way we only enter the network and user namespaces of the pod while keeping access to the executables on the host.

@github-actions

A friendly reminder that this issue had no activity for 30 days.

@Luap99
Member

Luap99 commented Jun 28, 2023

Not sure how good my idea is, but what about using nsenter -t <ContainerPID> -n -U curl/nc instead of execing into the pod? This way we only enter the network and user namespaces of the pod while keeping access to the executables on the host.

In general that should be better, but it still requires these dependencies to be installed on the host; curl and nc are definitely not installed by default on all distros, so that might even cause regressions.

I think if we fix this we might as well do it properly and not depend on external commands. Checking for a TCP port and doing an HTTP GET request can be done trivially in Go. The only question is how we expose this in our internal healthcheck logic.
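
For illustration, a minimal sketch of what such native checks could look like in Go (an assumption about one possible approach, not existing podman code):

// Native probe sketch: no nc or curl needed inside the container
// (illustrative only, not existing podman code).
package probes

import (
	"fmt"
	"net"
	"net/http"
	"strconv"
	"time"
)

// TCPProbe succeeds if a TCP connection to host:port can be established,
// roughly matching tcpSocket probe semantics.
func TCPProbe(host string, port int, timeout time.Duration) error {
	conn, err := net.DialTimeout("tcp", net.JoinHostPort(host, strconv.Itoa(port)), timeout)
	if err != nil {
		return err
	}
	return conn.Close()
}

// HTTPProbe succeeds if a GET on url returns a status code below 400,
// roughly matching httpGet probe semantics.
func HTTPProbe(url string, timeout time.Duration) error {
	client := &http.Client{Timeout: timeout}
	resp, err := client.Get(url)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode >= 400 {
		return fmt.Errorf("unhealthy HTTP status %d", resp.StatusCode)
	}
	return nil
}

The open question remains where to hook such checks in, since the existing healthcheck machinery is built around running an exec command inside the container.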

@rhatdan
Member

rhatdan commented Jun 28, 2023

I like the idea of building some of these into podman and not relying on external tools. Exposing them is the issue.

@github-actions

A friendly reminder that this issue had no activity for 30 days.

@nogweii

nogweii commented Nov 3, 2023

Has there been any further thought about this? I'd also like to see this functionality brought to podman run & Quadlet, so that I could easily define an HTTP health check for a container running as a systemd service. With this and #18189 together, that would be a powerful combination.

Something like podman run --health-http-probe and --health-tcp-probe? Though, that raises a new question: what should podman do when multiple probes are configured?

@viplifes

viplifes commented Apr 4, 2024

In Alpine Linux, curl is not installed by default. As an alternative to curl, you can use wget:
https://github.com/containers/podman/blob/v4.9.3/pkg/specgen/generate/kube/kube.go#L679

curl:

commandString = fmt.Sprintf("curl -f %s://%s:%d%s || %s", uriScheme, host, portNum, path, failureCmd)

wget:

commandString = fmt.Sprintf("wget -q -O /dev/null %s://%s:%d%s || %s", uriScheme, host, portNum, path, failureCmd)
