[rootless] 'podman pod' port publish intermittently fails on Ubuntu 18.04, works on Fedora 31 (with test case) #4559

Closed
digitalcircuit opened this issue Nov 23, 2019 · 9 comments · Fixed by #4592
Labels: do-not-close, kind/bug, locked - please file new issue/PR

Comments

digitalcircuit commented Nov 23, 2019

In short

  • Publishing a port via a rootless Podman pod intermittently fails
    • Fedora 31, VirtualBox: 1.6.2 works reliably, no failure in 10 consecutive tries
    • Ubuntu 18.04, VirtualBox: 1.6.2 works intermittently (usually)
    • Ubuntu 18.04, Hetzner VPS: 1.6.2 works intermittently (rarely)
  • Seemingly no error messages on failure
  • Semi-automatic test case included at the end

Details

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description

When creating a pod with a published port (e.g. --publish="8066:80") and adding several containers, Podman does not always start listening on the specified port despite allowing the containers to run with no errors from slirp4netns.

To add to the complexity, Fedora 31 x64 server seems to work reliably, while Ubuntu 18.04 LTS x64 fails intermittently (about 70% chance of success in a VirtualBox VM, under 10% chance on a Hetzner Cloud VPS). There may be some difference in system configuration, but as far as I can tell, the test environments have only minimal changes from a fresh, default install.

Steps to reproduce the issue:

For simpler testing, please see the semi-automatic test script provided near the end of this issue, podman-rootless-port-test-seafilebox.sh. Install updates, then run the script.

  1. Install updates
  2. Install podman, slirp4netns, and containernetworking-plugins
  3. Ensure the current user has adequate subuids/subgids
grep "$(whoami):" /etc/subuid /etc/subgid

If the above fails (should work with defaults), fix with something like

sudo usermod --add-subuids 296608-362143 --add-subgids 296608-362143 "$(whoami)"
  4. Create volume mount directories
mkdir --parents "$HOME/podman_test_mnt/seafilebox-data" "$HOME/podman_test_mnt/seafilebox-mysql/db"
  5. On Fedora 31, set the SELinux context
chcon -Rt svirt_sandbox_file_t "$HOME/podman_test_mnt/seafilebox-data"
chcon -Rt svirt_sandbox_file_t "$HOME/podman_test_mnt/seafilebox-mysql/db"
  6. Create pod and containers for Seafile
podman --log-level "debug" pod create --name="seafilebox" --publish="8066:80"
podman --log-level "debug" create --name="seafilebox-mysql" --pod="seafilebox" --env="MYSQL_ROOT_PASSWORD=mysql-root-test-pass" --env="MYSQL_LOG_CONSOLE=true" --mount type=bind,source=$HOME/podman_test_mnt/seafilebox-mysql/db,destination=/var/lib/mysql docker.io/mariadb:10.1
podman --log-level "debug" start "seafilebox-mysql"
podman --log-level "debug" create --name="seafilebox-memcached" --pod="seafilebox" docker.io/memcached:alpine memcached -m 256
podman --log-level "debug" start "seafilebox-memcached"
podman --log-level "debug" create --name="seafilebox-seafile" --pod="seafilebox" --env="DB_HOST=seafilebox" --env="DB_ROOT_PASSWD=mysql-root-test-pass" --env="TIME_ZONE=Etc/UTC" --env="[email protected]" --env="SEAFILE_ADMIN_PASSWORD=seafile-test-pass" --env="SEAFILE_SERVER_LETSENCRYPT=false" --env="SEAFILE_SERVER_HOSTNAME=example.invalid" --mount type=bind,source=$HOME/podman_test_mnt/seafilebox-data,destination=/shared docker.io/seafileltd/seafile-mc:latest
podman --log-level "debug" start "seafilebox-seafile"

First time setup: Wait for the file $HOME/podman_test_mnt/seafilebox-data/seafile/conf/seahub_settings.py to be created (signaling initial Seafile setup has completed)

  7. Check for a listening server on :8066
netstat --listen --numeric | grep ":8066"
curl localhost:8066 >/dev/null
  8. Stop and remove the pod/containers
podman --log-level "debug" pod stop "seafilebox"
podman --log-level "debug" rm "seafilebox-seafile"
podman --log-level "debug" rm "seafilebox-memcached"
podman --log-level "debug" rm "seafilebox-mysql"
podman --log-level "debug" pod rm "seafilebox"
  9. Repeat steps 6-8 ten times, tracking the results of step 7
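
The grep in the subuid/subgid step above only confirms that a line containing the user name exists in at least one of the two files. A stricter check, sketched below, requires an exact user-name match and a minimum range size in a given file. The function name `has_subid_range` and the 65536 default are illustrative assumptions for this sketch; the `user:start:count` layout is the usual /etc/subuid and /etc/subgid format.

```shell
#!/bin/bash
# has_subid_range USER FILE [MIN_COUNT]
# Succeeds if FILE contains a "USER:START:COUNT" line with COUNT >= MIN_COUNT.
# MIN_COUNT defaults to 65536 (an assumption chosen for this sketch).
has_subid_range ()
{
	local user="$1" file="$2" min_count="${3:-65536}"
	awk -F: -v u="$user" -v min="$min_count" \
		'$1 == u && ($3 + 0) >= (min + 0) { found = 1 } END { exit !found }' "$file"
}
```

It could be called once per file, for example `has_subid_range "$(whoami)" /etc/subuid && has_subid_range "$(whoami)" /etc/subgid`.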

Describe the results you received:

Using the following two commands to check:

netstat --listen --numeric | grep ":8066"
curl localhost:8066 >/dev/null
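
One caveat with the netstat check: `grep ":8066"` matches any line containing that text, including a longer port such as :80660. A stricter matcher could compare the port at the end of the local-address column instead. The sketch below reads `netstat --listen --numeric` style lines on standard input; the function name `listening_on_port` is hypothetical, and it assumes the local address is the fourth whitespace-separated column of TCP/UDP rows.

```shell
#!/bin/bash
# listening_on_port PORT
# Reads netstat-style lines on stdin; succeeds if a tcp/udp row has a local
# address (column 4, an assumption about the output layout) ending in :PORT.
listening_on_port ()
{
	local port="$1"
	awk -v p="$port" '
		$1 ~ /^(tcp|udp)/ {
			# The port is whatever follows the last colon, which also
			# handles IPv6 forms such as ":::8066"
			n = split($4, parts, ":")
			if (parts[n] == p) { found = 1 }
		}
		END { exit !found }'
}
```

Usage would be `netstat --listen --numeric | listening_on_port 8066`, with the exit status indicating whether anything is listening.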

After repeating the test ten times as shown above, this results in the following success rates:

  • Fedora 31 server, VirtualBox
    • 100% success, 10/10
  • Ubuntu 18.04 LTS, VirtualBox
    • 70% success, 7/10 (varies with each test run)
  • Ubuntu 18.04 LTS, Hetzner Cloud VPS
    • 0% success, 0/10 (also varies with each test run)
    • Note: this had worked in the past, just at about 10-20% success rate.
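
These percentages are plain counts over a results log. As an illustration, the sketch below tallies one, assuming the line format written by the test script later in this issue ("Local server listening" on success, "No listening local server" on failure). The function name `summarize_results` is hypothetical.

```shell
#!/bin/bash
# summarize_results LOG_FILE
# Counts success and failure lines and prints "passed/total succeeded (N%)".
summarize_results ()
{
	local log_file="$1" passed failed total
	# grep --count exits non-zero when the count is 0, hence "|| true"
	passed="$(grep --count 'Local server listening' "$log_file" || true)"
	failed="$(grep --count 'No listening local server' "$log_file" || true)"
	total=$((passed + failed))
	if [ "$total" -eq 0 ]; then
		echo "no results found"
		return 1
	fi
	# Integer division, so the percentage is rounded down
	echo "$passed/$total succeeded ($((100 * passed / total))%)"
}
```

Run against a log with seven successes and three failures, such as the Ubuntu 18.04 VirtualBox run in the Results section, it would report 7/10 succeeded (70%).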

Describe the results you expected:

100% success at binding to port 8066 and responding to requests for all systems.

Additional information you deem important:

As noted above, this happens intermittently, and the failure rate varies between the VirtualBox VM and the Hetzner Cloud VPS. Regardless of the rate, it eventually fails on Ubuntu 18.04.

Changes made:

  • Ubuntu 18.04
    • Updates installed
    • OpenSSH installed with authorized_keys set
    • Podman installed
sudo add-apt-repository --yes "ppa:projectatomic/ppa"
sudo apt update
sudo apt install --yes podman slirp4netns uidmap containernetworking-plugins
  • Fedora 31
    • Updates installed
    • OpenSSH enabled with authorized_keys set
    • Podman installed
sudo dnf install --assumeyes podman slirp4netns containernetworking-plugins

Output of podman version:

Fedora 31 podman version

Version:            1.6.2
RemoteAPI Version:  1
Go Version:         go1.13.1
OS/Arch:            linux/amd64

Ubuntu 18.04 VirtualBox & Hetzner VPS podman version

Version:            1.6.2
RemoteAPI Version:  1
Go Version:         go1.10.4
OS/Arch:            linux/amd64

Output of podman info --debug:

Fedora 31 VirtualBox podman info --debug
debug:
  compiler: gc
  git commit: ""
  go version: go1.13.1
  podman version: 1.6.2
host:
  BuildahVersion: 1.11.3
  CgroupVersion: v2
  Conmon:
    package: conmon-2.0.2-1.fc31.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.0.2, commit: 186a550ba0866ce799d74006dab97969a2107979'
  Distribution:
    distribution: fedora
    version: "31"
  IDMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  MemFree: 556642304
  MemTotal: 2083835904
  OCIRuntime:
    name: crun
    package: crun-0.10.6-1.fc31.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 0.10.6
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +YAJL
  SwapFree: 1068744704
  SwapTotal: 1073737728
  arch: amd64
  cpus: 2
  eventlogger: journald
  hostname: localhost.localdomain
  kernel: 5.3.11-300.fc31.x86_64
  os: linux
  rootless: true
  slirp4netns:
    Executable: /usr/bin/slirp4netns
    Package: slirp4netns-0.4.0-20.1.dev.gitbbd6f25.fc31.x86_64
    Version: |-
      slirp4netns version 0.4.0-beta.3+dev
      commit: bbd6f25c70d5db2a1cd3bfb0416a8db99a75ed7e
  uptime: 19m 31.11s
registries:
  blocked: null
  insecure: null
  search:
  - docker.io
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - registry.centos.org
  - quay.io
store:
  ConfigFile: /home/podman/.config/containers/storage.conf
  ContainerStore:
    number: 0
  GraphDriverName: overlay
  GraphOptions:
    overlay.mount_program:
      Executable: /usr/bin/fuse-overlayfs
      Package: fuse-overlayfs-0.7-1.fc31.x86_64
      Version: |-
        fusermount3 version: 3.6.2
        fuse-overlayfs: version 0.7
        FUSE library version 3.6.2
        using FUSE kernel interface version 7.29
  GraphRoot: /home/podman/.local/share/containers/storage
  GraphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "false"
  ImageStore:
    number: 4
  RunRoot: /run/user/1000
  VolumePath: /home/podman/.local/share/containers/storage/volumes

Ubuntu 18.04 VirtualBox podman info --debug
debug:
  compiler: gc
  git commit: ""
  go version: go1.10.4
  podman version: 1.6.2
host:
  BuildahVersion: 1.11.3
  CgroupVersion: v1
  Conmon:
    package: 'conmon: /usr/bin/conmon'
    path: /usr/bin/conmon
    version: 'conmon version 2.0.3, commit: unknown'
  Distribution:
    distribution: ubuntu
    version: "18.04"
  IDMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 165536
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 165536
      size: 65536
  MemFree: 1021825024
  MemTotal: 2090123264
  OCIRuntime:
    name: runc
    package: 'cri-o-runc: /usr/lib/cri-o-runc/sbin/runc'
    path: /usr/lib/cri-o-runc/sbin/runc
    version: 'runc version spec: 1.0.1-dev'
  SwapFree: 974364672
  SwapTotal: 993239040
  arch: amd64
  cpus: 2
  eventlogger: journald
  hostname: ubuntu
  kernel: 4.15.0-70-generic
  os: linux
  rootless: true
  slirp4netns:
    Executable: /usr/bin/slirp4netns
    Package: 'slirp4netns: /usr/bin/slirp4netns'
    Version: |-
      slirp4netns version 0.4.2
      commit: unknown
  uptime: 22m 48.28s
registries:
  blocked: null
  insecure: null
  search: null
store:
  ConfigFile: /home/podman/.config/containers/storage.conf
  ContainerStore:
    number: 0
  GraphDriverName: vfs
  GraphOptions: {}
  GraphRoot: /home/podman/.local/share/containers/storage
  GraphStatus: {}
  ImageStore:
    number: 4
  RunRoot: /run/user/1000
  VolumePath: /home/podman/.local/share/containers/storage/volumes

Note: upgrading Ubuntu to the latest Hardware Enablement Stack (HWE) to get kernel 5.0.0.36.94 did not seem to fix the issue.

Ubuntu 18.04 Hetzner VPS podman info --debug
debug:
  compiler: gc
  git commit: ""
  go version: go1.10.4
  podman version: 1.6.2
host:
  BuildahVersion: 1.11.3
  CgroupVersion: v1
  Conmon:
    package: 'conmon: /usr/bin/conmon'
    path: /usr/bin/conmon
    version: 'conmon version 2.0.3, commit: unknown'
  Distribution:
    distribution: ubuntu
    version: "18.04"
  IDMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 165536
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 165536
      size: 65536
  MemFree: 887091200
  MemTotal: 2039902208
  OCIRuntime:
    name: runc
    package: 'cri-o-runc: /usr/lib/cri-o-runc/sbin/runc'
    path: /usr/lib/cri-o-runc/sbin/runc
    version: 'runc version spec: 1.0.1-dev'
  SwapFree: 0
  SwapTotal: 0
  arch: amd64
  cpus: 1
  eventlogger: journald
  hostname: ubuntu-hetzner
  kernel: 4.15.0-70-generic
  os: linux
  rootless: true
  slirp4netns:
    Executable: /usr/bin/slirp4netns
    Package: 'slirp4netns: /usr/bin/slirp4netns'
    Version: |-
      slirp4netns version 0.4.2
      commit: unknown
  uptime: 18m 33.4s
registries:
  blocked: null
  insecure: null
  search: null
store:
  ConfigFile: /home/podman/.config/containers/storage.conf
  ContainerStore:
    number: 0
  GraphDriverName: vfs
  GraphOptions: {}
  GraphRoot: /home/podman/.local/share/containers/storage
  GraphStatus: {}
  ImageStore:
    number: 4
  RunRoot: /run/user/1000
  VolumePath: /home/podman/.local/share/containers/storage/volumes
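
The reports above differ in several fields: CgroupVersion (v2 on Fedora, v1 on Ubuntu), the OCI runtime (crun versus runc), GraphDriverName (overlay versus vfs), and the kernel version. To narrow down such differences, one could save each `podman info --debug` output to a file and compare a few selected keys. The sketch below extracts them; the function name `info_fields` and the key list are illustrative assumptions, not a claim about which field causes the failure.

```shell
#!/bin/bash
# info_fields FILE
# Prints selected "key: value" lines from a saved "podman info --debug"
# output, with leading indentation removed so two files compare cleanly.
info_fields ()
{
	grep -E '^[[:space:]]*(CgroupVersion|GraphDriverName|kernel|name|go version|podman version):' "$1" \
		| sed -E 's/^[[:space:]]+//'
}
```

In bash, two saved reports could then be compared with `diff <(info_fields fedora.yml) <(info_fields ubuntu.yml)`, where both file names are placeholders.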

Package info (e.g. output of rpm -q podman or apt list podman):

Fedora 31, VirtualBox

podman-1.6.2-2.fc31.x86_64

Ubuntu 18.04, VirtualBox & Hetzner VPS

podman/bionic,now 1.6.2-1~ubuntu18.04~ppa1 amd64 [installed]

Additional environment details (AWS, VirtualBox, physical, etc.):

VirtualBox version 6.0.14 r133895 (Qt5.6.1) running on Ubuntu 16.04 LTS x64.

Additional testing done with a Hetzner Cloud VPS (where issue was first found).

Semi-automatic test suite

Setup

  1. Install all updates
  2. Save the script at the end of the steps
    podman-rootless-port-test-seafilebox.sh
  3. Run the script, following the prompts
chmod +x podman-rootless-port-test-seafilebox.sh
./podman-rootless-port-test-seafilebox.sh
  1. See the output in the latest results file
    • E.g. log-2019-11-23--01-44-results.log

Updated 2019-11-24 06:34:39+00:00 - add pod-less mode!

Run like this

chmod +x podman-rootless-port-test-seafilebox.sh
PODMAN_USE_POD_GROUP=false ./podman-rootless-port-test-seafilebox.sh

Script source: podman-rootless-port-test-seafilebox.sh
#!/bin/bash
# See http://redsymbol.net/articles/unofficial-bash-strict-mode/
set -euo pipefail

# Podman rootless pod port publishing test case
# Version history:
# 0.2:
#    * Add pod-less test mode, env "PODMAN_USE_POD_GROUP=false"
# 0.1:
#    * Initial version
# Licensed under CC-0 - https://creativecommons.org/choose/zero/
# Shane Synan, 2019

# Fetch distribution information
OS="unknown"
if [ -f /etc/os-release ]; then
	. /etc/os-release
	OS=$ID
fi

# [Settings]
# Test mode
PODMAN_USE_POD_GROUP="${PODMAN_USE_POD_GROUP:-true}"
# > true (default):  Group containers within a pod
# > false:           Stand up individual containers without pods
# Test amount
TEST_COUNT=10
# Logging
LOG_FILE_DATE="$(date --utc '+%F--%H-%M')"
LOG_BEFORE_FILE="log-$LOG_FILE_DATE-A-before.log"
LOG_AFTER_FILE="log-$LOG_FILE_DATE-B-after.log"
LOG_RESULTS_FILE="log-$LOG_FILE_DATE-results.log"
# Bind mount
PODMAN_MNT_DIR="$HOME/podman_test_mnt"
PODMAN_MNT_DIR_SEAFILE="$PODMAN_MNT_DIR/seafilebox-data"
PODMAN_MNT_DIR_MYSQL="$PODMAN_MNT_DIR/seafilebox-mysql/db"
# > Seahub configuration file
PODMAN_MNT_DIR_SEAFILE_CONFIGURED="$PODMAN_MNT_DIR_SEAFILE/seafile/conf/seahub_settings.py"
# Port
PODMAN_BIND_PORT="8066"
# Log level
PODMAN_LOG_LEVEL="debug"
# debug
# error - default
## SELinux labeling
## See https://prefetch.net/blog/2017/09/30/using-docker-volumes-on-selinux-enabled-servers/
##PODMAN_SELINUX_VOLUME_OPTION=":Z"
# This doesn't seem to work with Podman, just manually set context instead.
# ---

# Podman rootless port bind test case
echotime ()
{
	echo "$(date --utc --rfc-3339=seconds) $*"
}

dump_sys_info ()
{
	echo "sysctl --all"
	sudo sysctl --all
	echo "---"
	if [[ $OS == "ubuntu" ]]; then
		echo "dpkg --get-selections"
		dpkg --get-selections
	elif [[ $OS == "fedora" ]]; then
		echo "dnf list installed"
		dnf list installed
	else
		echo "[skipping package list, unknown OS]"
	fi
}

podman_launch_seafile ()
{
	if [[ "$PODMAN_USE_POD_GROUP" == "true" ]]; then
		echotime "Starting pod group..."
		podman_launch_seafile_pod
	else
		echotime "Starting ungrouped containers..."
		podman_launch_seafile_ungrouped
	fi

	if [[ "$PODMAN_USE_POD_GROUP" == "true" ]]; then
		# Wait for the initial configuration to complete
		if [ ! -f "$PODMAN_MNT_DIR_SEAFILE_CONFIGURED" ]; then
			echotime "Waiting for configuration (expect up to 5 minutes)"
			while [ ! -f "$PODMAN_MNT_DIR_SEAFILE_CONFIGURED" ]; do
				sleep 5
				echo -n "."
			done
			echo
		fi
	else
		# Don't wait, do nothing
		:
		# FIXME: Figure out container<->host<->container networking to allow
		# Seafile to access MySQL and memcached.
		#
		# Even without waiting for initial setup, Podman still appears to
		# successfully be intermittent at publishing/binding ports.
		#
		# Despite following both...
		# https://www.redhat.com/sysadmin/container-networking-podman
		# ...and...
		# https://hacklog.in/understand-podman-networking/
		# ...this didn't seem to work.
	fi
	# Wait a short while for nginx inside container to be ready
	# Might not be needed
	echotime "Waiting for container startup..."
	sleep 30s
}

podman_launch_seafile_pod ()
{
	# Adapted from https://download.seafile.com/published/seafile-manual/docker/deploy%20seafile%20with%20docker.md
	# And https://download.seafile.com/d/320e8adf90fa43ad8fee/files/?p=/docker/docker-compose.yml
	#
	# Pod
	echotime "Creating pod 'seafilebox'"
	podman --log-level "$PODMAN_LOG_LEVEL" pod create --name="seafilebox" --publish="$PODMAN_BIND_PORT:80"
	#
	# MySQL
	echotime "Creating container 'seafilebox-mysql'"
	podman --log-level "$PODMAN_LOG_LEVEL" create --name="seafilebox-mysql" --pod="seafilebox" --env="MYSQL_ROOT_PASSWORD=mysql-root-test-pass" --env="MYSQL_LOG_CONSOLE=true" --mount type=bind,source=$PODMAN_MNT_DIR_MYSQL,destination=/var/lib/mysql docker.io/mariadb:10.1
	echotime "Starting container 'seafilebox-mysql'"
	podman --log-level "$PODMAN_LOG_LEVEL" start "seafilebox-mysql"
	#
	# Memcached
	echotime "Creating container 'seafilebox-memcached'"
	podman --log-level "$PODMAN_LOG_LEVEL" create --name="seafilebox-memcached" --pod="seafilebox" docker.io/memcached:alpine memcached -m 256
	echotime "Starting container 'seafilebox-memcached'"
	podman --log-level "$PODMAN_LOG_LEVEL" start "seafilebox-memcached"
	#
	# Seafile
	echotime "Creating container 'seafilebox-seafile'"
	podman --log-level "$PODMAN_LOG_LEVEL" create --name="seafilebox-seafile" --pod="seafilebox" --env="DB_HOST=seafilebox" --env="DB_ROOT_PASSWD=mysql-root-test-pass" --env="TIME_ZONE=Etc/UTC" --env="[email protected]" --env="SEAFILE_ADMIN_PASSWORD=seafile-test-pass" --env="SEAFILE_SERVER_LETSENCRYPT=false" --env="SEAFILE_SERVER_HOSTNAME=example.invalid" --mount type=bind,source=$PODMAN_MNT_DIR_SEAFILE,destination=/shared docker.io/seafileltd/seafile-mc:latest
	echotime "Starting container 'seafilebox-seafile'"
	podman --log-level "$PODMAN_LOG_LEVEL" start "seafilebox-seafile"
}

podman_launch_seafile_ungrouped ()
{
	# Adapted from https://download.seafile.com/published/seafile-manual/docker/deploy%20seafile%20with%20docker.md
	# And https://download.seafile.com/d/320e8adf90fa43ad8fee/files/?p=/docker/docker-compose.yml
	#
	# MySQL
	echotime "Creating container 'seafilebox-mysql'"
	podman --log-level "$PODMAN_LOG_LEVEL" create --name="seafilebox-mysql" --publish="3306:3306" --env="MYSQL_ROOT_PASSWORD=mysql-root-test-pass" --env="MYSQL_LOG_CONSOLE=true" --mount type=bind,source=$PODMAN_MNT_DIR_MYSQL,destination=/var/lib/mysql docker.io/mariadb:10.1
	echotime "Starting container 'seafilebox-mysql'"
	podman --log-level "$PODMAN_LOG_LEVEL" start "seafilebox-mysql"
	#
	# Memcached
	echotime "Creating container 'seafilebox-memcached'"
	podman --log-level "$PODMAN_LOG_LEVEL" create --name="seafilebox-memcached" --publish="11211:11211" docker.io/memcached:alpine memcached -m 256
	echotime "Starting container 'seafilebox-memcached'"
	podman --log-level "$PODMAN_LOG_LEVEL" start "seafilebox-memcached"
	#
	# Get IP address
	# See https://www.redhat.com/sysadmin/container-networking-podman
	# And https://hacklog.in/understand-podman-networking/
	# And https://stackoverflow.com/questions/13322485/how-to-get-the-primary-ip-address-of-the-local-machine-on-linux-and-os-x
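	# hostname --all-ip-addresses may print several space-separated addresses;
	# use only the first one so DB_HOST is a single address
	local SYSTEM_IP
	SYSTEM_IP="$(hostname --all-ip-addresses | awk '{print $1}')"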
	#
	# Allow access to the host network with "-P"/"--publish-all"
	# This is a potential security risk, but simplifies networking for the sake of testing
	# See https://www.redhat.com/sysadmin/container-networking-podman
	#
	# Seafile
	echotime "Creating container 'seafilebox-seafile'"
	podman --log-level "$PODMAN_LOG_LEVEL" create --name="seafilebox-seafile" --publish="$PODMAN_BIND_PORT:80" --env="DB_HOST=$SYSTEM_IP" --env="DB_ROOT_PASSWD=mysql-root-test-pass" --env="TIME_ZONE=Etc/UTC" --env="[email protected]" --env="SEAFILE_ADMIN_PASSWORD=seafile-test-pass" --env="SEAFILE_SERVER_LETSENCRYPT=false" --env="SEAFILE_SERVER_HOSTNAME=example.invalid" --mount type=bind,source=$PODMAN_MNT_DIR_SEAFILE,destination=/shared docker.io/seafileltd/seafile-mc:latest
	echotime "Starting container 'seafilebox-seafile'"
	podman --log-level "$PODMAN_LOG_LEVEL" start "seafilebox-seafile"
}

podman_stop_seafile ()
{
	if [[ "$PODMAN_USE_POD_GROUP" == "true" ]]; then
		echotime "Stopping pod group..."
		podman_stop_seafile_pod
	else
		echotime "Stopping ungrouped containers..."
		podman_stop_seafile_ungrouped
	fi

	# Wait a short while for containers to stop
	# Might not be needed
	echotime "Waiting for container stop..."
	sleep 5s
	## Work around bug https://github.com/containers/libpod/issues/3222
	#podman stop -a; kill -9 $(cat $XDG_RUNTIME_DIR/libpod/pause.pid); rm $XDG_RUNTIME_DIR/libpod/pause.pid
	# This doesn't seem to be the issue, either.
}

podman_stop_seafile_pod ()
{
	# Stop
	podman --log-level "$PODMAN_LOG_LEVEL" pod stop "seafilebox" || echo "Pod 'seafilebox' unavailable"
	# Remove
	podman --log-level "$PODMAN_LOG_LEVEL" rm "seafilebox-seafile" || echo "Container 'seafilebox-seafile' unavailable"
	podman --log-level "$PODMAN_LOG_LEVEL" rm "seafilebox-memcached" || echo "Container 'seafilebox-memcached' unavailable"
	podman --log-level "$PODMAN_LOG_LEVEL" rm "seafilebox-mysql" || echo "Container 'seafilebox-mysql' unavailable"
	podman --log-level "$PODMAN_LOG_LEVEL" pod rm "seafilebox" || echo "Could not remove pod 'seafilebox'"
}

podman_stop_seafile_ungrouped ()
{
	# Stop
	podman --log-level "$PODMAN_LOG_LEVEL" stop "seafilebox-seafile" || echo "Container 'seafilebox-seafile' unavailable"
	podman --log-level "$PODMAN_LOG_LEVEL" stop "seafilebox-memcached" || echo "Container 'seafilebox-memcached' unavailable"
	podman --log-level "$PODMAN_LOG_LEVEL" stop "seafilebox-mysql" || echo "Container 'seafilebox-mysql' unavailable"
	# Remove
	podman --log-level "$PODMAN_LOG_LEVEL" rm "seafilebox-seafile" || echo "Container 'seafilebox-seafile' unavailable"
	podman --log-level "$PODMAN_LOG_LEVEL" rm "seafilebox-memcached" || echo "Container 'seafilebox-memcached' unavailable"
	podman --log-level "$PODMAN_LOG_LEVEL" rm "seafilebox-mysql" || echo "Container 'seafilebox-mysql' unavailable"
	podman --log-level "$PODMAN_LOG_LEVEL" pod rm "seafilebox" || echo "Could not remove pod 'seafilebox'"
}

podman_check_seafile_listening ()
{
	if ! netstat --listen --numeric | grep ":$PODMAN_BIND_PORT"; then
		echotime "Nothing listening on port :$PODMAN_BIND_PORT" >&2
		return 1
	fi

	if [[ "$PODMAN_USE_POD_GROUP" != "true" ]]; then
		# Don't check curl, assume success if port is listening
		return 0
		# FIXME: Figure out container<->host<->container networking to allow
		# Seafile to access MySQL and memcached.
	fi

	if ! curl localhost:$PODMAN_BIND_PORT >/dev/null; then
		echotime "No valid response on localhost:$PODMAN_BIND_PORT" >&2
		return 1
	fi
	return 0
}

echotime "Test case for Podman:"
echo "1.  Install updates and reboot - do that before running this script"
echo "2.  Gather debugging information"
echo "3.  Install podman and dependencies"
echo "4.  Ensure current user has adequate subuids/subgids"
echo "5.  Set up a directory for bind mounts"
echo "6.  Create a pod and containers for Seafile"
echo "7.  Check for a listening server on :$PODMAN_BIND_PORT"
echo "8.  Stop and remove the pod/containers"
echo "9.  Repeat steps 6-8 several times, tracking results"
echo "10. Gather debugging information"
echo "11. Further manual troubleshooting: try rebooting, etc"
echo
echo
if [[ "$PODMAN_USE_POD_GROUP" == "true" ]]; then
	echo "Working with pod group"
else
	echo "Working with ungrouped containers"
fi
echo
echo "Press Enter to begin, Ctrl-C to cancel."
read

# Step 2
echotime "2.  Gather debugging information"
echo "System information" >> "$LOG_BEFORE_FILE"
dump_sys_info >> "$LOG_BEFORE_FILE"

# Step 3
echotime "3.  Install podman and dependencies"
if [[ $OS == "ubuntu" ]]; then
	sudo add-apt-repository --yes "ppa:projectatomic/ppa"
	sudo apt update
	sudo apt install --yes podman slirp4netns uidmap containernetworking-plugins
	# sudo apt install runc #..?
elif [[ $OS == "fedora" ]]; then
	sudo dnf install --assumeyes podman slirp4netns containernetworking-plugins
	# uidmap seems to be installed by default
else
	echo "[skipping package install, unknown OS]"
fi

# Step 4
echotime "4.  Ensure current user has adequate subuids/subgids"
if ! grep --quiet "$(whoami):" /etc/subuid /etc/subgid ; then
	# If not, run something like this
	sudo usermod --add-subuids 296608-362143 --add-subgids 296608-362143 "$(whoami)"
fi

# Step 5
echotime "5.  Set up a directory for bind mounts"
mkdir --parents "$PODMAN_MNT_DIR_SEAFILE" "$PODMAN_MNT_DIR_MYSQL"
if command -v selinuxenabled >/dev/null 2>&1; then
	if selinuxenabled ; then
		echotime "Set up SELinux contexts for bind mounts"
		chcon -Rt svirt_sandbox_file_t "$PODMAN_MNT_DIR_SEAFILE"
		chcon -Rt svirt_sandbox_file_t "$PODMAN_MNT_DIR_MYSQL"
	fi
fi

echotime "Start measurements" >> "$LOG_RESULTS_FILE"

# Step 9
echotime "9.  Repeat steps 6-8 several times, tracking results"
for ((iteration=1; iteration<=TEST_COUNT; iteration++)); do
	echo "### ### ### ### ### ###"
	echotime "[$iteration/$TEST_COUNT] Performing steps 6-8"
	echo "--- --- --- --- --- ---"

	# Step 6
	echotime "[$iteration/$TEST_COUNT] 6.  Create a pod and containers for Seafile"
	podman_launch_seafile

	# Step 7
	echotime "[$iteration/$TEST_COUNT] 7.  Check for a listening server on :$PODMAN_BIND_PORT"
	if podman_check_seafile_listening; then
		echotime "[$iteration/$TEST_COUNT] Local server listening - hooray!"
		echotime "[$iteration/$TEST_COUNT] Local server listening - hooray!" >> "$LOG_RESULTS_FILE"
	else
		echotime "[$iteration/$TEST_COUNT] [!] No listening local server"
		echotime "[$iteration/$TEST_COUNT] [!] No listening local server" >> "$LOG_RESULTS_FILE"
	fi

	# Step 8
	echotime "[$iteration/$TEST_COUNT] 8.  Stop and remove the pod/containers"
	podman_stop_seafile
	echo "### ### ### ### ### ###"
done

echotime "Stop measurements" >> "$LOG_RESULTS_FILE"

# Step 10
echotime "10. Gather debugging information"
echo "System information" >> "$LOG_AFTER_FILE"
dump_sys_info >> "$LOG_AFTER_FILE"

# Step 11
echotime "11. Further manual troubleshooting: try rebooting, etc"

echo "...done!"
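
One possible hardening of the script above: the loop that waits for seahub_settings.py has no upper bound, so a container that never finishes its initial setup would stall the whole run. A bounded wait could look like the following sketch. The function `wait_for_file` and its defaults are hypothetical and not part of the original script; the 300-second default mirrors the script's "expect up to 5 minutes" message.

```shell
#!/bin/bash
# wait_for_file PATH [TIMEOUT_SECONDS] [INTERVAL_SECONDS]
# Polls until PATH exists as a regular file. Returns 0 once it appears,
# or 1 if roughly TIMEOUT_SECONDS elapse first.
wait_for_file ()
{
	local path="$1" timeout="${2:-300}" interval="${3:-5}" waited=0
	while [ ! -f "$path" ]; do
		if [ "$waited" -ge "$timeout" ]; then
			return 1
		fi
		sleep "$interval"
		waited=$((waited + interval))
	done
	return 0
}
```

Within the script, the inner while loop could then become something like `wait_for_file "$PODMAN_MNT_DIR_SEAFILE_CONFIGURED" 300 5 || echotime "Configuration timed out" >&2`, so a stuck setup is recorded rather than blocking the test.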

Results

Fedora 31, VirtualBox

log-2019-11-23--03-52-results.log

2019-11-23 03:53:58+00:00 Start measurements
2019-11-23 03:55:45+00:00 [1/10] Local server listening - hooray!
2019-11-23 03:56:28+00:00 [2/10] Local server listening - hooray!
2019-11-23 03:57:10+00:00 [3/10] Local server listening - hooray!
2019-11-23 03:57:53+00:00 [4/10] Local server listening - hooray!
2019-11-23 03:58:37+00:00 [5/10] Local server listening - hooray!
2019-11-23 03:59:19+00:00 [6/10] Local server listening - hooray!
2019-11-23 04:00:01+00:00 [7/10] Local server listening - hooray!
2019-11-23 04:00:44+00:00 [8/10] Local server listening - hooray!
2019-11-23 04:01:27+00:00 [9/10] Local server listening - hooray!
2019-11-23 04:02:10+00:00 [10/10] Local server listening - hooray!
2019-11-23 04:02:20+00:00 Stop measurements

Command line output, including podman --log-level debug

System details before, sysctl dump and package list

System details after, sysctl dump and package list

Ubuntu 18.04, VirtualBox

log-2019-11-23--03-14-results.log

2019-11-23 03:14:49+00:00 Start measurements
2019-11-23 03:17:28+00:00 [1/10] Local server listening - hooray!
2019-11-23 03:18:35+00:00 [2/10] Local server listening - hooray!
2019-11-23 03:19:45+00:00 [3/10] [!] No listening local server
2019-11-23 03:20:47+00:00 [4/10] Local server listening - hooray!
2019-11-23 03:21:43+00:00 [5/10] Local server listening - hooray!
2019-11-23 03:22:37+00:00 [6/10] Local server listening - hooray!
2019-11-23 03:23:33+00:00 [7/10] [!] No listening local server
2019-11-23 03:24:28+00:00 [8/10] Local server listening - hooray!
2019-11-23 03:25:20+00:00 [9/10] [!] No listening local server
2019-11-23 03:26:17+00:00 [10/10] Local server listening - hooray!
2019-11-23 03:26:28+00:00 Stop measurements

Command line output, including podman --log-level debug

System details before, sysctl dump and package list

System details after, sysctl dump and package list

Ubuntu 18.04, Hetzner Cloud VPS

log-2019-11-23--03-47-results.log

2019-11-23 03:47:48+00:00 Start measurements
2019-11-23 03:50:38+00:00 [1/10] [!] No listening local server
2019-11-23 03:51:35+00:00 [2/10] [!] No listening local server
2019-11-23 03:52:34+00:00 [3/10] [!] No listening local server
2019-11-23 03:53:31+00:00 [4/10] [!] No listening local server
2019-11-23 03:54:30+00:00 [5/10] [!] No listening local server
2019-11-23 03:55:30+00:00 [6/10] [!] No listening local server
2019-11-23 03:56:29+00:00 [7/10] [!] No listening local server
2019-11-23 03:57:27+00:00 [8/10] [!] No listening local server
2019-11-23 03:58:26+00:00 [9/10] [!] No listening local server
2019-11-23 03:59:24+00:00 [10/10] [!] No listening local server
2019-11-23 03:59:34+00:00 Stop measurements

Note: this has worked in the past, but only rarely.

Command line output, including podman --log-level debug

System details before, sysctl dump and package list

System details after, sysctl dump and package list

openshift-ci-robot added the kind/bug label Nov 23, 2019

rhatdan (Member) commented Nov 23, 2019

@AkihiroSuda Any ideas?

AkihiroSuda (Collaborator) commented

Is this specific to pods?

digitalcircuit (Author) commented Nov 23, 2019

Thank you for taking a look! Despite many fruitless hours of experimenting, I hadn't considered rewriting the test to avoid pods and their shared networking, whoops.

As soon as possible, I'll edit the test script to not create a pod and see what happens. In hindsight, I should have also tested the Ubuntu 19.10 server image (with and/or without pod) just in case it's fixed there.

Edit: the Podman PPA does not have packages for Eoan/19.10.

(With my current understanding of Podman, that should involve removing the pod, changing specified hosts in the containers, then adding the needed --publish declarations to each container, e.g. seafilebox-seafile with the same 8066:80 mapping, seafilebox-mysql with the MySQL port, and seafilebox-memcached exposing the memcached port.)

digitalcircuit (Author) commented

@AkihiroSuda This does not appear to be specific to pods.

I have modified the test script to add a mode using ungrouped containers (set PODMAN_USE_POD_GROUP=false: no pod is created, and each container publishes the ports it needs). Even while still running the commands manually, the first port bind (MySQL to 3306) worked, while the second (memcached to 11211) failed the first time but worked the second time, after stopping, removing, re-creating, and starting the container.

I did this testing on Ubuntu 18.04 LTS x64 in VirtualBox, skipping testing the other two situations (Fedora and Hetzner Cloud VPS) as it didn't seem like it'd differ. I can test those as well if it would be helpful.

Pod-less test commands

Start

# Create directories/etc, as per the original issue steps
# This includes...
mkdir --parents "$HOME/podman_test_mnt/seafilebox-data" "$HOME/podman_test_mnt/seafilebox-mysql/db"
#
# MySQL
podman --log-level "debug" create --name="seafilebox-mysql" --publish="3306:3306" --env="MYSQL_ROOT_PASSWORD=mysql-root-test-pass" --env="MYSQL_LOG_CONSOLE=true" --mount type=bind,source=$HOME/podman_test_mnt/seafilebox-mysql/db,destination=/var/lib/mysql docker.io/mariadb:10.1
podman --log-level "debug" start "seafilebox-mysql"
#
# Memcached
podman --log-level "debug" create --name="seafilebox-memcached" --publish="11211:11211" docker.io/memcached:alpine memcached -m 256
podman --log-level "debug" start "seafilebox-memcached"
#
# Get IP address
# See https://www.redhat.com/sysadmin/container-networking-podman
# And https://hacklog.in/understand-podman-networking/
# And https://stackoverflow.com/questions/13322485/how-to-get-the-primary-ip-address-of-the-local-machine-on-linux-and-os-x
# hostname may print several space-separated addresses; keep only the first one
SYSTEM_IP="$(hostname --all-ip-addresses | awk '{print $1}')"
#
# Seafile
podman --log-level "debug" create --name="seafilebox-seafile" --publish="8066:80" --env="DB_HOST=$SYSTEM_IP" --env="DB_ROOT_PASSWD=mysql-root-test-pass" --env="TIME_ZONE=Etc/UTC" --env="[email protected]" --env="SEAFILE_ADMIN_PASSWORD=seafile-test-pass" --env="SEAFILE_SERVER_LETSENCRYPT=false" --env="SEAFILE_SERVER_HOSTNAME=example.invalid" --mount type=bind,source=$HOME/podman_test_mnt/seafilebox-data,destination=/shared docker.io/seafileltd/seafile-mc:latest
podman --log-level "debug" start "seafilebox-seafile"

Check

netstat --listen --numeric | grep ":8066"
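The grep above also matches longer port numbers (such as :80660) and peer addresses. As a stricter alternative, the sketch below reads socket listings from standard input and succeeds only if a LISTEN line has a local address ending in exactly the given port. The function name `is_listening` is made up for this example, and it assumes the column layout printed by `ss --listening --tcp --numeric`, where the local address is the fourth field.

```shell
#!/usr/bin/env bash
# Hypothetical helper: exit status 0 if stdin contains a LISTEN socket
# whose local address (4th column, as in "ss -ltn" output) ends in :PORT.
is_listening() {
    local port="$1"
    awk -v suffix=":$port" '
        # Compare only the tail of the local address, so :8066 does not match :80660
        $1 == "LISTEN" && substr($4, length($4) - length(suffix) + 1) == suffix { found = 1 }
        END { exit !found }
    '
}
```

Usage, assuming `ss` from iproute2 is installed: `ss --listening --tcp --numeric | is_listening 8066 && echo "port 8066 is published"`.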

Stop

# Stop
podman --log-level "debug" stop "seafilebox-seafile"
podman --log-level "debug" stop "seafilebox-memcached"
podman --log-level "debug" stop "seafilebox-mysql"
# Remove
podman --log-level "debug" rm "seafilebox-seafile"
podman --log-level "debug" rm "seafilebox-memcached"
podman --log-level "debug" rm "seafilebox-mysql"
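# Only applies if the "seafilebox" pod from the original steps still exists; skip this in pod-less mode
podman --log-level "debug" pod rm "seafilebox"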

Updated podman-rootless-port-test-seafilebox.sh command

Update your local copy of the script according to the source in the first comment of this issue.

PODMAN_USE_POD_GROUP=false ./podman-rootless-port-test-seafilebox.sh

log-2019-11-24--06-22-results.log

2019-11-24 06:22:31+00:00 Start measurements
2019-11-24 06:23:22+00:00 [1/10] Local server listening - hooray!
2019-11-24 06:24:20+00:00 [2/10] Local server listening - hooray!
2019-11-24 06:25:09+00:00 [3/10] Local server listening - hooray!
2019-11-24 06:25:58+00:00 [4/10] [!] No listening local server
2019-11-24 06:26:48+00:00 [5/10] Local server listening - hooray!
2019-11-24 06:27:40+00:00 [6/10] [!] No listening local server
2019-11-24 06:28:31+00:00 [7/10] Local server listening - hooray!
2019-11-24 06:29:21+00:00 [8/10] Local server listening - hooray!
2019-11-24 06:30:11+00:00 [9/10] Local server listening - hooray!
2019-11-24 06:31:02+00:00 [10/10] [!] No listening local server
2019-11-24 06:31:12+00:00 Stop measurements
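To compare runs without counting lines by hand, a results log in the format above can be tallied with a small helper. This is a sketch only: the function name `summarize_results` is hypothetical, and it assumes the two message strings shown in the log stay unchanged.

```shell
#!/usr/bin/env bash
# Hypothetical helper: counts successful and failed attempts in a results log.
summarize_results() {
    local log_file="$1"
    local passed failed
    # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true"
    passed="$(grep -c 'Local server listening' "$log_file" || true)"
    failed="$(grep -c 'No listening local server' "$log_file" || true)"
    printf '%s passed, %s failed\n' "$passed" "$failed"
}
```

For example, `summarize_results log-2019-11-24--06-22-results.log` tallies the run shown above.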

Aside on incomplete nature of pod-less setup

I haven't yet figured out how to get the containers to talk to each other in this setup: neither the guide from Red Hat nor another guide to Podman networking (which covers running without pods) appears to work. However, Podman publishes or fails to publish the port regardless, so I can confirm that port publishing/binding is still intermittent without pods on Ubuntu 18.04 in VirtualBox.

I wouldn't be surprised if I'm doing things incorrectly, as this is my very first experience with any sort of container system, let alone Podman. I do appreciate all the effort towards getting rootless containers working across distributions!

@mheon
Member

mheon commented Nov 26, 2019

@giuseppe @AkihiroSuda Any ideas here? Does it look like we're looking at a Slirp issue, or something in Podman itself?

@github-actions

This issue had no activity for 30 days. In the absence of activity or the "do-not-close" label, the issue will be automatically closed within 7 days.

@digitalcircuit
Author

I've been waiting to retest until the upcoming RootlessKit port forwarder pull request is merged; I'm leaving this comment to keep the issue from being closed in the meantime.

If desired, I can retest before the RootlessKit PR is merged.

@AkihiroSuda
Collaborator

Thanks, testing is highly appreciated

AkihiroSuda added a commit to AkihiroSuda/libpod that referenced this issue Jan 8, 2020
RootlessKit port forwarder has a lot of advantages over the slirp4netns port forwarder:

* Very high throughput.
  Benchmark result on Travis: socat: 5.2 Gbps, slirp4netns: 8.3 Gbps, RootlessKit: 27.3 Gbps
  (https://travis-ci.org/rootless-containers/rootlesskit/builds/597056377)

* Connections from the host are treated as 127.0.0.1 rather than 10.0.2.2 in the namespace.
  No UDP issue (containers#4586)

* No tcp_rmem issue (containers#4537)

* Probably works with IPv6. Even if not, it is trivial to support IPv6.  (containers#4311)

* Easily extensible for future support of SCTP

* Easily extensible for future support of `lxc-user-nic` SUID network

RootlessKit port forwarder has already been adopted as the default port forwarder by Rootless Docker/Moby,
and no issue has been reported AFAIK.

As the port forwarder is imported as a Go package, no `rootlesskit` binary is required for Podman.

Fix containers#4586
May-fix containers#4559
Fix containers#4537
May-fix containers#4311

See https://github.com/rootless-containers/rootlesskit/blob/v0.7.0/pkg/port/builtin/builtin.go

Signed-off-by: Akihiro Suda <[email protected]>
@digitalcircuit
Author

I've updated the test script in the original issue comment, podman-rootless-port-test-seafilebox.sh, to follow the new Ubuntu instructions in Podman's install.md (install.md#ubuntu).

With the new rootless networking backend, this works 10 out of 10 times before rebooting, and 8 out of 10 times after rebooting.

podman@ubuntu:~$ tail -f log-2020-01-28--22-57-results.log 
2020-01-28 22:57:41+00:00 Start measurements
2020-01-28 23:01:35+00:00 [1/10] Local server listening - hooray!
2020-01-28 23:02:37+00:00 [2/10] Local server listening - hooray!
2020-01-28 23:03:44+00:00 [3/10] Local server listening - hooray!
2020-01-28 23:04:45+00:00 [4/10] Local server listening - hooray!
2020-01-28 23:05:45+00:00 [5/10] Local server listening - hooray!
2020-01-28 23:06:38+00:00 [6/10] Local server listening - hooray!
2020-01-28 23:07:36+00:00 [7/10] Local server listening - hooray!
2020-01-28 23:08:34+00:00 [8/10] Local server listening - hooray!
2020-01-28 23:09:26+00:00 [9/10] Local server listening - hooray!
2020-01-28 23:10:19+00:00 [10/10] Local server listening - hooray!
2020-01-28 23:10:30+00:00 Stop measurements

[Reboot, repeat tests]

podman@ubuntu:~$ tail -f log-2020-01-28--23-16-results.log
2020-01-28 23:16:42+00:00 Start measurements
2020-01-28 23:17:27+00:00 [1/10] [!] No listening local server
2020-01-28 23:18:19+00:00 [2/10] Local server listening - hooray!
2020-01-28 23:19:17+00:00 [3/10] Local server listening - hooray!
2020-01-28 23:20:22+00:00 [4/10] Local server listening - hooray!
2020-01-28 23:21:13+00:00 [5/10] [!] No listening local server
2020-01-28 23:22:06+00:00 [6/10] Local server listening - hooray!
2020-01-28 23:23:04+00:00 [7/10] Local server listening - hooray!
2020-01-28 23:23:59+00:00 [8/10] Local server listening - hooray!
2020-01-28 23:24:56+00:00 [9/10] Local server listening - hooray!
2020-01-28 23:25:56+00:00 [10/10] Local server listening - hooray!
2020-01-28 23:26:07+00:00 Stop measurements

For now, I'd consider this specific issue resolved, and I'll file a new issue after I investigate further. I suspect the remaining failures have something to do with the reboot and with things not being cleaned up properly.

Thank you to everyone who contributed to investigation, and thank you @AkihiroSuda for the effort in switching Podman's rootless networking backend. 🙂

Apologies for the long delay; I've had several things to manage over the holidays and new year.

@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 23, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 23, 2023