
CI: pods and container names in /etc/hosts: cleanup race? #23811

Open · edsantiago opened this issue Aug 29, 2024 · 2 comments
Labels: flakes (Flakes from Continuous Integration), stale-issue

edsantiago (Member) commented:
Almost certainly a test bug:

@test "podman pod manages /etc/hosts correctly" {
local pod_name=pod-$(random_string 10)
local infra_name=infra-$(random_string 10)
local con1_name=con1-$(random_string 10)
local con2_name=con2-$(random_string 10)
run_podman pod create --name $pod_name --infra-name $infra_name
pid="$output"
run_podman run --pod $pod_name --name $con1_name $IMAGE cat /etc/hosts
is "$output" ".*\s$pod_name $infra_name.*" "Pod hostname in /etc/hosts"
is "$output" ".*127.0.0.1\s$con1_name.*" "Container1 name in /etc/hosts"
# get the length of the hosts file
old_lines=${#lines[@]}
# since the first container should be cleaned up now we should only see the
# new host entry and the old one should be removed (lines check)
run_podman run --pod $pod_name --name $con2_name $IMAGE cat /etc/hosts
is "$output" ".*\s$pod_name $infra_name.*" "Pod hostname in /etc/hosts"
is "$output" ".*127.0.0.1\s$con2_name.*" "Container2 name in /etc/hosts"
is "${#lines[@]}" "$old_lines" "Number of hosts lines is equal"

(I can't link to my parallel version).

The test fails in parallel mode, and only on my laptop; I haven't seen it fail in CI. The failure is in the last two lines shown above: basically, container1 is still showing up in /etc/hosts.

Is this expected? I'm going to try adding --rm to the first podman run and see if the failure vanishes, but I'm not sure if that's the right thing to do.
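For reference, a sketch of that change (illustrative only, not a committed fix): the first run in the test above would become

    # Proposed tweak: --rm removes the container when it exits, which also
    # forces its /etc/hosts entry to be cleaned up before run_podman returns.
    run_podman run --rm --pod $pod_name --name $con1_name $IMAGE cat /etc/hosts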

edsantiago added the flakes (Flakes from Continuous Integration) label on Aug 29, 2024
edsantiago self-assigned this on Aug 29, 2024
Luap99 (Member) commented Aug 30, 2024:

Adding --rm will fix it, because the run will remove the container at the end, which means cleanup is done before podman run returns.

Without it, podman run ... waits for the container to exit; it does not wait for the container to be fully cleaned up. The actual cleanup happens via a podman container cleanup process in the background, which is what causes the race here. In practice it is a bit more complicated.
For a long-running process it will always work, because we wait for conmon to exit and conmon waits for the podman container cleanup process to finish first, so since #23601 cleanup is done in most cases. However, a short-running process such as cat might have exited before we call into WaitForExit(), so we no longer wait for conmon and exit earlier there, which is what #23646 was about.

We could try doing an explicit cleanup call at the end, but this wouldn't really work via the remote API, so I'd rather not do it for local podman only. The --rm fix should work.
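For illustration, a minimal sketch of the race described above, outside the test harness (pod, container, and image names here are placeholders, not from the suite):

    podman pod create --name racepod
    # 'cat' exits almost immediately, so podman run can return before the
    # background 'podman container cleanup' for c1 has finished.
    podman run --pod racepod --name c1 alpine cat /etc/hosts
    # If that cleanup has not run yet, c1's 127.0.0.1 entry may still be
    # present in the pod's hosts file seen by the next container.
    podman run --pod racepod --name c2 alpine cat /etc/hosts
    # Adding --rm to the first run removes (and cleans up) c1 before
    # podman run returns, which avoids the race.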


A friendly reminder that this issue had no activity for 30 days.
