test/system/250-systemd.bats: fix flake #16852

Closed
wants to merge 1 commit into from

vrothberg
Member

Fix a flake in the kube-template test. After stopping the service, we want to make sure that the service container gets removed. However, there is a small race window: systemctl stop returns when the service container exits. Between that and the container exists check, the service container may not yet have been removed. Hence, add a loop to account for that race.
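
The retry loop proposed here might look roughly like the sketch below. The helper name `wait_for_removal`, its arguments, and the timeout code 124 are illustrative assumptions, not the PR's actual diff. It relies on `podman container exists` exiting 0 when the container exists and 1 when it does not; any other exit code is passed through as an error.

```bash
# Hypothetical helper: poll a check command until it reports "not found".
#   $1      maximum number of attempts
#   $2      delay between attempts, in seconds
#   $3...   check command; exit 0 = still exists, exit 1 = gone
wait_for_removal() {
    local tries="$1" delay="$2"
    shift 2
    local i=0 rc
    while [ "$i" -lt "$tries" ]; do
        rc=0
        "$@" || rc=$?
        case "$rc" in
            0) sleep "$delay" ;;   # still exists: wait, then retry
            1) return 0 ;;         # gone: the cleanup has finished
            *) return "$rc" ;;     # unexpected error: pass it through
        esac
        i=$((i + 1))
    done
    return 124                     # attempts exhausted (assumed convention)
}
```

In the test it could be called as `wait_for_removal 6 1 podman container exists "$service_container"`. Separating exit code 1 from other failures keeps a genuine podman error from being mistaken for a successful removal.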

Fixes: #16047
Signed-off-by: Valentin Rothberg [email protected]

Does this PR introduce a user-facing change?

None

Fix a flake in the kube-template test.  After stopping the service, we
want to make sure that the service container gets removed.  However,
there is a small race window. `systemctl stop` will return when the
service container _exits_.  In between that and the `container exists`
check, the service container may not yet have been removed.  Hence, add
a loop to account for that race.

Fixes: containers#16047
Signed-off-by: Valentin Rothberg <[email protected]>
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress, release-note-none, and approved labels Dec 15, 2022
@vrothberg vrothberg marked this pull request as ready for review December 15, 2022 13:54
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress label Dec 15, 2022
@vrothberg
Member Author

@containers/podman-maintainers PTAL

Member

@giuseppe giuseppe left a comment

LGTM

@openshift-ci
Contributor

openshift-ci bot commented Dec 15, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: giuseppe, vrothberg

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Member

@Luap99 Luap99 left a comment

I always thought systemctl stop and systemctl start are blocking? So I assume the commands in the unit should all have finished by the time the systemctl command finishes.

@vrothberg
Member Author

I always thought systemctl stop and systemctl start are blocking? So I assume the commands in the unit should all have finished by the time the systemctl command finishes.

You made me doubt. I will do some more/better research with more coffee tomorrow..

/hold

@openshift-ci openshift-ci bot added the do-not-merge/hold label Dec 15, 2022
@Luap99
Member

Luap99 commented Dec 15, 2022

I found this in systemctl(1):

--no-block
Do not synchronously wait for the requested operation to finish. If this is not specified, the job will be verified, enqueued and systemctl will wait until the unit's start-up is completed. By passing this argument, it is only verified and enqueued. This option may not be combined with --wait.

It only mentions the start part but not stop.
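
One way to check this empirically is to stop the unit without `--no-block` and ask systemd for the unit's state immediately afterwards. The sketch below is only an illustration: `stop_and_report` and the `SYSTEMCTL` override variable are hypothetical names (the override exists only so the function can be exercised without a running systemd), and whether the stop job has fully completed at that point is exactly the open question here.

```bash
# Stop a unit synchronously (no --no-block) and print the state that
# systemd reports right after the stop command returns.
stop_and_report() {
    local unit="$1"
    local ctl="${SYSTEMCTL:-systemctl}"   # hypothetical override hook

    "$ctl" stop "$unit" || return $?

    # "is-active" prints the state (for example "inactive") and exits
    # non-zero when the unit is not active, so its status is ignored.
    "$ctl" is-active "$unit" || true
}
```

If `stop_and_report "$service_name"` consistently prints `inactive`, the stop job (including the unit's stop commands) has most likely completed before `systemctl stop` returned.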

@@ -443,7 +443,14 @@ EOF

# Clean up
systemctl stop $service_name
run_podman 1 container exists $service_container
for i in {0..5}; do
Member

Could you do a podman wait $service_container?

Member Author

Not yet. Once #16853 is in, we can.

vrothberg added a commit to vrothberg/libpod that referenced this pull request Dec 16, 2022
In the recent past, I met the frequent need to wait for a container to
exist that, at the same time, may get removed (e.g., system tests in [1]).

Add an `--ignore` option to podman-wait which will ignore errors when a
specified container is missing and mark its exit code as -1.  Also
remove ID fields from the WaitReport.  It is actually not used by
callers and removing it makes the code simpler and faster.

Once merged, we can go over the tests and simplify them.

[1] github.com/containers/pull/16852

Signed-off-by: Valentin Rothberg <[email protected]>
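
As a rough sketch of how a test might use the option described in that commit message: the helper below assumes `podman wait --ignore` is available and prints -1 for a missing container, as the message states. The helper name `wait_ignore_missing` and the `PODMAN` override variable are assumptions made for illustration.

```bash
# Wait for a container to stop, tolerating that it may already be removed.
wait_ignore_missing() {
    local ctr="$1"
    local podman="${PODMAN:-podman}"   # assumed override hook
    local code

    # --ignore: per the commit message, a missing container is not an
    # error; its exit code is reported as -1 instead.
    code=$("$podman" wait --ignore "$ctr") || return $?

    if [ "$code" = "-1" ]; then
        echo "container $ctr was already removed"
    else
        echo "container $ctr exited with status $code"
    fi
}
```

Either outcome means the container is no longer running, so a cleanup step would not need a separate retry loop around the existence check.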
@vrothberg
Member Author

Closing. The test is doing the right thing. systemctl stop is executing kube down, so the service container should have been removed already.

Must be something else but I was chasing a ghost.

@vrothberg vrothberg closed this Dec 16, 2022
@vrothberg vrothberg deleted the fix-16407 branch December 16, 2022 12:12
@github-actions github-actions bot added the locked - please file new issue/PR label Sep 17, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 17, 2023
Labels
approved, do-not-merge/hold, locked - please file new issue/PR, release-note-none
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[email protected] template: container exists, it shouldn't
4 participants