fix: improved "containers/{name}/wait" endpoint #10271

Merged

Conversation

matejvasek
Contributor

@matejvasek matejvasek commented May 7, 2021

The previous implementation did not work with short-lived containers:
polling could miss the very brief moment the container was running.

The Events API is now used instead of polling.
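
To illustrate the difference, here is a minimal sketch, not Podman's actual code. The `State` type, `pollForState`, and `waitForState` are hypothetical names invented for this example. It models a poller that samples the state only at certain ticks, compared with a waiter that receives every state transition as an event.

```go
package main

import "fmt"

// State is a hypothetical container state, not Podman's real type.
type State string

const (
	Created State = "created"
	Running State = "running"
	Exited  State = "exited"
)

// pollForState models a poller that wakes up only at the given ticks
// (indices into the timeline) and checks the state at those moments.
func pollForState(timeline []State, ticks []int, want State) bool {
	for _, i := range ticks {
		if i < len(timeline) && timeline[i] == want {
			return true
		}
	}
	return false
}

// waitForState consumes state-change events in order and reports whether
// the wanted state was observed before the stream ended.
func waitForState(events <-chan State, want State) bool {
	for s := range events {
		if s == want {
			return true
		}
	}
	return false
}

func main() {
	// The container is "running" only during tick 1.
	timeline := []State{Created, Running, Exited, Exited, Exited}

	// A poller waking at ticks 0, 2 and 4 never observes "running".
	fmt.Println("polling saw running:", pollForState(timeline, []int{0, 2, 4}, Running))

	// An event stream delivers every transition, however brief.
	events := make(chan State, 3)
	for _, s := range []State{Created, Running, Exited} {
		events <- s
	}
	close(events) // closed by the producer side, only to end this demo
	fmt.Println("events saw running:", waitForState(events, Running))
}
```

With these inputs the poller reports `false` and the event waiter reports `true`, which is the failure mode described above.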

resolves #10256

@matejvasek
Contributor Author

/cc @mheon

@openshift-ci-robot openshift-ci-robot requested a review from mheon May 7, 2021 20:40
@matejvasek matejvasek force-pushed the fix-wait-next-exit branch from 4acf161 to dd614e4 Compare May 7, 2021 21:26
Member

@rhatdan rhatdan left a comment

One Nit

pkg/api/handlers/utils/containers.go (outdated, resolved)
@openshift-ci-robot
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: matejvasek, rhatdan

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 8, 2021
@mheon
Member

mheon commented May 8, 2021 via email

@matejvasek
Contributor Author

> Looks like this could leak goroutines - think you need to close the channel once you have the event so event reading terminates

@mheon Good catch. Closing the eventChannel would cause a panic, as the goroutine would try to write to a closed channel. The correct way to terminate the event watcher is to cancel the context passed to it. The context here is done once the HTTP request ends, so in the end there is no leak. However, I will add explicit cancellation for clarity.

pkg/api/handlers/utils/containers.go (outdated, resolved)
pkg/api/handlers/utils/containers.go (outdated, resolved)
pkg/api/handlers/utils/containers.go (resolved)
pkg/api/handlers/utils/containers.go (resolved)
@matejvasek
Contributor Author

matejvasek commented May 10, 2021

@mheon I tried to compile podman without systemd support and this is not working. Any idea why?

Edit: my bad, it works.

@matejvasek matejvasek force-pushed the fix-wait-next-exit branch 2 times, most recently from ebd088d to e26ba89 Compare May 10, 2021 10:24
pkg/api/handlers/utils/containers.go (outdated, resolved)
pkg/api/handlers/utils/containers.go (outdated, resolved)
@matejvasek matejvasek force-pushed the fix-wait-next-exit branch 2 times, most recently from 72f22b6 to 2141918 Compare May 10, 2021 11:11
Use the event API to detect changes to a container instead of polling.
Polling was unreliable: a state change could sometimes be
missed.

Signed-off-by: Matej Vasek <[email protected]>
@matejvasek matejvasek force-pushed the fix-wait-next-exit branch from 2141918 to 66e38ca Compare May 10, 2021 11:40
Member

@vrothberg vrothberg left a comment

LGTM, thank you!

@rhatdan
Member

rhatdan commented May 10, 2021

/approve
/lgtm
/hold

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 10, 2021
@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label May 10, 2021
@TomSweeneyRedHat
Member

fedora-33 root test is failing and I don't think it's a flake.

[+0650s] not ok 153 podman pod create - hashtag AllTheOptions
[+0650s] # (from function `die' in file test/system/helpers.bash, line 412,
[+0650s] #  from function `run_podman' in file test/system/helpers.bash, line 220,
[+0650s] #  in test file test/system/200-pod.bats, line 208)
[+0650s] #   `run_podman build -t $infra_image - << EOF' failed with status 125
[+0650s] # # /var/tmp/go/src/github.com/containers/podman/bin/podman rm --all --force
[+0650s] # # /var/tmp/go/src/github.com/containers/podman/bin/podman ps --all --external --format {{.ID}} {{.Names}}
[+0650s] # # /var/tmp/go/src/github.com/containers/podman/bin/podman images --all --format {{.Repository}}:{{.Tag}} {{.ID}}
[+0650s] # quay.io/libpod/testimage:20210427 aadc32e2a626
[+0650s] # # /var/tmp/go/src/github.com/containers/podman/bin/podman build -t infra_ho2waiwlxz -
[+0650s] # STEP 1: FROM quay.io/libpod/testimage:20210427
[+0650s] # STEP 2: RUN ln /home/podman/pause /pause_2Izc4dQiEZ
[+0650s] # open state file /proc/122860/stat: No such process
[+0650s] # open pidfd: No such process
[+0650s] # error running container: error reading container state from /usr/bin/crun (got output: ""): exit status 1
[+0650s] # Error: error building at STEP "RUN ln /home/podman/pause /pause_2Izc4dQiEZ": error while running runtime: exit status 1
[+0650s] # [ rc=125 (** EXPECTED 0 **) ]
[+0650s] # #/vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
[+0650s] # #| FAIL: exit code is 125; expected 0
[+0650s] # #\^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[+0650s] # # /var/tmp/go/src/github.com/containers/podman/bin/podman pod rm -f -a
[+0650s] # # /var/tmp/go/src/github.com/containers/podman/bin/podman rm -f -a
[+0650s] # # /var/tmp/go/src/github.com/containers/podman/bin/podman image list --format {{.ID}} {{.Repository}}
[+0650s] # aadc32e2a626 quay.io/libpod/testimage
[+0650s] # # [teardown]
[+0650s] # # /var/tmp/go/src/github.com/containers/podman/bin/podman pod rm --all --force
[+0650s] # # /var/tmp/go/src/github.com/containers/podman/bin/podman rm --all --force

@edsantiago
Member

@TomSweeneyRedHat yes, "open state file" is a flake, we hope it's fixed in containers/crun#661 but I assume it will take a VM-rebuild dance to get it fixed in CI.

@rhatdan
Member

rhatdan commented May 10, 2021

/hold cancel

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 10, 2021
@openshift-merge-robot openshift-merge-robot merged commit 57b6425 into containers:master May 10, 2021
@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 23, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 23, 2023
Development

Successfully merging this pull request may close these issues.

Improve Docker APIv2 container wait reliability
8 participants