Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

k8systemd: run k8s workloads in systemd #14256

Merged
merged 1 commit into from
May 18, 2022

Conversation

vrothberg
Copy link
Member

Support running podman play kube in systemd by exploiting the
previously added "service containers". During play kube, a service
container is started before all the pods and containers, and is stopped
last. The service container communicates its conmon PID via sdnotify.

Add a new systemd template to dispatch such k8s workloads. The argument
of the template is the path to the k8s file. Note that the path must be
escaped for systemd not to bark:

Let's assume we have a top.yaml file in the home directory:

$ escaped=$(systemd-escape ~/top.yaml)
$ systemctl --user start podman-play-kube@$escaped.service

Closes: https://issues.redhat.com/browse/RUN-1287
Signed-off-by: Valentin Rothberg [email protected]

Does this PR introduce a user-facing change?

Add new `[email protected]` systemd template to dispatch K8s YAML files in systemd.
The path to the YAML file must be escaped: `systemctl --user start podman-play-kube@$(systemd-escape $YAML).service`

@mheon @baude @rhatdan PTAL

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 16, 2022
@vrothberg
Copy link
Member Author

@edsantiago, I would love your eyes to have a look at the tests.

@vrothberg vrothberg force-pushed the run-1287 branch 4 times, most recently from 82e8e37 to dbb6ea3 Compare May 16, 2022 11:45
@vrothberg
Copy link
Member Author

Fixed the is-failed check in the tests

Copy link
Member

@edsantiago edsantiago left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests fail; I'll review again once tests pass

test/system/250-systemd.bats Outdated Show resolved Hide resolved
test/system/250-systemd.bats Outdated Show resolved Hide resolved
@vrothberg
Copy link
Member Author

Integration networking tests are very flaky atm.

@edsantiago
Copy link
Member

Please use this: 250-systemd.bats.txt

(too big a diff, easier to just post new file).

I can't actually get it to pass, the podman rm -a fails with:

  $ .../bin/podman rm -a                                                                                                
   4a2a246fd96275bd339ac756be888de9d280e43b7403da261c631f4fc76e75fe                                                                                             
   6978083adf9c0787462493ab0ae1f4c2a9053ab8c532e6db2014dfde29d035c9                                                                                             
   Error: container 05952888d26655a6c6bae3f8429f1dd9b8fbb899739210ccebe758bd60268d29 does not exist in database: no such container                              
   [ rc=1 (** EXPECTED 0 **) ]             

...but I'll leave that for you to look into

pkg/domain/infra/abi/play.go Outdated Show resolved Hide resolved
@vrothberg
Copy link
Member Author

Please use this: 250-systemd.bats.txt

(too big a diff, easier to just post new file).

Absolute great work, @edsantiago. Thank you!

@vrothberg
Copy link
Member Author

@mheon @edsantiago PTanotherL. Thanks for the reviews!

@openshift-ci
Copy link
Contributor

openshift-ci bot commented May 16, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: edsantiago, vrothberg

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [edsantiago,vrothberg]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@edsantiago
Copy link
Member

Tests LGTM. Thanks for solving the "does not exist" issue and for finding/fixing the other instances of hardcoded hash.

@mheon
Copy link
Member

mheon commented May 16, 2022

LGTM

@edsantiago
Copy link
Member

System test failing and looks real:

         # $ podman pod inspect test_pod --format {{.State}}
         # Exited
         # $ podman container inspect b9180307e1fd-service --format {{.State.Running}}
         # true
         # #/vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
         # #|     FAIL: podman container inspect b9180307e1fd-service --format {{.State.Running}}
         # #| expected: 'false'
         # #|   actual: 'true'
         # #\^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

@vrothberg
Copy link
Member Author

System test failing and looks real:

Looks like a race in the tests I didn't anticipate before. Glad it flaked in this PR and not after.

Support running `podman play kube` in systemd by exploiting the
previously added "service containers".  During `play kube`, a service
container is started before all the pods and containers, and is stopped
last.  The service container communicates its conmon PID via sdnotify.

Add a new systemd template to dispatch such k8s workloads.  The argument
of the template is the path to the k8s file.  Note that the path must be
escaped for systemd not to bark:

Let's assume we have a `top.yaml` file in the home directory:
```
$ escaped=$(systemd-escape ~/top.yaml)
$ systemctl --user start podman-play-kube@$escaped.service
```

Closes: https://issues.redhat.com/browse/RUN-1287
Signed-off-by: Valentin Rothberg <[email protected]>
@vrothberg
Copy link
Member Author

System test failing and looks real:

Looks like a race in the tests I didn't anticipate before. Glad it flaked in this PR and not after.

Fixed now o/

@vrothberg
Copy link
Member Author

Ready.

@edsantiago @rhatdan @mheon

for i in {0..5}; do
run_podman container inspect $1 --format "{{.State.Running}}"
if [[ $output == "$2" ]]; then
break
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The best part of functions is that you can simply return. Then the fallthrough can simply be die "Timed out waiting for .State.Running on $1 to be $2.

But it's not worth a repush, this is just something for next time. LGTM.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for explaining! Looks like I need to ramp up my bash skills a bit further.

@rhatdan rhatdan added the kube label May 17, 2022
@edsantiago
Copy link
Member

Tests LGTM; the rest is more than I can approve

@edsantiago edsantiago added kind/feature Categorizes issue or PR as related to a new feature. release-note labels May 18, 2022
@rhatdan
Copy link
Member

rhatdan commented May 18, 2022

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label May 18, 2022
@openshift-merge-robot openshift-merge-robot merged commit 12964c7 into containers:main May 18, 2022
@vrothberg vrothberg deleted the run-1287 branch May 18, 2022 13:52
@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 21, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 21, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. kind/feature Categorizes issue or PR as related to a new feature. kube lgtm Indicates that a PR is ready to be merged. locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. release-note
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants