Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Policy|Plugin] Provide a Container Runtime Abstraction #1873

Closed

Conversation

TurboTurtle
Copy link
Member

Adds a new ContainerRuntime abstraction class that should streamline the use/probing of a container runtime by plugins.

Plugins now have access to new methods that allow them to query for the presence of containers, images, and volumes without needing to manually specify different runtime commands to run through.

The OpenStack plugins that were doing these checks have been updated to now use the loaded ContainerRuntime initialized with Policy, as have the docker and podman plugins to get their element lists.


Please place an 'X' inside each '[]' to confirm you adhere to our Contributor Guidelines

  • Is the commit message split over multiple lines and hard-wrapped at 72 characters?
  • Is the subject and message clear and concise?
  • Does the subject start with [plugin_name] if submitting a plugin patch or a [section_name] if part of the core sosreport code?
  • Does the commit contain a Signed-off-by: First Lastname [email protected]?
  • If this commit closes an existing issue, is the line Closes: #ISSUENUMBER included in an independent line?
  • If this commit resolves an existing pull request, is the line Resolves: #PRNUMBER included in an independent line?

@lgtm-com
Copy link

lgtm-com bot commented Dec 3, 2019

This pull request introduces 1 alert when merging 9ad3d60 into 7eb369e - view on LGTM.com

new alerts:

  • 1 for __init__ method calls overridden method

TurboTurtle added a commit to TurboTurtle/sos that referenced this pull request Dec 3, 2019
Updates the plugin to use the new `ContainerRuntime` abstraction
provided by the loaded `Policy` rather than shelling out again to get
the same data loaded during `Policy` initialization.

Resolves: sosreport#1873

Signed-off-by: Jake Hunsaker <[email protected]>
@TurboTurtle TurboTurtle force-pushed the container-runtime-policy branch from 9ad3d60 to 39d9754 Compare December 3, 2019 19:14
@TurboTurtle
Copy link
Member Author

Re-pushed to address the LGTM alert.

This patchset will need some better-than-normal testing across different environments before it is merge ready, since it touches a heck of a lot especially for layered technology deployments.

@danalsan would you be able to give this a check from an OVN perspective?

@battlemidget @BryanQuigley I'm fairly certain this change as written would actually break a decent amount of docker plugin collection on Ubuntu/Debian hosts (mainly inspect output - config collection should still be fine) mainly because the check within DockerContainerRuntime relies on an InitSystem within the Policy and those distros haven't hooked up to that internally yet.

I think the best solution there is to have those policies define an InitSystem, thoughts?

TurboTurtle added a commit to TurboTurtle/sos that referenced this pull request Dec 3, 2019
Updates the plugin to use the new `ContainerRuntime` abstraction
provided by the loaded `Policy` rather than shelling out again to get
the same data loaded during `Policy` initialization.

Resolves: sosreport#1873

Signed-off-by: Jake Hunsaker <[email protected]>
@TurboTurtle TurboTurtle force-pushed the container-runtime-policy branch from 39d9754 to 62b2282 Compare December 3, 2019 20:46
@lgtm-com
Copy link

lgtm-com bot commented Dec 3, 2019

This pull request introduces 1 alert when merging 62b2282 into 7eb369e - view on LGTM.com

new alerts:

  • 1 for __init__ method calls overridden method

Copy link
Contributor

@danalsan danalsan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Jake, this looks great! I have no objections to the ovn_central plugin changes.

sos/plugins/__init__.py Outdated Show resolved Hide resolved
sos/plugins/__init__.py Outdated Show resolved Hide resolved
sos/plugins/__init__.py Outdated Show resolved Hide resolved
TurboTurtle added a commit to TurboTurtle/sos that referenced this pull request Dec 10, 2019
Updates the plugin to use the new `ContainerRuntime` abstraction
provided by the loaded `Policy` rather than shelling out again to get
the same data loaded during `Policy` initialization.

Resolves: sosreport#1873

Signed-off-by: Jake Hunsaker <[email protected]>
@TurboTurtle TurboTurtle force-pushed the container-runtime-policy branch from 62b2282 to ffccf07 Compare December 10, 2019 20:31
TurboTurtle added a commit to TurboTurtle/sos that referenced this pull request Dec 10, 2019
Updates the plugin to use the new `ContainerRuntime` abstraction
provided by the loaded `Policy` rather than shelling out again to get
the same data loaded during `Policy` initialization.

Resolves: sosreport#1873

Signed-off-by: Jake Hunsaker <[email protected]>
@TurboTurtle TurboTurtle force-pushed the container-runtime-policy branch from ffccf07 to 02353f2 Compare December 10, 2019 20:35
TurboTurtle added a commit to TurboTurtle/sos that referenced this pull request Dec 10, 2019
Updates the plugin to use the new `ContainerRuntime` abstraction
provided by the loaded `Policy` rather than shelling out again to get
the same data loaded during `Policy` initialization.

Resolves: sosreport#1873

Signed-off-by: Jake Hunsaker <[email protected]>
@TurboTurtle TurboTurtle force-pushed the container-runtime-policy branch from 02353f2 to 756c9fe Compare December 10, 2019 20:39
TurboTurtle added a commit to TurboTurtle/sos that referenced this pull request Dec 10, 2019
Updates the plugin to use the new `ContainerRuntime` abstraction
provided by the loaded `Policy` rather than shelling out again to get
the same data loaded during `Policy` initialization.

Resolves: sosreport#1873

Signed-off-by: Jake Hunsaker <[email protected]>
@TurboTurtle TurboTurtle force-pushed the container-runtime-policy branch from 756c9fe to f9b91b2 Compare December 10, 2019 20:44
@TurboTurtle
Copy link
Member Author

Updated to address @pmoravec 's points. Also, now there is support for multiple runtimes on a single system. This should almost never be the case, but it could feasibly be configured and we'd still want to collect data from both. Since the podman and docker plugins independently specify their own runtimes for use, this should be unobtrusive to everything.

@danalsan were you able to test the ovn_central plugin or was that an eyes-only check? I don't have anything on my end with OVN deployed, so it'd be really nice to have some "real" testing on that front.

else:
self._log_info("Cannot run cmd '%s' in container %s: no such "
"container is running." % (cmd, container))

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall not we call return _default here? (well, current behaviour will call docker/podman cmd on non-existing container that works "well" also).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm.. it might be valuable to let execute that command and get something like:

Error: error getting image "3848ec06ffb6": unable to find a name and tag match for 3848ec06ffb6 in repotags: no such image

No preferences from my side here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, fair point on executing it anyway and collecting the failure. I'll think this over a bit more.

@danalsan
Copy link
Contributor

@danalsan were you able to test the ovn_central plugin or was that an eyes-only check? I don't have anything on my end with OVN deployed, so it'd be really nice to have some "real" testing on that front.

It's failing for me on RHEL 8.1. Timing out on rabbit and podman plugins so perhaps we need to extend this commit 0c9a1f0 to other plugins... I didn't catch this earlier when we tested it with @bmr-cymru

[root@controller-0 ~]# ps -aef|grep timeout
root 864461 849771 0 09:15 pts/0 00:00:00 timeout 300s docker exec -t rabbitmq-bundle-podman-0 rabbitmqctl report
root 865838 849771 0 09:15 pts/0 00:00:00 timeout 300s podman inspect 92c3c0e2867e

Plugin podman timed out
Plugin rabbitmq timed out

Podman version:
podman-1.4.2-5.module+el8.1.0+4240+893c1ab8.x86_64

This seems unrelated to this particular PR but looks like there's another issue and ovn_central is not detected as enabled but it does in the same environment using master branch.

Hope it helps,
Daniel

PS. I won't be around in the coming days so apologies for the delays

@pmoravec
Copy link
Contributor

From my code review and testing on podman and docker, I can ACK it.

About the Dan's comment:
It makes sense to extend this PR to rabbitmq plugin as well, since it calls docker commands and work with containers like other updated plugins.

For the plugins timeouts:

  • that is a separate issue (podman bug) that we can workaround by adding timeout=0 to each(?) podman call in relevant plugins - could you @danalsan raise an issue/PR for it (or state what particular cmds needs such change)?
  • why rabbitmq timeouts, as it does not call podman command? By hanging command timeout 300s docker exec -t rabbitmq-bundle-podman-0 rabbitmqctl report ? So are docker calls affected by the same problem like https://bugzilla.redhat.com/show_bug.cgi?id=1732525 ?
  • do I interpret the ps output that both docker and podman are deployed on that system? My understanding from this discussion is this is not expected (if it is expected, current PR wont detect both, as ContainerRuntime sets binary to either of docker or podman but cant call both).

@TurboTurtle
Copy link
Member Author

If both are present, then the policy-default one will be used whenever a specific runtime isn't specified. For RHEL systems, this default is podman.

I'm hesitant to remove timeout from all podman commands, even as a workaround to a bug within podman - and especially if we see the same behavior from docker commands, which shouldn't be subject to the same underlying root cause (i.e. if docker is also failing, then we most likely have something weird going on on the host, in my opinion).

@danalsan
Copy link
Contributor

From my code review and testing on podman and docker, I can ACK it.

About the Dan's comment:
It makes sense to extend this PR to rabbitmq plugin as well, since it calls docker commands and work with containers like other updated plugins.

For the plugins timeouts:

  • that is a separate issue (podman bug) that we can workaround by adding timeout=0 to each(?) podman call in relevant plugins - could you @danalsan raise an issue/PR for it (or state what particular cmds needs such change)?

So far the only ones I mentioned is the ones I'm aware of. Until EOY I won't have time to handle this time but I'll try my best to get some time on it after that.

  • why rabbitmq timeouts, as it does not call podman command? By hanging command timeout 300s docker exec -t rabbitmq-bundle-podman-0 rabbitmqctl report ? So are docker calls affected by the same problem like https://bugzilla.redhat.com/show_bug.cgi?id=1732525 ?

'docker' is an alias for podman on OpenStack deployments, this is why you see docker in the commandline and it's confusing as it's really running podman. Sorry I should've clarified this.

# yum whatprovides docker
podman-docker-1.0.0-2.git921f98f.module+el8.0.0+2958+4e823551.noarch : Emulate Docker CLI using podman
  • do I interpret the ps output that both docker and podman are deployed on that system? My understanding from this discussion is this is not expected (if it is expected, current PR wont detect both, as ContainerRuntime sets binary to either of docker or podman but cant call both).
    ditto

TurboTurtle added a commit to TurboTurtle/sos that referenced this pull request Dec 17, 2019
Updates the plugin to use the new `ContainerRuntime` abstraction
provided by the loaded `Policy` rather than shelling out again to get
the same data loaded during `Policy` initialization.

Resolves: sosreport#1873

Signed-off-by: Jake Hunsaker <[email protected]>
@TurboTurtle TurboTurtle force-pushed the container-runtime-policy branch from f9b91b2 to 11a5ea9 Compare December 17, 2019 17:50
TurboTurtle added a commit to TurboTurtle/sos that referenced this pull request Dec 17, 2019
Updates the plugin to use the new `ContainerRuntime` abstraction
provided by the loaded `Policy` rather than shelling out again to get
the same data loaded during `Policy` initialization.

Resolves: sosreport#1873

Signed-off-by: Jake Hunsaker <[email protected]>
@TurboTurtle TurboTurtle force-pushed the container-runtime-policy branch from 11a5ea9 to 9379f79 Compare December 17, 2019 18:28
@TurboTurtle
Copy link
Member Author

I had pushed an update to this patchset that added changes to rabbitmq which dropped the independent logs collection. After thinking this over a bit more, I don't think that is the right move - so I pushed a second update which re-added that, via an added Plugin.get_container_logs() method.

Also, further fixed up the docstring messages to actually make sense this time.

@pmoravec
Copy link
Contributor

'docker' is an alias for podman on OpenStack deployments, this is why you see docker in the commandline and it's confusing as it's really running podman. Sorry I should've clarified this.

Can't this be an issue for the https://github.com/sosreport/sos/pull/1873/files#diff-c281e660b4e6e9273cdbb534ba04dc36R973 code where we decide the container runtime implementation? But till this PR works well on OpenStack, I am fine with this.

@danalsan
Copy link
Contributor

'docker' is an alias for podman on OpenStack deployments, this is why you see docker in the commandline and it's confusing as it's really running podman. Sorry I should've clarified this.

Can't this be an issue for the https://github.com/sosreport/sos/pull/1873/files#diff-c281e660b4e6e9273cdbb534ba04dc36R973 code where we decide the container runtime implementation? But till this PR works well on OpenStack, I am fine with this.

Right, this is why I flipped the order on the ovn_central plugin and checked for 'podman' first at
https://github.com/sosreport/sos/pull/1873/files#diff-7e5b784f4c6844be676647d273762167L66

Copy link
Contributor

@pmoravec pmoravec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK from me. As it is a bigger change, I would prefer an ack from yet another approver.

@TurboTurtle TurboTurtle added this to the 4.0 milestone Jan 28, 2020
bmr-cymru pushed a commit that referenced this pull request Feb 14, 2020
Updates the plugin to use the new `ContainerRuntime` abstraction
provided by the loaded `Policy` rather than shelling out again to get
the same data loaded during `Policy` initialization.

Resolves: #1873

Signed-off-by: Jake Hunsaker <[email protected]>
@bmr-cymru
Copy link
Member

I think this is close and it passes basic smoke testing for me (rebased on current master), but I think it's too big a change to take at the last minute before 3.9 - we can finish review once the release is out and distributions that want to take on the functionality now can take a patch from master once it's merged.

@TurboTurtle TurboTurtle force-pushed the container-runtime-policy branch from 9379f79 to f2be094 Compare February 17, 2020 19:24
TurboTurtle added a commit to TurboTurtle/sos that referenced this pull request Feb 17, 2020
Updates the plugin to use the new `ContainerRuntime` abstraction
provided by the loaded `Policy` rather than shelling out again to get
the same data loaded during `Policy` initialization.

Resolves: sosreport#1873

Signed-off-by: Jake Hunsaker <[email protected]>
@TurboTurtle
Copy link
Member Author

Latest update is rebasing on current master following the 3.9 tag.

Adds a `ContainerRuntime()` class to allow policies to specify a
container runtime to allow plugins to utilize.

The `ContainerRuntime` is intended to allow for the discovery of
containers, including specific ones by name, and for the execution of
commands inside those containers.

This is meant to remove the overhead of manually defining ways to
determine an active runtime and if a component is containerized within
plugins.

Related: sosreport#1866

Signed-off-by: Jake Hunsaker <[email protected]>
Integrates the ContainerRuntime checks to the Plugin API -
`Plugin.exec_cmd()` now accepts a `container` argument that, if set,
will first check to see if the container exists and if it does, format
the requested command to be executed by the runtime in the selected
container. If the runtime is not available, or the container does not
exist, the command is not run.

Additionally adds a `container_exists()` call to check to see if a
container is present/running, and a `fmt_container_cmd()` call that
wraps the `ContainerRuntime` method to allow for easy formatting of
commands for `add_cmd_output()` calls.

Closes: sosreport#1866

Signed-off-by: Jake Hunsaker <[email protected]>
Removes the duplicated container runtime checks from individual plugins,
and updates them to use the new methods exposed in `Plugin`, thus
reducing the number of times sos calls out to the runtime.

Signed-off-by: Jake Hunsaker <[email protected]>
Updates `ovn_central` to use the new ContainerRuntime methods exposed to
the `Plugin` class.

Rather than manually checking for runtimes, and then manually building
the exec commands, the plugin now relies on the `Plugin` methods for
discovering containers and formatting commands to be run in a container,
if applicable.

Signed-off-by: Jake Hunsaker <[email protected]>
Updates the `rabbitmq` plugin to use the new ContainerRuntime
abstraction.

Signed-off-by: Jake Hunsaker <[email protected]>
Updates the plugin to use the new `ContainerRuntime` abstraction
provided by the loaded `Policy` rather than shelling out again to get
the same data loaded during `Policy` initialization.

Signed-off-by: Jake Hunsaker <[email protected]>
Updates the plugin to use the new `ContainerRuntime` abstraction
provided by the loaded `Policy` rather than shelling out again to get
the same data loaded during `Policy` initialization.

Resolves: sosreport#1873

Signed-off-by: Jake Hunsaker <[email protected]>
@TurboTurtle TurboTurtle force-pushed the container-runtime-policy branch from f2be094 to d191a98 Compare February 17, 2020 20:14
TurboTurtle added a commit to TurboTurtle/sos that referenced this pull request Mar 27, 2020
Updates the plugin to use the new `ContainerRuntime` abstraction
provided by the loaded `Policy` rather than shelling out again to get
the same data loaded during `Policy` initialization.

Resolves: sosreport#1873

Signed-off-by: Jake Hunsaker <[email protected]>
Copy link
Contributor

@BryanQuigley BryanQuigley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be mostly a no-op for Ubuntu, noted tasks in #1996

TurboTurtle added a commit to TurboTurtle/sos that referenced this pull request Apr 6, 2020
Updates the plugin to use the new `ContainerRuntime` abstraction
provided by the loaded `Policy` rather than shelling out again to get
the same data loaded during `Policy` initialization.

Resolves: sosreport#1873

Signed-off-by: Jake Hunsaker <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants