Spike: Expose podman service outside the VM #874

Closed · zeenix opened this issue Dec 9, 2019 · 28 comments

Labels: kind/spike (Investigation to provide direction and workable tasks), os/macos, points/3, priority/major

@zeenix (Contributor) commented Dec 9, 2019

The use case here is to enable the use of podman CLI on the host to manage containers inside the CRC VM.

@gbraad (Contributor) commented Jan 13, 2020

This task gets priority over other work that was assigned. I'll contact you later about this ...

@gbraad (Contributor) commented Jan 13, 2020

Can you detail some of the findings so we can create an actual task to work on?

What is the result of the spike?

@zeenix (Contributor, Author) commented Jan 13, 2020

@gbraad IIRC I never managed to actually do anything before I left for PTO.

@gbraad (Contributor) commented Jan 14, 2020

In that case, can you at least detail what you believe needs to be done, and discuss with @praveenkumar his previous effort regarding this?

@gbraad (Contributor) commented Jan 16, 2020

From today till Monday, @zeenix will spend time on identifying the missing pieces and a basic method to start podman for remote access.

The image might need additional packages installed. If so, we can make this part of snc.

For the client we will deliver podman-remote to allow access from the client to the host machine (the VM). We can re-use the information of the needed SSH connection, as done for our deployment interaction.

Note, we are looking into basic functionality here. Improvements are part of future work (this is just a spike to identify needed work and options).

@rhatdan commented Jan 16, 2020

The only thing necessary is to enable these sockets for socket activation:

/usr/lib/systemd/system/io.podman.service
/usr/lib/systemd/system/io.podman.socket
/usr/lib/systemd/user/io.podman.service
/usr/lib/systemd/user/io.podman.socket

That then allows access from a remote podman over SSH connections.
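
For reference, enabling the system-level socket inside the VM would look something like this (a sketch, using the unit names listed above):

$ sudo systemctl enable --now io.podman.socket
$ systemctl status io.podman.socket   # should show the socket as active (listening)
$ ss -xl | grep io.podman             # the varlink unix socket should be listed here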

@rhatdan commented Jan 16, 2020

You should be able to install podman on a Mac via brew and have it talk to the podman on the server.

@zeenix (Contributor, Author) commented Jan 17, 2020

Preliminary findings

As promised, I looked into this. I mainly/first followed this guide to set up podman with varlink in the CRC VM. The first thing I found was that, while libvarlink is part of the RHCOS image, the varlink binary (part of the libvarlink-util rpm) is not. So for now, I just copied it over from my Fedora 31 host to /tmp in the VM. The rest was mostly easy copy & paste. One thing I did differently was to reuse the core user and group, instead of the special podman ones the guide recommends.
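
For reference, a minimal sketch of that last tweak: a systemd drop-in (hypothetical path) for the io.podman.socket unit from the guide, pointing the socket at core instead of a dedicated podman user/group:

# /etc/systemd/system/io.podman.socket.d/override.conf (hypothetical drop-in)
[Socket]
SocketMode=0660
SocketUser=core
SocketGroup=core

followed by sudo systemctl daemon-reload and sudo systemctl restart io.podman.socket.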

After the setup, I was able to communicate from the host with podman in the CRC VM through the Python API:

$ python -m varlink.cli --bridge "ssh 192.168.130.11" call io.podman.ListContainers
{
  "containers": [
    {
      "command": [
        "/sbin/init"
      ],
      "containerrunning": true,
      "createdat": "2020-01-17T11:08:03Z",
      "id": "3d9d9e87f5069075c28e437f4829c8e212b5b252d0af3406a1104f4bb25e3116",
      "image": "quay.io/crcont/dnsmasq:latest",
      "imageid": "851bb0e5bf751cba2d649612a47651890a86eafe629308e1b3273c16b71b047e",
      "labels": {
        "org.label-schema.build-date": "20190305",
        "org.label-schema.license": "GPLv2",
        "org.label-schema.name": "CentOS Base Image",
        "org.label-schema.schema-version": "1.0",
        "org.label-schema.vendor": "CentOS"
      },
      "mounts": [
      # ...
$ python -m varlink.cli --bridge "ssh 192.168.130.11" call io.podman.GetInfo
{
  "info": {
    "host": {
      "arch": "amd64",
      "buildah_version": "1.12.0-dev",
      "cpus": 4,
      "distribution": {
        "distribution": "\"rhcos\"",
        "version": "4.3"
      },
      "eventlogger": "journald",
      "hostname": "crc-m2n9t-master-0",
      "kernel": "4.18.0-147.3.1.el8_1.x86_64",
      "mem_free": 254738432,
      "mem_total": 7966154752,
      "os": "linux",
      "swap_free": 0,
      "swap_total": 0,
      "uptime": "1h 9m 48.38s (Approximately 0.04 days)"
    },
    "insecure_registries": null,
    "podman": {
      "compiler": "gc",
      "git_commit": "",
      "go_version": "go1.13.4",
      "podman_version": "1.6.4"
    },
    "registries": null,
    "store": {
      "containers": 134,
      "graph_driver_name": "overlay",
      "graph_driver_options": "map[]",
      "graph_root": "/var/lib/containers/storage",
      "graph_status": {
        "backing_filesystem": "xfs",
        "native_overlay_diff": "true",
        "supports_d_type": "true"
      },
      "images": 65,
      "run_root": "/var/run/containers/storage"
    }
  }
}

Some commands don't seem to work for some reason:

$ python -m varlink.cli --bridge "ssh 192.168.130.11" call io.podman.Ping {}
{'parameters': {'method': 'Ping'}, 'error': 'org.varlink.service.MethodNotFound'}
$ python -m varlink.cli --bridge "ssh 192.168.130.11" call io.podman.ListImages
Connection closed

ListImages does seem to work when called locally through varlink, though:

$ /tmp/varlink call unix:/run/podman/io.podman/io.podman.ListImages|head
{
  "images": [
    {
      "containers": 0,
      "created": "2020-01-07T23:20:01Z",
      "digest": "sha256:3bada34ebed01542891c576954844afa164f087b6d8081e23f6f1724600b1f2e",
      "id": "076c2c01b0e2d22e31c9ba50b07765773a9cc211b060003cba473afe64d65f89",
      "isParent": false,
      "labels": {
        "com.coreos.ostree-commit": "2497f5d4993087b8c879e0e4faab0bfba6bc0cac131af350d0654b34a7dfcfd9",

Speaking of things not working, I didn't manage to get podman-remote working with SSH directly (not sure if it's even supposed to work):

/bin/podman-remote container list --remote-host 192.168.130.11
Cannot execute command-line and remote command.
Error: unexpected EOF

but it works fine through TCP port forwarding as described here:

$ PODMAN_VARLINK_ADDRESS="tcp:127.0.0.1:1234" /bin/podman-remote container list
CONTAINER ID  IMAGE                          COMMAND     CREATED      STATUS          PORTS               NAMES
3d9d9e87f506  quay.io/crcont/dnsmasq:latest  /sbin/init  2 hours ago  Up 2 hours ago  0.0.0.0:53->53/udp  dnsmasq
$ PODMAN_VARLINK_ADDRESS="tcp:127.0.0.1:1234" /bin/podman-remote top 3d9d9e87f506
USER   PID   PPID   %CPU    ELAPSED              TTY   TIME   COMMAND
root   1     0      0.000   2h21m50.509231061s   ?     0s     /sbin/init 
root   18    1      0.047   2h21m50.509749422s   ?     4s     /usr/lib/systemd/systemd-journald 
root   30    1      0.000   2h21m50.509987637s   ?     0s     /usr/lib/systemd/systemd-udevd 
dbus   125   1      0.000   2h21m50.510185375s   ?     0s     /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation 
root   233   1      0.012   2h21m49.510382194s   ?     1s     /usr/sbin/dnsmasq -k 
root   234   1      0.000   2h21m49.510591832s   ?     0s     /usr/lib/systemd/systemd-logind 
root   235   1      0.000   2h21m49.510794531s   ?     0s     /sbin/agetty --noclear tty1 linux 
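
For reference, the forwarding used above can be set up with something like the following (a sketch; the local port matches the session above and the socket path is the default from the guide — OpenSSH can forward a local TCP port to a remote unix socket):

$ ssh -N -i ~/.crc/machines/crc/id_rsa -L 127.0.0.1:1234:/run/podman/io.podman [email protected]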

@rhatdan commented Jan 17, 2020

@jwhonce @bbaude @mheon PTAL

@gbraad (Contributor) commented Jan 18, 2020

Last Friday @praveenkumar and I spoke about the possible strategies to expose this to the user as part of the user story. We identified the following 3 possibilities:

  1. Use a lean VM approach by re-using the current RHCOS and not starting the kubelet service. This means we can re-use the current setup (prepared during the build phase), but we might have to handle allowing the OpenShift cluster to be started afterwards. This enables a scenario in which we do not need to share the resources between podman and OpenShift initially. The problem, however, is handling that start situation, as it impacts our current flow of interaction. While not impossible, it might look like:
$ crc start --podman
Starting CRC VM
Podman access available
$ crc status
CRC VM started
OpenShift cluster not available
Podman access available
$ crc start  # default --openshift is assumed
CRC VM started
Starting OpenShift cluster.
$ crc status
CRC VM started
OpenShift cluster started
...
Podman access available
  2. Start the full VM and allow access to podman after specifically requesting this. This simplifies the start process and avoids existence checks, etc. But in this situation the VM will consume resources to maintain both the OpenShift cluster and podman, which in a podman-only situation might waste a significant amount of memory:
$ crc start
CRC VM starting
OpenShift cluster started
$ crc podman-env
# CRC VM started, so we only need to expose access
Podman access available
$ crc status
OpenShift cluster started 
...
Podman access available
  3. A dedicated VM. While a leaner approach, this introduces the additional complexity of maintaining another VM, allowing VMs to co-exist, etc. While not impossible (our codebase can handle it), it does introduce a resource overhead when both are used, but it solves separation, etc.:
$ crc start --podman
Podman access available
$ crc start --openshift    # default
OpenShift started
$ crc status
Podman VM started
OpenShift VM started
...

At the moment we have decided to go for option 2, and to work towards the situation described in option 1 over time.

@code-ready/crc-devel PTAL

@gbraad (Contributor) commented Jan 18, 2020

Preliminary findings
#874 (comment)

I think we can conclude the spike. Thanks. Let's discuss this on Monday and decide about the follow-up tasks.

@rhatdan commented Jan 19, 2020

I like option 2 as well. It would be nice if all of the services were socket-activated via systemd.

@zeenix (Contributor, Author) commented Jan 20, 2020

I also like option 2, and as @rhatdan says, we don't need to run the service needlessly but can instead have it socket-activated.

@gbraad (Contributor) commented Jan 20, 2020

However, we need to make sure that RHEL8 with podman and RHCOS + podman work identically. Let's take a baseline with RHEL8 to make sure what we see is not a 'known issue'. Note, we do use a userspace library/package of a slightly different version; our RHCOS is fixed/pinned to the OpenShift release.

Do note that option 2 involves a large(r) memory footprint and slower startup time, as OpenShift consumes these resources and time.

@zeenix (Contributor, Author) commented Jan 20, 2020

Do note that option 2 involves a large(r) memory footprint and slower startup time, as OpenShift consumes these resources and time.

I think we need to keep it clear that we're talking about the "crc as only a podman installer" case here (i.e. users downloading/installing CRC just to try out podman). In the case of "podman as a side-feature of CRC", the resource usage is not much of an issue, as creating/running an OpenShift cluster is the primary goal of CRC.

@praveenkumar (Member) commented:

So with 4.3 I can see that podman 1.6.4 and libvarlink are installed, but we also need libvarlink-utils, which is not part of the default RHCOS and needs to be installed as part of disk creation, like we are doing for Hyper-V. This package shouldn't differ much from the RHEL8 side.

Below is what I followed to be able to make a connection with remote podman on macOS Catalina.

  • Installed libvarlink-util-18-3.el8 in the CRC VM and restarted the instance.
  • sudo systemctl start io.podman.socket inside the CRC VM.
  • brew cask install podman on the Mac.

Below is the podman-remote config on the Mac. Note that identity_file is not honoured by podman, so I had to manually add the host's public key to the VM's authorized_keys file.

$ cat ~/.config/containers/podman-remote.conf 
[connections]
    [connections.host1]
    destination = "192.168.64.92"
    username = "core"
    default = true
    ignore_hosts = true
    identity_file = "/Users/prkumar/.crc/machines/crc/id_rsa"
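
For reference, the manual authorized_keys step could look something like this (a sketch; since identity_file is ignored, podman presumably offers the default ~/.ssh/id_rsa, so that is the public key being appended, while the working crc key handles the copy itself):

$ cat ~/.ssh/id_rsa.pub | ssh -i ~/.crc/machines/crc/id_rsa [email protected] 'cat >> ~/.ssh/authorized_keys'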

$ podman pull busybox
Trying to pull registry.access.redhat.com/busybox...
  name unknown: Repo not found
Trying to pull docker.io/library/busybox...
Getting image source signatures
Copying blob sha256:bdbbaa22dec6b7fe23106d2c1b1f43d9598cd8fc33706cc27c1d938ecd5bffc7
Copying config sha256:6d5fcfe5ff170471fcc3c8b47631d6d71202a1fd44cf3c147e50c8de21cf0648
Storing signatures
6d5fcfe5ff170471fcc3c8b47631d6d71202a1fd44cf3c147e50c8de21cf0648

$ podman images
REPOSITORY                  TAG      IMAGE ID       CREATED       SIZE
docker.io/library/busybox   latest   6d5fcfe5ff17   3 weeks ago   1.44 MB

$ podman run --rm -it busybox /bin/sh
/ # ls
ls
bin   dev   etc   home  proc  root  run   sys   tmp   usr   var
/ # exit

@gbraad (Contributor) commented Jan 21, 2020

Changes to include the varlink-utils package
crc-org/snc#144
crc-org/snc#145

@gbraad (Contributor) commented Jan 21, 2020

@zeenix any feedback on the snc changes that include needed varlink packages?

@zeenix (Contributor, Author) commented Jan 21, 2020

@gbraad the bundle @praveenkumar created against it works for me and Praveen. On Fedora, we both get the same error on podman-remote images, but other functionality (e.g. podman-remote info and podman-remote pull) seems to work just fine and out of the box.

@gbraad (Contributor) commented Jan 23, 2020

Good to hear. So the findings are consistent. Do they however also occur when using podman-remote against a configured RHEL8 host?

@zeenix (Contributor, Author) commented Jan 23, 2020

Just checked, and it seems we can use environment variables instead of a configuration file:

$ rm ~/.config/containers/podman-remote.conf
$ podman-remote info
Error: could not get runtime: dial unix /run/podman/io.podman: connect: permission denied
$ PODMAN_USER=core PODMAN_HOST=192.168.130.11 PODMAN_IDENTITY_FILE=/home/zeenix/.crc/machines/crc/id_rsa PODMAN_IGNORE_HOSTS=1 podman-remote info
client:
  Connection: ssh -p 22 -T -i /home/zeenix/.crc/machines/crc/id_rsa -q -o StrictHostKeyChecking=no
    -o UserKnownHostsFile=/dev/null [email protected] -- varlink -A \'podman --log-level=error
    varlink \\\$VARLINK_ADDRESS\' bridge
  Connection Type: BridgeConnection
  OS Arch: linux/amd64
  Podman Version: 1.7.0
  RemoteAPI Version: 1
host:
  arch: amd64
  buildah_version: 1.12.0-dev
  cpus: 4
  distribution:
    distribution: '"rhcos"'
    version: "4.3"
  eventlogger: journald
  hostname: crc-zc45h-master-0
...

@zeenix (Contributor, Author) commented Jan 23, 2020

Good to hear. So the findings are consistent. Do they however also occur when using podman-remote against a configured RHEL8 host?

I bumped into a few hurdles installing a RHEL8 VM to test, but I can do that now if needed.

@gbraad (Contributor) commented Jan 24, 2020

we can use environment variables instead of a configuration file

This is easier for a command like crc podman-env 👍
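
A hypothetical sketch of what such a command could emit, mirroring the environment variables from the session above (the variable names are the ones podman-remote already honours; the eval pattern is modeled on minikube's docker-env):

$ crc podman-env
export PODMAN_USER=core
export PODMAN_HOST=192.168.130.11
export PODMAN_IDENTITY_FILE=$HOME/.crc/machines/crc/id_rsa
export PODMAN_IGNORE_HOSTS=1
# Run this command to configure your shell:
# eval $(crc podman-env)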

@gbraad (Contributor) commented Jan 24, 2020

but I can do that now if needed.

Do a quick test to see whether those earlier reported commands that failed also fail (or not) on this VM. If so, this is something we have to escalate. (It helps us decide which action to take, as it might well be that the podman version in the RHCOS image isn't tested for all use cases; for the installer it only runs the initial etcd/cluster deployment.)

@gbraad (Contributor) commented Jan 27, 2020

Closing as the spike has been concluded. Added #961 for follow-up.

gbraad closed this as completed on Jan 27, 2020
@zeenix (Contributor, Author) commented Jan 27, 2020

Tested against RHEL8. It requires extra steps, as even podman is not installed by default, and you have to enable the subscription etc. before you can install anything. Once set up, I was able to recreate the same experience, except SSH auth was password-based (I failed to quickly enable key-based auth and didn't want to spend more time than I already had).

@zeenix (Contributor, Author) commented Jan 30, 2020

Just for the record, I did all my testing against the 4.3 bundle.

@afbjorklund commented:

Side note: any users who want to explore option 3 (a separate VM) on their own can use podman-machine for that. It is a totally separated environment, for better and worse (mostly used as an alternative to running a local podman).

Similar functionality has been available in minikube for docker-machine for a long time, and some users use it. Recent versions of minikube have now added a matching minikube podman-env command (varlink).
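
For reference, a rough sketch of the podman-machine flow, mirroring its docker-machine-style CLI (command names from memory; exact names and flags may differ):

$ podman-machine create box0        # boot a separate VM with podman inside
$ eval $(podman-machine env box0)   # point the local podman client at that VM
$ podman-machine ssh box0           # or shell into the VM directly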

gbraad added the kind/spike (Investigation to provide direction and workable tasks) label and removed the kind/task (Workable task) label on Jun 30, 2021