Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

podman machine writes absolute QEMU path to the VM config #13394

Closed
zowoq opened this issue Mar 1, 2022 · 20 comments · Fixed by #13620
Closed

podman machine writes absolute QEMU path to the VM config #13394

zowoq opened this issue Mar 1, 2022 · 20 comments · Fixed by #13620
Assignees
Labels
In Progress This issue is actively being worked by the assignee, please do not work on this at this time. kind/bug Categorizes issue or PR as related to a bug. locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. machine macos MacOS (OSX) related

Comments

@zowoq
Copy link

zowoq commented Mar 1, 2022

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description

❯ head -n5 ~/.config/containers/podman/machine/qemu/podman-machine-default.json
{
 "CPUs": 1,
 "CmdLine": [
  "/nix/store/w4y0hnhq3amjc1g3wwzahwyl2rdnm8z7-qemu-6.2.0/bin/qemu-system-x86_64",
  "-m",

Using an absolute path is a bit problematic for nixpkgs as the QEMU written to the config might not be the same as the QEMU currently in the users PATH or it may have been removed which then causes an error.

❯ podman machine start
Starting machine "podman-machine-default"
INFO[0000] waiting for clients...
INFO[0000] new connection from  to /tmp/podman/qemu_podman-machine-default.sock
Error: fork/exec /nix/store/w4y0hnhq3amjc1g3wwzahwyl2rdnm8z7-qemu-6.2.0/bin/qemu-system-x86_64: no such file or directory

Output of podman version:

Client:       Podman Engine
Version:      4.0.1
API Version:  4.0.1
Go Version:   go1.17.7

Built:      Tue Jan  1 10:00:00 1980
OS/Arch:    darwin/amd64

Server:       Podman Engine
Version:      4.0.1
API Version:  4.0.1
Go Version:   go1.16.14

Built:      Sat Feb 26 03:05:11 2022
OS/Arch:    linux/amd64
@openshift-ci openshift-ci bot added the kind/bug Categorizes issue or PR as related to a bug. label Mar 1, 2022
@github-actions github-actions bot added the macos MacOS (OSX) related label Mar 1, 2022
@baude
Copy link
Member

baude commented Mar 2, 2022

the path to qemu is discovered using FindExec at the time of init. So this is basically the same as using path at that time. It seems like you want to rediscover the qemu path each time we init, start, and stop?

@zowoq
Copy link
Author

zowoq commented Mar 2, 2022

It seems like you want to rediscover the qemu path each time we init, start, and stop?

I'd prefer that it doesn't record the path to qemu at all and just always use qemu-system-{aarch64,x86_64} in $PATH.

@baude
Copy link
Member

baude commented Mar 3, 2022

@zowoq understood ... and I sincerely sympathize. here is the problem. In the past, we had bug reports that basically pleaded for the hardcoded path because other products had "side-installed" qemu for something else and they prepended that to the path, and that qemu failed to boot vms. in fact, i am considering doing the same but not prepending it to path. I will keep you in mind though and see if we can't make this work for everyone.

@zowoq
Copy link
Author

zowoq commented Mar 3, 2022

I see FindHelperBinary was mentioned on your open PR, we're already using CONTAINERS_HELPER_BINARY_DIR for gvproxy on darwin, using it for the qemu binary could work for us.

From the same PR I see that some other paths are used on aarch64-darwin (getEdk2CodeFd) that I hadn't noticed previously, they won't work with nixpkgs. We don't have a way of getting this path besides looking for it relative to qemu binary. With how nix works it could be installed in one of multiple locations or not even installed at all, only run via a temporary session where it is added to PATH.

@baude
Copy link
Member

baude commented Mar 4, 2022

Yes, indeed. However, to make FindHelperBinary actually work, a change needs to be made in containers-common and be vendored through all of our container projects. I will create the PR for common once I am done getting the darwin mounting stuff working. very close now.

@Luap99 Luap99 added the machine label Mar 6, 2022
@tricktron
Copy link
Contributor

@zowoq I created the following file under /etc/containers/containers.conf.d/99-gvproxy-path.conf

[engine]
helper_binaries_dir = ["/etc/profiles/per-user/<user>/bin"]

Now podman writes the following path into the json: /etc/profiles/per-user/<user>/bin/qemu-system-aarch64.

So you can just enter your nix/nix-darwin/home-manager/nixos path there.

@zowoq
Copy link
Author

zowoq commented Mar 6, 2022

@tricktron I'm trying to get this fixed for all nixpkgs users so they don't need to deal with manual configuration.

@zowoq
Copy link
Author

zowoq commented Mar 8, 2022

Seems CONTAINERS_HELPER_BINARY_DIR isn't respected if it is changed after the vm is created?

> where qemu-system-x86_64
  qemu-system-x86_64 not found
> ./bin/darwin/podman machine rm --force
> CONTAINERS_HELPER_BINARY_DIR=/path/to/qemu_1 ./bin/darwin/podman machine init --now
> ./bin/darwin/podman machine stop
> rm -rf /path/to/qemu_1
> CONTAINERS_HELPER_BINARY_DIR=/path/to/qemu_2 ./bin/darwin/podman machine start

podman machine start should work with qemu_2 but instead it fails trying to use qemu_1?

@Luap99
Copy link
Member

Luap99 commented Mar 21, 2022

@zowoq understood ... and I sincerely sympathize. here is the problem. In the past, we had bug reports that basically pleaded for the hardcoded path because other products had "side-installed" qemu for something else and they prepended that to the path, and that qemu failed to boot vms. in fact, i am considering doing the same but not prepending it to path. I will keep you in mind though and see if we can't make this work for everyone.

How about keeping the hardcoded path but in case the binary is no longer there still fall back to the same detection logic as for machine init.

@zowoq
Copy link
Author

zowoq commented Mar 21, 2022

How about keeping the hardcoded path but in case the binary is no longer there still fall back to the same detection logic as for machine init.

An improvement as it would avoid the breakage from the example in the top post but it is still rather problematic for us. Since I opened this issue the /nix/store path for qemu has changed four or five times but unless the store path have been garbage collected the VM will still use the original qemu. Perhaps a better example would be a qemu version bump: currently the nixpkgs qemu is 6.2.0, if it was updated to 6.2.1 the VM will continue to use 6.2.0 until the store path is garbage collected.

@Luap99 Luap99 self-assigned this Mar 23, 2022
@Luap99 Luap99 added the In Progress This issue is actively being worked by the assignee, please do not work on this at this time. label Mar 23, 2022
@Luap99
Copy link
Member

Luap99 commented Mar 23, 2022

I think this is the best we can get: #13620

Luap99 added a commit to Luap99/libpod that referenced this issue Mar 23, 2022
We store the full path to qemu in the machine config. When the path
changes on the host the machine can longer be started. To fix it we get
the path again when we fail to start the machine due the missing binary.

We want to store and use the full path first because otherwise existing
machines can break when the qemu version changed.

[NO NEW TESTS NEEDED] We still have no machine tests.

Fixes containers#13394

Signed-off-by: Paul Holzinger <[email protected]>
@zowoq
Copy link
Author

zowoq commented Mar 23, 2022

Is it not possible to have it always use the qemu that is set by CONTAINERS_HELPER_BINARY_DIR?

@zowoq
Copy link
Author

zowoq commented Mar 24, 2022

Is it not possible to have it always use the qemu that is set by CONTAINERS_HELPER_BINARY_DIR?

@Luap99 Should I open a new issue for this?

@Luap99
Copy link
Member

Luap99 commented Mar 24, 2022

Well the idea is to keep the behaviour as @baude said. First use the path which was used for init.

Isn't this something you could patch downstream, this problems looks very nixos specific.

@zowoq
Copy link
Author

zowoq commented Mar 24, 2022

I don't see how an env var being disregarded on subsequent invocations of podman machine is a nixpkgs specific issue.

We'd have to carry and rebase a patch forever for something that seems like it would be a relatively simple fix upstream? e.g. If QemuCommand exists in CONTAINERS_HELPER_BINARY_DIR use it otherwise continue with the current behaviour?

@Luap99
Copy link
Member

Luap99 commented Mar 24, 2022

I understand that.
The problem with using CONTAINERS_HELPER_BINARY_DIR is that this was mainly added for testing purposes.
The env var is not documented. I would never expected someone to use it outside of this context. I have no problem with you using it for this, it is just that relying on an undocumented implementation details is often not a good idea.
So the first step would be to document CONTAINERS_HELPER_BINARY_DIR as official way to set this directory.

To this specific issue, are you proposing to first check the env var, use qemu from there when available, if not use stored path from config. If still not found fall back to full helper binaries in containers.conf and path? I guess that would work but it is a bit ugly because we would need to check the env var extra.

To me this looks like a general problem with nix, if you keep you old qemu around why should it use the new one at a different path. I am really missing the concept here, why would the same package be installed with different version at different paths.

Also note that FindHelperBinary() is called with true as second argument so you can already use $PATH for qemu.

@zowoq
Copy link
Author

zowoq commented Mar 25, 2022

The problem with using CONTAINERS_HELPER_BINARY_DIR is that this was mainly added for testing purposes. The env var is not documented. I would never expected someone to use it outside of this context. I have no problem with you using it for this, it is just that relying on an undocumented implementation details is often not a good idea. So the first step would be to document CONTAINERS_HELPER_BINARY_DIR as official way to set this directory.

TBH I'd be very happy if CONTAINERS_HELPER_BINARY_DIR/FindHelperBinary and all the other preconfigured paths here and in containers/common went away and podman just looked for everything in $PATH. (cni plugins are probably the only ones that need any special handling.)

To this specific issue, are you proposing to first check the env var, use qemu from there when available, if not use stored path from config. If still not found fall back to full helper binaries in containers.conf and path?

Yes, that would work.

To me this looks like a general problem with nix, if you keep you old qemu around why should it use the new one at a different path. I am really missing the concept here, why would the same package be installed with different version at different paths.

We aren't choosing to keep it around and it isn't actually installed, /nix/store is the users package cache. Really abbreviated explanation is that nix creates symlinks from /nix/store into a directory which is in the users $PATH, whichever qemu is symlinked is the one that is installed and should be used.

Also note that FindHelperBinary() is called with true as second argument so you can already use $PATH for qemu.

Afraid I don't understand what you mean with this.

@Luap99
Copy link
Member

Luap99 commented Mar 25, 2022

Also note that FindHelperBinary() is called with true as second argument so you can already use $PATH for qemu.

Afraid I don't understand what you mean with this.

The function already looks in $PATH so I am not sure why you use CONTAINERS_HELPER_BINARY_DIR, that doesn't change the fact that it will still record the full path on init. So if this is already symlinked it should just work?

@zowoq
Copy link
Author

zowoq commented Mar 25, 2022

The function already looks in $PATH so I am not sure why you use CONTAINERS_HELPER_BINARY_DIR, that doesn't change the fact that it will still record the full path on init.

$PATH not working because of recording the full path is why I originally opened the issue. I would like for CONTAINERS_HELPER_BINARY_DIR to work as we discussed above:

first check the env var, use qemu from there when available, if not use stored path from config.

So if this is already symlinked it should just work?

No it does not work. That is why I'm asking for this:

first check the env var, use qemu from there when available, if not use stored path from config.

mheon pushed a commit to mheon/libpod that referenced this issue Mar 30, 2022
We store the full path to qemu in the machine config. When the path
changes on the host the machine can longer be started. To fix it we get
the path again when we fail to start the machine due the missing binary.

We want to store and use the full path first because otherwise existing
machines can break when the qemu version changed.

[NO NEW TESTS NEEDED] We still have no machine tests.

Fixes containers#13394

Signed-off-by: Paul Holzinger <[email protected]>
@zaphar
Copy link

zaphar commented Apr 17, 2022

If there was some way for us to tell podman where qemus edk2 path was so non standard install setups will still function that would be great. Right now podman on darwin at least assumes that you have either installed qemu from source with standard install locations or that you installed via non-local brew.

I have in the past done local brew installs as well as installed stuff in local only paths. podman won't work in those circumstances.

@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 20, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 20, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
In Progress This issue is actively being worked by the assignee, please do not work on this at this time. kind/bug Categorizes issue or PR as related to a bug. locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. machine macos MacOS (OSX) related
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants