Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

toolbox: init at 0.0.99 #110473

Closed
wants to merge 1 commit into from
Closed

toolbox: init at 0.0.99 #110473

wants to merge 1 commit into from

Conversation

mjlbach
Copy link
Contributor

@mjlbach mjlbach commented Jan 22, 2021

Motivation for this change

Closes #96115: Toolbox is a container tool, that gives an "escape" hatch for immutable systems like nix (and Silverblue).

Pending upstream patches: containers/toolbox#675 containers/toolbox#676 Edit: have been merged!

I also need to not hardcode the glibc nix store path.

Things done
  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS linux)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Determined the impact on package closure size (by running nix path-info -S before and after)
  • Ensured that relevant documentation is up to date
  • Fits CONTRIBUTING.md.

@ofborg ofborg bot added the 8.has: package (new) This PR adds a new package label Jan 22, 2021
@ofborg ofborg bot requested review from kalbasit and mt-caret January 22, 2021 12:33
@ofborg ofborg bot added 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin 10.rebuild-linux: 1-10 10.rebuild-linux: 1 labels Jan 22, 2021
Copy link
Contributor

@zowoq zowoq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @NixOS/podman

pkgs/applications/virtualization/toolbox/default.nix Outdated Show resolved Hide resolved
pkgs/applications/virtualization/toolbox/default.nix Outdated Show resolved Hide resolved
pkgs/applications/virtualization/toolbox/default.nix Outdated Show resolved Hide resolved
@mjlbach mjlbach force-pushed the init_toolbox branch 3 times, most recently from a2b54f2 to f4ce744 Compare January 22, 2021 13:14
@mjlbach
Copy link
Contributor Author

mjlbach commented Jan 22, 2021

Last couple thoughts:

  • If anyone else could test this, that would be great! It seems to work on my system fine, with the exception that the container complains about zsh missing Error: command /run/current-system/sw/bin/zsh not found in container fedora-toolbox-32, which makes sense, because we're not injecting that into the container.

  • I didn't know if I should put the podman team as maintainers, I'm happy to add myself to the team and take care of toolbox.

  • Do we wait until the upstream patch is merged (probably?)

For reference, these are my pertinent nixos settings:

virtualisation.podman = {
    enable = true;
    enableNvidia = true;
  };

  xdg.portal.enable = true;

@ofborg ofborg bot added the 11.by: package-maintainer This PR was created by the maintainer of the package it changes label Jan 22, 2021
@mjlbach mjlbach marked this pull request as ready for review January 22, 2021 13:31
@mjlbach mjlbach mentioned this pull request Jan 22, 2021
@zowoq
Copy link
Contributor

zowoq commented Jan 22, 2021

I didn't know if I should put the podman team ...

I wouldn't add it for now, we can decide on that later.

Do we wait until the upstream patch is merged (probably?)

Yeah, I'd like to wait at least until we see upstream is okay with the patch.

@tfmoraes
Copy link
Contributor

* If anyone else could test this, that would be great! It seems to work on my system fine, with the exception that the container complains about zsh missing `Error: command /run/current-system/sw/bin/zsh not found in container fedora-toolbox-32`, which makes sense, because we're not injecting that into the container.

This problem also happens in Fedora if the user shell is not Bash.

@mjlbach
Copy link
Contributor Author

mjlbach commented Jan 22, 2021

@tfmoraes What is the case where /mnt won't exist? that requires this patch. I'll try to upstream it, but I'll need justification.

diff --git a/src/cmd/create.go b/src/cmd/create.go
index 74e90b1..f977234 100644
--- a/src/cmd/create.go
+++ b/src/cmd/create.go
@@ -318,13 +318,15 @@ func createContainer(container, image, release string, showCommandToEnter bool)
  var mntLink []string
  var mntMount []string

- mntPath, _ := filepath.EvalSymlinks("/mnt")
- if mntPath == "/var/mnt" {
-   logrus.Debug("/mnt is a symbolic link to /var/mnt")
-   mntLink = []string{"--mnt-link"}
- } else {
-   mntMount = []string{"--volume", "/mnt:/mnt:rslave"}
- }
+ if utils.PathExists("/mnt") {
+    mntPath, _ := filepath.EvalSymlinks("/mnt")
+    if mntPath == "/var/mnt" {
+      logrus.Debug("/mnt is a symbolic link to /var/mnt")
+      mntLink = []string{"--mnt-link"}
+    } else {
+      mntMount = []string{"--volume", "/mnt:/mnt:rslave"}
+    }
+  }

  var runMediaMount []string

@mjlbach mjlbach force-pushed the init_toolbox branch 2 times, most recently from 2a48a98 to 5640d99 Compare January 22, 2021 14:05
@tfmoraes
Copy link
Contributor

@mjlbach my system doesn't have /mnt, I don't know why.

@tfmoraes
Copy link
Contributor

@mjlbach tested here. It's working! The only problem is that sudo doesn't have access to network, see:

$ ping google.com # normal user
PING google.com (64.233.188.101) 56(84) bytes of data.
64 bytes from tk-in-f101.1e100.net (64.233.188.101): icmp_seq=1 ttl=101 time=287 ms
64 bytes from tk-in-f101.1e100.net (64.233.188.101): icmp_seq=2 ttl=101 time=286 ms
64 bytes from tk-in-f101.1e100.net (64.233.188.101): icmp_seq=3 ttl=101 time=287 ms

$ sudo ping google.com # super user
ping: google.com: Temporary failure in name resolution

Because of that I can't use dnf to install things.

@mjlbach
Copy link
Contributor Author

mjlbach commented Jan 22, 2021

@mjlbach tested here. It's working! The only problem is that sudo doesn't have access to network, see:

Because of that I can't use dnf to install things.

That's odd, I don't have that issue on my system.

@tfmoraes
Copy link
Contributor

I think it's because /etc/resolv.conf in my toolbox container is a link to a file that not exists inside the container.

$ ls -l /etc/resolv.conf 
lrwxrwxrwx 1 root root 32 Jan 22 17:39 /etc/resolv.conf -> /run/host/etc/static/resolv.conf

$ ls -l /run/host/etc/static/resolv.conf
ls: cannot access '/run/host/etc/static/resolv.conf': No such file or directory

@mjlbach
Copy link
Contributor Author

mjlbach commented Jan 22, 2021

Oh weird, my symlink is correct.

@tfmoraes
Copy link
Contributor

I fixed my case with this little patch https://gist.github.com/tfmoraes/583bb66269216768fbe72b4e57cdd221

In my system /etc/resolv.conf is a link:

❯ ls -l /etc/resolv.conf                                                                                                          
lrwxrwxrwx 1 root root 23 jan 22 15:06 /etc/resolv.conf -> /etc/static/resolv.conf

❯ ls -l /etc/static/resolv.conf                                                                                                   
lrwxrwxrwx 6 root root 37 dez 31  1969 /etc/static/resolv.conf -> /run/systemd/resolve/stub-resolv.conf

And /etc/static is also a link /nix/store/....-etc/etc. The problem is this /nix/store/...-etc/etc is updated when I update the system. I think it's problem, because after this update (and followed of a nix-collect-garbage) this folder will not exist and will not mounted inside the container.

@mjlbach
Copy link
Contributor Author

mjlbach commented Jan 24, 2021

@mjlbach I'm having the same problem when using KDE over X11 in Arch. If I use KDE over Wayland this problem doesn't happen, even if I remove .Xauthority file. My xauth and display:

$ xauth list
archo/unix:0  MIT-MAGIC-COOKIE-1  xxxxxxxxxxxxxxxxxxxxxxxxx

$ echo $DISPLAY 
:0

Yeah, it's an upstream issue. You can follow it here: https://github.com/containers/toolbox/issues

@mjlbach
Copy link
Contributor Author

mjlbach commented Jan 25, 2021

Ok. Figured out the container KDE sharing issue. It's actually an SDDM issue.

SDDM, unlike GDM, doesn't use FamilyWild type xauth entries, so Xauth is sensitive to the hostname. This is why it works on gnome.

There's this upstream PR for SDDM which would fix the issue here: sddm/sddm#1230

Fedora applies this patch to SDDM (same author): https://src.fedoraproject.org/rpms/sddm/blob/master/f/0001-Redesign-Xauth-handling.patch to use FamilyWild, I'm not sure if this causes enough issues/inconvenience to justify a patch in nixpkgs as well.

I can confirm applying this patch to our sddm, that the problem is resolved as well.

@mjlbach
Copy link
Contributor Author

mjlbach commented Jan 25, 2021

Also, realized one annoyance with this approach is I believe you will have to re-create the container everytime the glib patch changes, since that's added to the container at build-time :( One option would be for users to just pin the nixpkgs commit, the other might be to create a nixos module that updates all toolbox containers to point to the latest glibc on each rebuild.

I could also try to patch the flags set to podman at runtime to always point to the latest version of glib.

@tfmoraes
Copy link
Contributor

Run Toolbox inside a FHS Env is not a option?

@mjlbach
Copy link
Contributor Author

mjlbach commented Jan 25, 2021

No, the problem is the bind mount info that's stored inside the container points to both toolbox and glibc, both of which paths will change on system upgrade.

Copy link
Member

@saschagrunert saschagrunert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@zowoq
Copy link
Contributor

zowoq commented Jan 27, 2021

Doing a non CGO build would seem to be the simplest solution, upstream was considering it themselves at one point but they went a different way. containers/toolbox#531 (review)

@mjlbach
Copy link
Contributor Author

mjlbach commented Jan 27, 2021

Doing a non CGO build would seem to be the simplest solution, upstream was considering it themselves at one point but they went a different way. containers/toolbox#531 (review)

That's true, but we still link the store path to toolbox into the container at build time. I have a patch which reads the store link to glibc and toolbox from an environmental variable (which toolbox would be wrapped with) at runtime, which I think is the best option?

If we don't build with glibc, we could copy toolbox into the container at build time, but it would still require a patch then to remove the volume mount to the toolbox binary when building the container.

@debarshiray
Copy link

Looks like it may need https://github.com/flatpak/flatpak-xdg-utils which we
don't have packaged, not sure how useful it would be on NixOS or how much
work it would be to to add/maintain.

Toolbox only requires the flatpak-spawn tool from flatpak-xdg-utils, and that too only inside the Toolbox container itself. Not on the host operating system.

The idea is to enable using toolbox(1) from inside a Toolbox container. eg., toolbox create can be called from inside a container. It works by using flatpak-spawn inside the container to forward the invocation to the host over D-Bus to the flatpak-session-helper service running on the host. The combination of flatpak-spawn and flatpak-session-helper wires the file descriptors in a way to make it look as if the command ran as usual, but in reality it actually ran on the host.

Note:

  • This doesn't work (yet?) for rootful Toolbox containers because the session initiated by the likes of sudo doesn't have a D-Bus daemon running. So there's no communication channel to set up the host forwarding.

  • This isn't a hard requirement. If either flatpak-spawn is missing from the Toolbox container, or flatpak-session-helper is missing from the host, invoking toolbox(1) inside the container will fail. Other than that, things will continue to work as usual.

@mjlbach
Copy link
Contributor Author

mjlbach commented Feb 20, 2021

@debarshiray (thank you for your work!), responding to your comment on the issue re: glibc here.

I'm not sure if you're familiar with nix, but it's a little strange in that when we compile toolbox, the toolbox binary has a hardcoded path to glibc which points to a non-fhs compliant location (under /nix/store/hash-unique-to-version-glibc), which is also where toolbox is located (at /nix/store/hash-unique-to-version-toolbox). To allow the toolbox binary to run, I have to bind-mount both of these volumes upon creation of a new toolbox , the path to the toolbox binary is handled automatically, but the glibc path which I do so with an in-tree nixpkgs specific patch here:

https://github.com/NixOS/nixpkgs/pull/110473/files#diff-3d839f5910980c5b63dc9d0ca514bf81f91980f65943ae42285a1d81788cadcaR9-R18

Now we don't have the glibc or toolbox issue, and everything works perfectly!

However, there is one big caveat. When the nix version of glibc and toolbox update, the bind mount that is mapped to these locations is no longer valid, as those locations in the nix store can be garbage collected at will. This means after (most) system upgrades, all toolbox containers will break without users manually updating the bind mount paths to both glibc and toolbox (since the hash-unique-to-version has changed). I've investigated a couple solutions, namely wrapping the toolbox start command to automatically update these paths on each run which is hacky but I think will work, but was curious if you had any thoughts.

Solving this issue and upstreaming containers/toolbox#675 are the only two roadblocks to merging this (I believe) since you fixed and merged my other upstream pr.

Also thank you so much for you work on toolbox, and stopping by to see how our packaging was going! :)

@debarshiray
Copy link

debarshiray commented Feb 22, 2021

I'm not sure if you're familiar with nix,

No, I wasn't aware of the inner workings of Nix. So, thanks for the explanation!

I should probably set up a NixOS VM to play with.

it's a little strange in
that when we compile toolbox, the toolbox binary has a hardcoded
path to glibc which points to a non-fhs compliant location
(under /nix/store/hash-unique-to-version-glibc), which is also where
toolbox is located (at /nix/store/hash-unique-to-version-toolbox). To
allow the toolbox binary to run, I have to bind-mount both of these
volumes upon creation of a new toolbox

Normally, on a NixOS host, how does the toolbox(1) binary work? Is /nix/store/hash-unique-to-version-toolbox) bind mounted as /usr/bin/toolbox? What about glibc's shared object?

the path to the toolbox binary
is handled automatically, but the glibc path which I do so with an
in-tree nixpkgs specific patch

What do you mean by the toolbox binary is handled automatically?

Solving this issue and upstreaming containers/toolbox#675 are
the only two roadblocks to merging this (I believe) since you fixed
and merged my other upstream pr.

Barring details, I think I agree with the overall direction of that PR. I am busy with some non-Toolbox things right now. I will get to it as soon as I can switch back.

@mjlbach
Copy link
Contributor Author

mjlbach commented Feb 22, 2021

Normally, on a NixOS host, how does the toolbox(1) binary work? Is /nix/store/hash-unique-to-version-toolbox) bind mounted as /usr/bin/toolbox? What about glibc's shared object?

There is a symlink ~/.nix-profile that points to your current active profile, which is similar to a system image after a transactional update in silverblue. This is located somewhere in /nix/var like /nix/var/nix/profiles/per-user/michael/profile/bin which contains links to all installed binaries in the current profile, wuch as /nix/store/hash-unique-to-version-toolbox. In practice, the nix-installer puts ~/.nix-profile/bin on your PATH, and when you launch toolbox launches, it's redirected transparently to the nix store location.

For reference, when toolbox is installed on my system:

❯ ls -alt ~/.nix-profile/bin/toolbox
lrwxrwxrwx 2 root root 70 Dec 31  1969 /home/mjlbach/.nix-profile/bin/toolbox -> /nix/store/dv0vwwx108sq52d9q7qanpzbbl2vjfnl-toolbox-0.0.99/bin/toolbox

And the binds into the toolbox container. Note the hashes will change on every upgrade of glibc or toolbox (or if any of their input dependencies change due to the design of nix addressing derivations by their inputs rather than outputs) leading to a dead link into the container, which is the last remaining problem I'm mulling over.

        "HostConfig": {
            "Binds": [
                "devpts:/dev/pts",
                "/nix/store/m0xa5bz7vw7p43wi0jppvvi3c9vgqvp7-glibc-2.32-25:/nix/store/m0xa5bz7vw7p43wi0jppvvi3c9vgqvp7-glibc-2.32-25:ro,rprivate,rbind",
                "/dev:/dev:rslave,rw,nosuid,rbind",
                "/var:/run/host/var:rslave,rw,rbind",
                "/run/user/1000:/run/user/1000:rw,rprivate,nosuid,nodev,rbind",
                "/run:/run/host/run:rslave,rw,nosuid,nodev,rbind",
                "/boot:/run/host/boot:rslave,rw,rbind",
                "/tmp:/run/host/tmp:rslave,rw,rbind",
                "/run/dbus/system_bus_socket:/run/dbus/system_bus_socket:rw,rprivate,nosuid,nodev,rbind",
                "/etc:/run/host/etc:rw,rprivate,rbind",
                "/home/mjlbach:/home/mjlbach:rslave,rw,rbind",
                "/usr:/run/host/usr:rw,rslave,rbind",
                "/nix/store/dv0vwwx108sq52d9q7qanpzbbl2vjfnl-toolbox-0.0.99/share/profile.d/toolbox.sh:/etc/profile.d/toolbox.sh:ro,rprivate,rbind",
                "/nix/store/dv0vwwx108sq52d9q7qanpzbbl2vjfnl-toolbox-0.0.99/bin/toolbox:/usr/bin/toolbox:ro,rprivate,rbind",
                "/mnt:/mnt:rslave,rw,rbind"
            ],

Normally, on a NixOS host, how does the toolbox(1) binary work? Is /nix/store/hash-unique-to-version-toolbox) bind mounted as /usr/bin/toolbox? What about glibc's shared object? What do you mean by the toolbox binary is handled automatically?

Yes, /nix/store/hash-unique-to-version-toolbox is mounted as /usr/bin/toolbox and /nix/store/hash-unique-to-version-glibc is mounted as /nix/store/hash-unique-to-version-glibc due to my patch. This is done, because toolbox has the store path to glibc hardcoded into the binary. When you run podman inspect on the container image, you can see the store paths for both of these.

Barring details, I think I agree with the overall direction of that PR. I am busy with some non-Toolbox things right now. I will get to it as soon as I can switch back.

Terrific, and I'm happy to help out as much as possible. Getting toolbox support for nixos will be a huge boon for nixos users.

I should probably set up a NixOS VM to play with.

NixOS is very fun to play around with, and it's very cool to see how silverblue/nix/guix are all tackling packaging problems :)

@mjlbach
Copy link
Contributor Author

mjlbach commented Mar 10, 2021

Upstream patches have been merged! 🥳

Once a new release is cut, I'll go ahead and update the version, remove the patches from this PR, and implement a solution for linking glibc/toolbox into the container such that it will be updated when the store paths change.

@stale
Copy link

stale bot commented Sep 6, 2021

I marked this as stale due to inactivity. → More info

@stale stale bot added the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Sep 6, 2021
@shadowrylander
Copy link

This is very important to me! 😹

@stale stale bot removed the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Nov 15, 2021
@mjlbach
Copy link
Contributor Author

mjlbach commented Nov 15, 2021

I'm probably not going to finish this as I don't use nixos anymore

@shadowrylander
Copy link

So close! 😹 Neither do I, but I thought for nix itself it may prove useful! Otherwise I'd have to install it from the distro's official repositories. Which is fine, but I enjoy the uniformity of using a single package manager.

@mjlbach
Copy link
Contributor Author

mjlbach commented Nov 15, 2021

This will likely never work reliably with your distro, as the last think I had left to do was add a systemd service to update all containers glibc bind mounts on rebuild (or patch toolbox to do so)

@shadowrylander
Copy link

Ah; makes sense. I'm running Ubuntu for the ZFS root, so I guess I'll just have to use apt. Ah, well; thanks for all the effort!

@FedX-sudo
Copy link

I would be willing to bring this up to speed. I agree that toolbox is an amazing tool, and I would love to have access to it on Nix.

Comment on lines +41 to +43
buildFlagsArray = [
"-ldflags=-X github.com/containers/toolbox/pkg/version.currentVersion=${version}"
];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
buildFlagsArray = [
"-ldflags=-X github.com/containers/toolbox/pkg/version.currentVersion=${version}"
];
ldflags = [
"-X github.com/containers/toolbox/pkg/version.currentVersion=${version}"
"-s" "-w"
];

@SuperSandro2000
Copy link
Member

@ofborg eval

@zowoq
Copy link
Contributor

zowoq commented Feb 17, 2022

Closing as the author stopped using nix. Thanks for working on this @mjlbach.

If people want to see this packaged here I'd suggest working with upstream on making this a static binary (they have an open issue for this), that should resolve the last hurdle in getting this packaged.

If someone really wants to try hacking around the glibc store path issue in the meantime you can open a PR and we can take a look but TBH I'm not really in favour of having a complex and/or fragile workaround for this.

@zowoq zowoq closed this Feb 17, 2022
@FedX-sudo
Copy link

FedX-sudo commented Feb 17, 2022

Thanks for the work on this PR anyway. It would be very nice to see this on NixOS.

If anyone is looking for a tool like Toolbox that works (mostly) on NixOS, check out Distrobox. I have been working upstream to improve support on NixOS with Distrobox, but I have yet to get it packaged, and if I remember correctly, I was seeing the same glibc bug with Distrobox. However, since it's a bash script, it works in .local/bin, which while less than ideal long-term, it does give you the same functionality.

(edit):
looked back at my work, and it looks like a similar issue to glibc, but not the same issue. I think I now how to fix it, and I will be doing some experimentation latter today.

@FedX-sudo
Copy link

Found the issue. Opening a PR for distrobox. #160485

@urandom2 urandom2 mentioned this pull request Dec 1, 2022
13 tasks
SuperSandro2000 added a commit that referenced this pull request Dec 27, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
8.has: package (new) This PR adds a new package 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin 10.rebuild-linux: 1-10 10.rebuild-linux: 1 11.by: package-maintainer This PR was created by the maintainer of the package it changes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Toolbox
10 participants