nixos-rebuild on proxmox-lxc container fails with busctl error, and causes other configuration change from base image #319

Open
smacz42 opened this issue Mar 17, 2024 · 4 comments · Fixed by NixOS/nixpkgs#328682

Comments

smacz42 commented Mar 17, 2024

Summary

Running nixos-rebuild on a customized (or vanilla) proxmox-lxc image fails, leaves the container in an unmanageable state, and appears to remove some of the existing configuration.

Steps to reproduce:

  1. Build the container
    a. cat << 'EOF' > /tmp/firstboot.nix
{ config, pkgs, ... }:

{
  # Set up a systemd service that runs once the container reaches multi-user.target
  systemd.services.startup = {
    description = "Sets up the NixOS container on startup";
    wantedBy = [ "multi-user.target" ];
    script = "echo 'Hello World'";
  };
}
EOF
    b. nix run --extra-experimental-features nix-command --extra-experimental-features flakes github:nix-community/nixos-generators -- --format proxmox-lxc -c /tmp/firstboot.nix
  2. Run the container in Proxmox
  3. Create a minimal configuration.nix:
{config, pkgs, ... }:

{
  imports = [ <nixpkgs/nixos/modules/virtualisation/lxc-container.nix> ];

  environment.variables = {
    HISTFILESIZE = "";
    HISTSIZE = "";
    HISTTIMEFORMAT = "%F %T ";
    NIX_SSL_CERT_FILE = "/etc/ssl/certs/ca-certificates.crt";
  };

  systemd.mounts = [{
    where = "/sys/kernel/debug";
    enable = false;
  }];

  environment.systemPackages = with pkgs; [
    vim
    binutils
  ];
}
  4. Run nixos-rebuild test
'/nix/store/zgzrbba39fsn341s5dyl89wi7cdavsf0-system-path/bin/busctl --json=short call org.freedesktop.systemd1 /org/freedesktop/systemd1 org.freedesktop.systemd1.Manager ListUnitsByPatterns asas 0 0' exited with value 1 at /nix/store/nhys1a2wsn5x5xm5bv5msk6ynqrhya4q-nixos-system-nixos-23.11.5408.8ac30a39abc5/bin/switch-to-configuration line 145.

    a. This actually borks the system exactly as a switch would, even though it's only a test.
  5. Re-run nixos-rebuild switch

building Nix...
building the system configuration...
trace: warning: system.stateVersion is not set, defaulting to 23.11. Read why this matters on https://nixos.org/manual/nixos/stable/options.html#opt-system.stateVersion.
stopping the following units: network-local-commands.service, systemd-networkd-wait-online.service, systemd-networkd.service, systemd-networkd.socket, systemd-resolved.service
activating the configuration...
setting up /etc...
removing obsolete symlink ‘/etc/resolv.conf’...
removing obsolete symlink ‘/etc/man_db.conf’...
removing obsolete symlink ‘/etc/systemd/networkd.conf’...
removing obsolete symlink ‘/etc/systemd/resolved.conf’...
restarting systemd...
Failed to list users: Failed to activate service 'org.freedesktop.login1': timed out (service_start_timeout=25000ms)
Unable to close the file handle to loginctl at /nix/store/nhys1a2wsn5x5xm5bv5msk6ynqrhya4q-nixos-system-nixos-23.11.5408.8ac30a39abc5/bin/switch-to-configuration line 890.
warning: error(s) occurred while switching to the new configuration

Issues

Pre-reboot:

  1. busctl issue above as output of nixos-rebuild
  2. Cannot shutdown as root in container:
Failed to set wall message, ignoring: Access denied
Call to Reboot failed: Access denied
  3. dbus errors in logs:
dbus-daemon[280]: [system] Rejected send message, 1 matched rules; type="method_call", sender=":1.5" (uid=0 pid=16558 comm="systemd-run -E LOCALE_ARCHIVE -E NIXOS_INSTALL_BOO" label="unconfined") interface="org.freedesktop.systemd1.Manager" member="StartTransientUnit" error name="(unset)" requested_reply="0" destination="org.freedesktop.systemd1" (uid=0 pid=1 comm="/run/current-system/systemd/lib/systemd/systemd" label="unconfined")
dbus-daemon[280]: [system] Rejected send message, 1 matched rules; type="method_call", sender=":1.6" (uid=0 pid=16563 comm="/nix/store/zgzrbba39fsn341s5dyl89wi7cdavsf0-system" label="unconfined") interface="org.freedesktop.systemd1.Manager" member="ListUnitsByPatterns" error name="(unset)" requested_reply="0" destination="org.freedesktop.systemd1" (uid=0 pid=1 comm="/run/current-system/systemd/lib/systemd/systemd" label="unconfined")
dbus-daemon[280]: [system] Rejected send message, 1 matched rules; type="method_call", sender=":1.7" (uid=0 pid=16571 comm="shutdown -r now" label="unconfined") interface="org.freedesktop.login1.Manager" member="SetWallMessage" error name="(unset)" requested_reply="0" destination="org.freedesktop.login1" (uid=0 pid=291 comm="/nix/store/dzp7d4k1d94s1x49p9171mvcsfyxr7bj-system" label="unconfined")
dbus-daemon[280]: [system] Rejected send message, 1 matched rules; type="method_call", sender=":1.7" (uid=0 pid=16571 comm="shutdown -r now" label="unconfined") interface="org.freedesktop.login1.Manager" member="RebootWithFlags" error name="(unset)" requested_reply="0" destination="org.freedesktop.login1" (uid=0 pid=291 comm="/nix/store/dzp7d4k1d94s1x49p9171mvcsfyxr7bj-system" label="unconfined")

Post-reboot:

  1. Hostname is reset to nixos
    a. This does not happen if I use the vanilla image from hydra.
  2. Custom systemd services disappear
    a. This is probably because they're not defined in the configuration.nix file that the rebuild ran against, but I wasn't expecting this behavior.
  3. Odd changes when running nixos-rebuild switch again:
setting up /etc...
removing obsolete symlink ‘/etc/resolv.conf’...
removing obsolete symlink ‘/etc/man_db.conf’...
removing obsolete symlink ‘/etc/systemd/resolved.conf’...
removing obsolete symlink ‘/etc/systemd/networkd.conf’...
setting up tmpfiles
Cannot set file attributes for '/var/empty', value=0x00000010, mask=0x00000010, ignoring: Operation not permitted
reloading the following units: dbus.service
restarting the following units: nix-daemon.service
starting the following units: network-local-commands.service
the following new units were started: dhcpcd.service, network-setup.service, resolvconf.service

Suspicions

I suspect this is because:

  1. The /etc/nixos/configuration.nix is overriding whatever configuration the image was built with, which disables everything the container was built with (including the hostname).
  2. I also suspect that the installation of glibc and whatever else gets installed causes an error for the (unprivileged) container restarting systemd services.
    a. I'm not sure, but this might have something to do with: nixos-rebuild ready image #86

I would expect (without much understanding of the internals of NixOS) to be able to take the base image of the container, create an /etc/nixos/configuration.nix, and run nixos-rebuild switch without breaking or modifying the configuration of the base image. Failing that, I would expect there to be a way to include the base image's configuration so that the existing configuration is preserved.

I'm happy to do any further testing required regarding these issues :)

Author

smacz42 commented Mar 23, 2024

Just to follow up... are we not supposed to be able to run nixos-rebuild on these images? I guess I don't understand. Basically, is my "I would expect..." line above inaccurate?

Contributor

mayl commented Mar 24, 2024

I don't use proxmox or lxc so I can't comment on any of that specifically.

Broadly, NixOS configurations, as part of being declarative and reproducible, need to be complete. The result you get from running nixos-rebuild is a function only of the config you pass, not of the state of the current system. I mention this because the merge with the current state that you seem to expect is not a good mental model for what happens.

I suspect if you copy your config into the container and rebuild from that it could work, but I can't speak too much to that either. The way I use nixos-generators is to specify the full config up front, and rebuild a new container if I need changes. Except I mostly build VMs, not containers :-)
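
As a rough illustration of that suggestion, here is a minimal sketch of an /etc/nixos/configuration.nix that restates the build-time configuration instead of relying on the running system. It assumes the firstboot.nix module from the report has been copied next to it (the ./firstboot.nix path is hypothetical), and that the proxmox-lxc image format takes its container settings from the nixpkgs module virtualisation/proxmox-lxc.nix. It has not been tested against the failure described in this issue.

```nix
{ config, pkgs, modulesPath, ... }:

{
  imports = [
    # Assumption: the same container module the proxmox-lxc image was built with.
    (modulesPath + "/virtualisation/proxmox-lxc.nix")
    # Hypothetical path: the build-time module, copied into the container.
    ./firstboot.nix
  ];

  # Settings added after the first boot sit alongside the imports,
  # so this file still describes the whole system on its own.
  environment.systemPackages = with pkgs; [
    vim
    binutils
  ];

  # Set explicitly; the rebuild log in the report warned that this
  # was unset and defaulting to 23.11.
  system.stateVersion = "23.11";
}
```

The point of the layout is that everything the image was built with is imported explicitly, so a rebuild has nothing to silently drop.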

@nixos-discourse

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/nixos-proxmox-lxc-not-rebuilding-using-wiki-provided-configuration/47104/8

hogcycle commented Jul 4, 2024

Glad to see I'm not the only one having issues. So is this a feature, not a bug? Should I be doing all of my configuration and packing it into the tarball? It seems unnecessary to have to do that for even the smallest of changes.
