RFC: Live environment with nix store included in initrd #14538

Closed
nshalman opened this issue Apr 8, 2016 · 20 comments
Labels
9.needs: reporter feedback This issue needs the person who filed it to respond

Comments

@nshalman
Member

nshalman commented Apr 8, 2016

Issue description

I'm working on a new Linux-based project that currently uses a different build system, and I'm evaluating Nix as a replacement.

Our system runs mostly from a stripped-down Linux image in memory. The full system is included in the initrd, which makes network booting very simple: there is no need to fetch an additional squashfs (and it's practical because our image is so lean). We then put a ZFS pool on the disks to use as storage for the workloads we run on the machines.

I've been flailing around (see master...nshalman:cerana-test1) trying to get something that works and coming up short.

I think Nix/NixOS is a very good project for us to collaborate with. A much more elegant version of what I've been working on is something that I'd want to get merged into nixpkgs so that we can collaborate further with NixOS.

@dezgeg
Contributor

dezgeg commented Apr 8, 2016

Maybe in this part:

+    # Individual files to be included
+    cerana.contents =
...
+        { source = config.system.build.initialRamdisk + "/initrd";
+          target = "/boot/initrd";
+        }
...
+      ];
+

instead of using config.system.build.initialRamdisk directly, you could create a new initrd based on it.

@bobvanderlinden
Member

Thanks for the work! I still have to give this a try, but this looks promising. Will try tomorrow.

Also, related to #2100.

@bobvanderlinden
Member

I've made some changes. Squashfs now mounts, but Stage 2 cannot be started because it cannot find init.

https://github.com/bobvanderlinden/nixpkgs/tree/cerana-test1

To build:

nix-build -A cerana_minimal nixos/release.nix

This results in three files:

result/bzImage
result/initrd
result/kernelAppend

To run inside qemu:

qemu-system-x86_64 -kernel result/bzImage -initrd result/initrd -m 2G -nographic -serial mon:stdio -append "$(cat result/kernelAppend) console=ttyS0"
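
A small wrapper can check that the three build outputs exist before QEMU is launched. This is only a sketch: the function name netboot_qemu_cmd and the default directory are illustrative and not part of the branch, and the QEMU flags are simply copied from the command above.

```shell
#!/bin/sh
# Hypothetical helper (not part of nixpkgs): print the QEMU command line for
# a netboot build output directory, after checking the expected files exist.
netboot_qemu_cmd() {
    dir=${1:-result}
    for f in bzImage initrd kernelAppend; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing $dir/$f" >&2
            return 1
        fi
    done
    # Same flags as the manual invocation; kernelAppend holds the extra
    # kernel arguments produced by the build.
    printf 'qemu-system-x86_64 -kernel %s -initrd %s -m 2G -nographic -serial mon:stdio -append "%s console=ttyS0"\n' \
        "$dir/bzImage" "$dir/initrd" "$(cat "$dir/kernelAppend")"
}
```

Calling netboot_qemu_cmd result prints the command so it can be inspected before it is run.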

Stage 1 has two trees: /, which is the contents of the initrd, and /mnt-root, which becomes / for stage 2. It mounts the squashfs from the initrd onto /mnt-root/nix. After that, stage-1-init.sh will switch_root into /mnt-root and execute the stage 2 init script.

The problem is (I think) that switch_root deletes all files from its current root (from initrd). That means the squashfs file will also be deleted. I don't know for sure whether this is the case, but somehow it doesn't find init of stage 2. That made me think, why not skip squashfs and put everything in initrd directly? That way switch_root isn't necessary anymore.

Not sure whether this is the case, but I couldn't figure it out yet.

@dezgeg
Contributor

dezgeg commented Apr 9, 2016

Yes, switch_root deleting everything in the initrd sounds correct (as in it should do that by design). But I think it shouldn't matter that the squashfs image gets deleted as accessing unlinked files that are already open should work just fine in Linux. Probably something else is amiss.
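
The point about unlinked files can be checked with a few lines of shell. This sketch uses a plain file descriptor rather than a loop-mounted squashfs, so it only illustrates the general principle; the file contents are a placeholder.

```shell
#!/bin/sh
# Show that data stays readable through an open descriptor after the
# file's name has been removed from the filesystem.
tmp=$(mktemp)
echo "squashfs payload" > "$tmp"
exec 3< "$tmp"        # keep the file open on descriptor 3
rm "$tmp"             # unlink the name; the inode lives on while fd 3 is open
content=$(cat <&3)    # still readable
exec 3<&-             # closing the descriptor finally releases the data
echo "$content"
```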

@bobvanderlinden
Member

@dezgeg Thanks for the tips, it was an incorrect path for the mountpoint :/

Anyway, it's now booting correctly in Qemu. Not sure whether it works the same for iPXE.

@copumpkin copumpkin added the 9.needs: reporter feedback This issue needs the person who filed it to respond label Apr 11, 2016
@nshalman
Member Author

The next phase I was going to work on was actually removing use of the squashfs and just putting the /nix contents directly into the initrd. @bobvanderlinden, do you think that would be possible?

@dezgeg
Contributor

dezgeg commented Apr 11, 2016

Not sure if that's worth the complexity of having extra code for something that's used really rarely.

@bobvanderlinden
Member

@nshalman I have thought of this as well, but I'm not sure. It would shorten the build time (creating the squashfs and then the initrd takes a bit of IO), but from what I've read online squashfs is still efficient inside an initrd. That said, I gave it a short try, but because switch_root removes files from the initrd it didn't immediately work.

I'd go for cleaning things up, de-branding the code and making it ready for inclusion in NixOS.

@dezgeg I agree that the extra complexity might not be worth it, but don't underestimate how widely @nshalman's changes could be used. Having a separate kernel+initrd will allow various types of iPXE/PXE booting. That will allow netboot.xyz support as well as somewhat proper support for cloud providers like Digital Ocean (no need for nixos-in-place or nixos-assimilate).
If Hydra generated these, like the ISOs, it would be another medium for installation.

@nshalman
Member Author

Naming things is hard. Does anyone have suggestions for names to replace the "cerana" branding in the code? PXE or netboot are logical options, I guess.

@bobvanderlinden
Member

Even though it's just a standalone initrd, I guess netboot is a good one, since it'll mostly be used for network booting (whether PXE, iPXE or grub?). A .ipxe file and/or PXE services can be created in a second iteration.

@nshalman
Member Author

@bobvanderlinden I have a de-branded version of this work in master...nshalman:netboot-v1
That commit doesn't currently credit you; please let me know if/how you want to be credited in a version to be turned into a pull request.

For my purposes I definitely need to do further work to see how easy it would be to eliminate the use of squashfs. I'm guessing that, if built correctly, my variant wouldn't ever need to invoke switch_root in the first place.

Testing of that netboot-v1 branch is very welcome. Further tweaks to it (e.g. moving the nouveau blacklisting from netboot.nix to netboot-minimal.nix) are also welcome.

@bobvanderlinden
Member

Nice. Also, don't worry about crediting.

I tried to boot this today over iPXE in qemu, however I couldn't make it work. I haven't done much with iPXE before, so this might be a problem on my side. That said, it's worth checking whether it actually boots over the network.
Here are the steps I used:

  • Build the bzImage, initrd files.
  • Create boot directory.
  • Create boot/boot.ipxe with:
#!ipxe
kernel bzImage init=/nix/store/3zrg47j9ihydsn026qzpwf7fs4ncx8f8-nixos-system-nixos-16.09pre56789.gfedcba/init loglevel=7
initrd initrd
boot
  • Symlink bzImage and initrd into boot/
  • Host boot/ over HTTP (I used darkhttpd).
  • Start qemu using -cdrom ipxe.iso and -m 2G
  • In qemu I pressed Ctrl+D and used: dhcp followed by chain --autofree http://192.168.0.197:8080/boot.ipxe.
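
The file-preparation steps above could be scripted roughly as follows. This is a sketch under assumptions: prepare_ipxe_dir is a hypothetical name, and the init path is passed in as an argument because it changes with every build. Serving the directory and starting qemu are left out.

```shell
#!/bin/sh
# Hypothetical helper: lay out a directory that can be served to iPXE.
# Arguments: build output dir, target boot dir, path of the stage 2 init.
prepare_ipxe_dir() {
    result=$1 boot=$2 init=$3
    mkdir -p "$boot" || return 1
    # Symlink the kernel and initrd into the directory served over HTTP.
    ln -sf "$(cd "$result" && pwd)/bzImage" "$boot/bzImage"
    ln -sf "$(cd "$result" && pwd)/initrd" "$boot/initrd"
    # Write the iPXE script with relative file names, as in the steps above.
    cat > "$boot/boot.ipxe" <<EOF
#!ipxe
kernel bzImage init=$init loglevel=7
initrd initrd
boot
EOF
}
```

A call would look like prepare_ipxe_dir result boot followed by the init path of the system that was just built.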

It loads the kernel, it loads initrd into memory, it boots the kernel, it unpacks initrd, but then it fails on executing init with the following error:

[   22.824394] Kernel panic - not syncing: Requested init /nix/store/3zrg47j9ihydsn026qzpwf7fs4ncx8f8-nixos-system-nixos-16.09pre56789.gfedcba/init failed (error -2).
[   22.824606] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.4.6 #1-NixOS
[   22.824660] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.1-20160216_104851-anatol 04/01/2014
[   22.824811]  0000000000000000 ffff880062c57eb8 ffffffff812b8abe ffffffff81703148
[   22.824934]  ffff880062c57f48 ffff880062c57f38 ffffffff8114ba58 ffffffff00000018
[   22.825042]  ffff880062c57f48 ffff880062c57ee0 0000000000000000 ffff88007ffda4c5
[   22.825042] Call Trace:
[   22.825042]  [<ffffffff812b8abe>] dump_stack+0x63/0x85
[   22.825042]  [<ffffffff8114ba58>] panic+0xc4/0x1fc
[   22.825042]  [<ffffffff811c7ee4>] ? putname+0x54/0x60
[   22.825042]  [<ffffffff814e1d90>] ? rest_init+0x80/0x80
[   22.825042]  [<ffffffff814e1e19>] kernel_init+0x89/0xe0
[   22.825042]  [<ffffffff814e887f>] ret_from_fork+0x3f/0x70
[   22.825042]  [<ffffffff814e1d90>] ? rest_init+0x80/0x80
[   22.825042] Kernel Offset: disabled
[   22.825042] ---[ end Kernel panic - not syncing: Requested init /nix/store/3zrg47j9ihydsn026qzpwf7fs4ncx8f8-nixos-system-nixos-16.09pre56789.gfedcba/init failed (error -2).

The error seems different from "init not found", so I don't know what is happening atm.
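
For reference, the number in the panic message is a negated Linux errno value. A minimal lookup sketch follows; the function name is hypothetical and the table is deliberately incomplete, listing only a few values that exec commonly returns.

```shell
#!/bin/sh
# Map the error number from a "Requested init ... failed (error N)" panic
# to its Linux errno name. Only a few exec-related values are listed.
init_errno_name() {
    case "${1#-}" in          # strip the leading minus sign, if any
        2)  echo "ENOENT (no such file or directory)" ;;
        8)  echo "ENOEXEC (exec format error)" ;;
        13) echo "EACCES (permission denied)" ;;
        *)  echo "unknown" ;;
    esac
}

init_errno_name -2
```

Error -2 is ENOENT. Note that exec also reports ENOENT when the file itself exists but its script or ELF interpreter does not, so a missing interpreter inside the unpacked initrd may be worth ruling out as well.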

@nshalman
Member Author

I've done some simple testing using the iPXE that is built into QEMU (see nshalman@e850edb) and it appears to work just fine.

@bobvanderlinden please confirm that it works for you as my example demonstrates.

@nshalman
Member Author

Squashed down into master...nshalman:netboot-v2

I think this is nearly ready to be turned into a PR...

@nshalman
Member Author

Based on some testing, given how much of the nix store needs to live in the initrd anyway, I think that putting the nix store that ends up in the squashfs directly in the initrd will probably be a very effective way of further shrinking the final initrd size.

du -sh /nix/store /mnt-root/nix/store
302.2M  /nix/store
596.7M  /mnt-root/nix/store

@bobvanderlinden
Member

bobvanderlinden commented Apr 14, 2016

Yes, that might be a more efficient way. However, it isn't as easy as bind-mounting / to /mnt-root, since switch_root will delete everything from /.

That said, I've added a boot-test for netboot: https://github.com/bobvanderlinden/nixpkgs/tree/netboot-v2
See whether it works and if you'd like to integrate it into your branch.

@nshalman
Member Author

nshalman commented Apr 15, 2016

It's actually easier than I thought it would be. The essence of it is in the following change: nshalman@1ced0c9

We sidestep the need for stage1 by having /init be a link to the toplevel/stage2 init. A side effect of that is that the closure for the whole system ends up in the initrd so there's no need for the squashfs.
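
The idea of an initrd whose /init is a link to the stage 2 init can be illustrated outside of Nix. The sketch below is not the content of the commit: it stages a throwaway directory tree by hand, and both the store path and the stand-in init script are made-up placeholders.

```shell
#!/bin/sh
# Illustration only: stage an initrd tree whose /init is a symlink to the
# system's stage 2 init. The store path used below is a placeholder.
stage_initrd_root() {
    root=$1 toplevel_init=$2
    mkdir -p "$root$(dirname "$toplevel_init")"
    # Stand-in for the real init script that the system closure would provide.
    printf '#!/bin/sh\necho stage2\n' > "$root$toplevel_init"
    chmod +x "$root$toplevel_init"
    # The kernel runs /init from the unpacked initrd; the absolute link
    # target resolves inside the initrd's own root at boot.
    ln -s "$toplevel_init" "$root/init"
}

root=$(mktemp -d)
stage_initrd_root "$root" /nix/store/placeholder-nixos-system/init
readlink "$root/init"
```

Because the link target lives in the store, everything it depends on has to be in the same tree, which is why the whole closure ends up in the initrd. A tree like this would normally be packed as a newc-format cpio archive before being handed to the kernel.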

I've squashed that change into a new branch which also includes @bobvanderlinden's test code (in its own commit for credit purposes 😉) which is here: master...nshalman:netboot-v3

Edit: Sorry for the false closure, accidentally hit the wrong button on the web page.

@nshalman nshalman reopened this Apr 15, 2016
@nshalman
Member Author

And it's currently broken. I've still got some work to do.

@nshalman
Member Author

Okay, there's something weird with trying to sidestep the squashfs that's complicated enough that I think it should be separate work, either follow-on for an official netboot, or just something I use downstream and not in the official version at all.

@nshalman
Member Author

nshalman commented May 2, 2016

Resolved by #14740
