Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

binfmt / interop fixes #64

Merged
merged 2 commits into from
Mar 30, 2022
Merged

binfmt / interop fixes #64

merged 2 commits into from
Mar 30, 2022

Conversation

K900
Copy link
Contributor

@K900 K900 commented Mar 26, 2022

Not really sure what a better way to describe the changes is. This makes boot.binfmt work, and re-registers WSL stuff through it so it's easier to mess with through NixOS configs.

We need to do this so systemd-binfmt works
@nzbr nzbr mentioned this pull request Mar 26, 2022
@nzbr nzbr added the bug Something isn't working label Mar 26, 2022
@K900
Copy link
Contributor Author

K900 commented Mar 26, 2022

Fixed formatting.

Copy link
Member

@nzbr nzbr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks really interesting! I have tried to remount binfmt_misc through the fileSystems option, but that did not work.
I suggest a separate option for overwriting the WSLInterop registration.
wsl.interop.enable and wsl.interop.includePath should also set wsl.wslConf.interop.enabled and wsl.wslConf.interop.appendWindowsPath respectively.

Edit: removed formatting hint

modules/wsl-distro.nix Outdated Show resolved Hide resolved
@nzbr
Copy link
Member

nzbr commented Mar 26, 2022

I just did a quick test: The binfmt_misc state is global to the whole WSL2 VM. If WSLInterop is re-registered in NixOS, the contents of /proc/sys/fs/binfmt_misc/WSLInterop change in other distros as well. Starting a cmd.exe did still work, however I don't like that the default behavior of the NixOS config potentially affects other distros

@K900
Copy link
Contributor Author

K900 commented Mar 26, 2022

I'm not really sure what we can do here. How does that interact with wsl.conf?

@K900
Copy link
Contributor Author

K900 commented Mar 26, 2022

OK, so after thinking about this a bit more here's the possibilities:

wsl.conf interop=true systemd-binfmt enabled interop registered in binfmt.d result
N N N works as expected
N N Y (invalid)
N Y N works as expected
N Y Y works as expected
Y N N works as expected
Y N Y (invalid)
Y Y N removes existing binfmt registration
Y Y Y works as expected

So I guess the best solution here is actually to force interop=false in wsl.conf, and handle it entirely in binfmt.d

@K900
Copy link
Contributor Author

K900 commented Mar 26, 2022

Oh wait, we actually can't force interop=false and have $WSLPATH still work, we need interop=true for that, so I guess we have to actually force interop=true, then force-enable systemd-binfmt to remove the registration if necessary.

@nzbr
Copy link
Member

nzbr commented Mar 26, 2022

I'd say we just rename wsl.interop.enable to wsl.interop.enableBinfmt (or something similar) and make it disabled by default. So that we don't touch the binfmt registrations unless that is desired. The rename is so that it is obvious, that includePath also works without it.

When includePath is enabled, interop.enable and interop.appendWindowsPath are set to true in wsl.conf

When wsl.interop.enable (however it may be named then) is true, WSLInterop gets re-registered

Additionally, there should be an assertion, that prevents the configuration from building, if there is anything else in boot.binfmt, but wsl.interop.enable is false

@nzbr
Copy link
Member

nzbr commented Mar 26, 2022

That assertion looks like this:

  assertions = [
    (mkIf (!config.wsl.interop.enable) {
      assertion = length (attrValues config.boot.binfmt.registrations) == 0;
      message = "wsl.interop.enable must be true to use binfmt_misc!";
    })
  ];

@K900
Copy link
Contributor Author

K900 commented Mar 27, 2022

I'd say we just rename wsl.interop.enable to wsl.interop.enableBinfmt (or something similar) and make it disabled by default. So that we don't touch the binfmt registrations unless that is desired. The rename is so that it is obvious, that includePath also works without it.

I don't like this, because it causes some spooky action at a distance - enabling any other binfmt registration will break this setup.

Additionally, there should be an assertion, that prevents the configuration from building, if there is anything else in boot.binfmt, but wsl.interop.enable is false

The thing is, this is a valid config. This is something you may actually want to run. That's why I want to figure something out to make it work.

@nzbr
Copy link
Member

nzbr commented Mar 27, 2022

The assertion would just prevent you from having any other binfmt registration without also enabling the WSLInterop one. So that you can't accidentaly remove the WSLInterop registration

@K900
Copy link
Contributor Author

K900 commented Mar 27, 2022

But you might want to have other registrations and not WSLInterop.

@nzbr
Copy link
Member

nzbr commented Mar 27, 2022

How about a warning instead? Something like "Having binfmt registrations without also enabling wsl.interop.enable may break interoperability across all WSL distributions"

@K900
Copy link
Contributor Author

K900 commented Mar 27, 2022

Another idea: we can add a third option, wsl.interop.canTouchBinfmt or whatever, similarly to boot.loader.efi.canTouchEfiVariables, then add an assertion for the !canTouchBinfmt && binfmt.registrations > 0 case. Then, if canTouchBinfmt is off, we keep whatever state already exists, and if it's on, we can enable systemd-binfmt etc.

@nzbr
Copy link
Member

nzbr commented Mar 27, 2022

Sounds like a good idea! I'd still suggest renaming wsl.interop.enable to wsl.interop.register so that it is more clear, that it's not required to be able to use interop. When canTouchBinfmt is false, this should just be ignored, so that we can default to canTouchBinfmt = false; and register = true;

@axgfn
Copy link
Contributor

axgfn commented Mar 27, 2022

It may be helpful to read the discussion on genie's issue tracker where they dealt with this issue. I had a hard time parsing out what solution they ultimately settled on, but it's pretty informative nonetheless.

@K900
Copy link
Contributor Author

K900 commented Mar 27, 2022

The question is less "how do we fix it" and more "how do we make it work in an obvious way".

@K900
Copy link
Contributor Author

K900 commented Mar 28, 2022

OK, the more I think about this, the less I like it. It creates some really awkward should-be-impossible states that we need lots of assertions to guard against. I feel like it might actually be a good idea to just take over the WSL settings here.

@sielicki
Copy link

sielicki commented Mar 28, 2022

I just did a quick test: The binfmt_misc state is global to the whole WSL2 VM. If WSLInterop is re-registered in NixOS, the contents of /proc/sys/fs/binfmt_misc/WSLInterop change in other distros as well. Starting a cmd.exe did still work, however I don't like that the default behavior of the NixOS config potentially affects other distros

IMO that's their bug, not ours. Especially when there's a global .wslconfig and the disto-specific (distro-managed?!) wsl.conf. If we somehow break other distros from local modifications to a config file in our filesystem, we shouldn't feel responsible. It's their job to protect other distros from what we do.

In practice, I agree this is something we can't do, but maybe we should file a bug upstream and see what they would like us to do instead.

@nzbr
Copy link
Member

nzbr commented Mar 28, 2022

I don't actually think that this is a bug that they can fix. binfmt_misc registrations are a feature of the linux kernel and are not specific to a namespace. I'm not sure, what setting interop.enable to false in wsl.conf does if it is set to true in another distro. It's a setting we probably just should not touch at all (it's enabled by default anyway). The problem here is, that in order to add additional registrations (like qemu for executing binaries from foreign architectures) NixOS uses systemd-binfmt, which will remove the WSLInterop registration if it is not specified in /etc/binfmt.d.

So the idea is that we don't want to have a state, where a user creates a binfmt registration without re-registering WSLInterop, which would break interop for the whole VM (unless that is what they want).
I just don't want re-registering WSLInterop to be the default behavior due to potential side effects for other distros

The resulting possible states would be:

  1. binfmt does not get touched at all, interop works (default and the current behavior on main)
  2. WSLInterop gets re-registered, but there are no other registrations. Interop works (the current behavior of this PR)
  3. There are other binfmt registrations, but WSLInterop does not get re-registered. Breaks interop everywhere
  4. There are other binfmt registrations and WSLInterop gets re-registered. Interop works

The only one of these states that requires an additional safeguard (to make sure that this is what the user intended) is 3.

This turned out a little long, sorry xD. I just want to make sure that I'm on the same page as y'all

@sielicki
Copy link

sielicki commented Mar 28, 2022

https://patchwork.kernel.org/project/linux-fsdevel/cover/[email protected]/

Seems possible that they could let each distro have its own binfmt registrations. I think we should make an issue and get their input. It would be a lot cleaner if we didn't have to worry about point 3.

Edit: I don't believe this patchset got picked up. Bummer.

@nzbr
Copy link
Member

nzbr commented Mar 28, 2022

Opening that issue probably wouldn't hurt, though. The patchset proves that namespacing binfmt_misc is possible with modified kernels. And with WSL using a custom kernel anyway that's probably something that Microsoft could implement if they wanted to

@sielicki
Copy link

sielicki commented Mar 28, 2022

https://patchwork.kernel.org/project/linux-fsdevel/patch/[email protected]/

edit: wrong link, correct one is here: https://lore.kernel.org/all/[email protected]/

I emailed the author of the patchset I linked to previously and he was able to point me towards the above link for the current approach.

it seems that this will eventually land and, assuming we get our own binfmt namespace, we’ll be able to just use binfmt normally as you would on normal nixos. In that case, I’d get behind just ignoring the breakage of other distros and merging this as-is, because it will resolve itself when Microsoft releases a new kernel that supports this and makes whatever other changes are required to set the namespace up for us.

We should open or find a ticket to attach this to, to let them know.

@K900
Copy link
Contributor Author

K900 commented Mar 30, 2022

I'd also argue for merging this as-is in this case.

@nzbr
Copy link
Member

nzbr commented Mar 30, 2022

Can you still rename enable to register? It reads as though it is required for interop to work and disabling it would disable interop, which isn't true (currently). The current name and description might confuse some users. When you change the name, i'll merge this asap

@K900
Copy link
Contributor Author

K900 commented Mar 30, 2022

Sure. Done.

@nzbr
Copy link
Member

nzbr commented Mar 30, 2022

You forgot one in modules/wsl-distro.nix:67

- add an option to disable interop entirely
- add an option to include Windows %PATH% in WSL $PATH
- register interop through systemd-binfmt
@K900
Copy link
Contributor Author

K900 commented Mar 30, 2022

That I did. Fixed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants