NixOS 19.09 upgrade may change network interface naming #71086

Closed
wlhlm opened this issue Oct 13, 2019 · 17 comments · Fixed by #71456
Labels
0.kind: bug Something is broken 9.needs: documentation 9.needs: port to stable A PR needs a backport to the stable release.

Comments

@wlhlm
Contributor

wlhlm commented Oct 13, 2019

Describe the bug
NixOS 19.09 updates systemd from version 239 to version 243, which comes with a changed network device naming algorithm. As a result, hosts can come up with different interface names after upgrading to NixOS 19.09. In turn, this means previously set up network configuration may no longer apply and the host might lose network connectivity.

To Reproduce
Steps to reproduce the behavior:

  1. Upgrade NixOS to 19.09, for example using the procedure outlined in the manual.
  2. Reboot to apply the OS upgrade
  3. Observe the machine exhibiting unexpected network behavior, for example missing network connectivity, unreachability from the outside, etc.

Of course, step 3 happens depending on whether the new interface naming algorithm decides to generate a new name for the hardware configuration or not.

Expected behavior
I don't have a problem with the interface names changing in itself, just that it should be explicitly considered for the upgrade procedure to NixOS 19.09. For this I can come up with two potential solutions:

  1. Explicitly mention in the release notes that the upgrade to systemd 243 may result in interface names changing and suggest a procedure for administrators to check interface names before rebooting after the upgrade.
  2. Change default systemd configuration to keep using old interface naming algorithm. See section Additional context for more info.

Additional context
Luckily, systemd makes changes to the interface naming algorithm explicit and keeps old versions around for backwards compatibility. Previous versions are documented in systemd.net-naming-scheme(7) and can be selected with the net.naming-scheme= kernel parameter.

Workaround
If you are affected by the changed network interface name and depend on network connectivity to access the machine (such as a remote server without out-of-band management, where the only fallback is a crappy rescue image), you can select the interface naming scheme on the kernel command line with net.naming-scheme=, either in /etc/nixos/configuration.nix:

boot.kernelParams = [ "net.naming-scheme=v239" ];

or by doing a quick'n'dirty edit to the grub config at /boot/grub/grub.cfg:

# ...
menuentry "NixOS - Default" {
  # ...
  linux ... net.naming-scheme=v239
  # ...
}
# ...

v239 switches to the version of the algorithm used by the systemd version included in NixOS 19.03.
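
To confirm after the reboot that the parameter actually reached the kernel, something like the following sketch could be used. The function name naming_scheme_from_cmdline is made up for illustration; /proc/cmdline is where Linux exposes the command line of the booted kernel.

```shell
# naming_scheme_from_cmdline CMDLINE
# Hypothetical helper: prints the value of the first net.naming-scheme=
# argument found in the given kernel command line string.
# Returns non-zero if no such argument is present.
naming_scheme_from_cmdline() {
    scheme=$(printf '%s\n' "$1" | tr ' ' '\n' \
        | sed -n 's/^net\.naming-scheme=//p' | head -n 1)
    [ -n "$scheme" ] && printf '%s\n' "$scheme"
}

# Usage on a running system:
#   naming_scheme_from_cmdline "$(cat /proc/cmdline)" \
#       || echo "no naming scheme pinned on the kernel command line"
```

If nothing is pinned, the journal line "Using default interface naming scheme" from systemd-udevd shows which scheme is in effect.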

Metadata

# nix run nixpkgs.nix-info -c nix-info -m
 - system: `"x86_64-linux"`
 - host os: `Linux 4.19.79, NixOS, 19.09.789.7952807791d (Loris)`
 - multi-user?: `yes`
 - sandbox: `yes`
 - version: `nix-env (Nix) 2.3`
 - channels(wlhlm): `""`
 - channels(root): `"nixos-19.09.789.7952807791d, nixpkgs-19.03.173394.147bd882fc6"`
 - nixpkgs: `/nix/var/nix/profiles/per-user/root/channels/nixos`

Maintainer information:

attribute: nixos.systemd
@wlhlm wlhlm added the 0.kind: bug Something is broken label Oct 13, 2019
@wlhlm
Contributor Author

wlhlm commented Oct 13, 2019

In my particular case, the name for an Intel Ethernet controller changed from enp1s0 to eno0:
before upgrade:

[...]
Aug 29 19:02:46 kernel: e1000e: Intel(R) PRO/1000 Network Driver - 3.2.6-k
Aug 29 19:02:46 kernel: e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
Aug 29 19:02:46 kernel: e1000e 0000:01:00.0: Interrupt Throttling Rate (ints/sec) set to dynamic conservative mode
Aug 29 19:02:46 kernel: e1000e 0000:01:00.0 0000:01:00.0 (uninitialized): registered PHC clock
Aug 29 19:02:46 kernel: e1000e 0000:01:00.0 eth0: (PCI Express:2.5GT/s:Width x1) 00:22:4d:87:b0:ff
Aug 29 19:02:46 kernel: e1000e 0000:01:00.0 eth0: Intel(R) PRO/1000 Network Connection
Aug 29 19:02:46 kernel: e1000e 0000:01:00.0 eth0: MAC: 3, PHY: 8, PBA No: FFFFFF-0FF
[...]
Aug 29 19:02:46 kernel: e1000e 0000:01:00.0 enp1s0: renamed from eth0
[...]

journal output after the upgrade:

[...]
Oct 13 14:42:08 kernel: e1000e: Intel(R) PRO/1000 Network Driver - 3.2.6-k
Oct 13 14:42:08 kernel: e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
Oct 13 14:42:08 kernel: e1000e 0000:01:00.0: Interrupt Throttling Rate (ints/sec) set to dynamic conservative mode
Oct 13 14:42:08 kernel: e1000e 0000:01:00.0 0000:01:00.0 (uninitialized): registered PHC clock
Oct 13 14:42:09 kernel: e1000e 0000:01:00.0 eth0: (PCI Express:2.5GT/s:Width x1) 00:22:4d:87:b0:ff
Oct 13 14:42:09 kernel: e1000e 0000:01:00.0 eth0: Intel(R) PRO/1000 Network Connection
Oct 13 14:42:09 kernel: e1000e 0000:01:00.0 eth0: MAC: 3, PHY: 8, PBA No: FFFFFF-0FF
[...]
Oct 13 14:42:09 systemd-udevd[409]: Using default interface naming scheme 'v243'.
[...]
Oct 13 14:42:09 kernel: e1000e 0000:01:00.0 eno0: renamed from eth0
Oct 13 14:42:09 systemd-udevd[409]: eth0: Process '/nix/store/ily14d68xl11cnbbkf9svwnzwsrrnzah-bash-4.4-p23/bin/sh -c 'echo 2 > /proc/sys/net/ipv6/conf/eth0/use_tempaddr'' failed with exit code 1.
[...]

Kernel version was 4.19.79 in both cases.

@markuskowa
Member

See also #71082

@vcunat
Member

vcunat commented Oct 13, 2019

Let's merge the discussion into a single thread.

@vcunat vcunat closed this as completed Oct 13, 2019
@vcunat vcunat added the 2.status: duplicate This is a duplicate of another issue or PR label Oct 13, 2019
@wlhlm
Contributor Author

wlhlm commented Oct 13, 2019

See also #71082

Let's merge the discussion into a single thread.

The mentioned issue seems to be based on predictable interface names being gone entirely, which is different from this issue about predictable names being, well... unpredictable.

I don't think this should be closed.

@vcunat
Member

vcunat commented Oct 13, 2019

Oh, right, that's weird [the other issue]. I'll reopen until it's clearer. EDIT: and you posted a much better description/analysis anyway :-)

@vcunat vcunat reopened this Oct 13, 2019
@vcunat vcunat removed the 2.status: duplicate This is a duplicate of another issue or PR label Oct 13, 2019
@wlhlm
Contributor Author

wlhlm commented Oct 13, 2019

Restating my problem:

What I'm describing isn't a bug in NixOS per se, just that NixOS 19.09 updated from systemd 239 to systemd 243 and from time to time, systemd updates bring slight changes to the way network interfaces are named. This may affect certain hardware configurations resulting in network interface names changing, though in the majority of cases nothing will happen.

What I'm proposing is that it should be made clear in the release notes, that these changes might occur and to suggest a procedure for admins to be able to check before rebooting and adjust network configuration. This would be most important for servers (without out-of-band management), such as the system on which I first discovered this problem.

An alternative solution I propose is to explicitly roll back to the naming algorithm from version 239 in the default NixOS configuration. All this does is delay the inevitable for when systemd drops the legacy version of the algorithm, but it gives more time to find a solution for a smoother transition.

@vcunat vcunat added 9.needs: documentation 9.needs: port to stable A PR needs a backport to the stable release. labels Oct 13, 2019
@vcunat
Member

vcunat commented Oct 13, 2019

I assume we'll go the way of adding a line into the release notes? /cc @disassembler, @lheckemann, @andir.

@wlhlm
Contributor Author

wlhlm commented Oct 13, 2019

Notes on checking if interface name will change:

The persistent interface names are documented in systemd.net-naming-scheme(7). You can use udevadm to see the variables from which udev draws to set a "persistent" name:

$ NET_NAMING_SCHEME=v239 udevadm test-builtin net_id /sys/class/net/$IFACE
$ NET_NAMING_SCHEME=v243 udevadm test-builtin net_id /sys/class/net/$IFACE

NET_NAMING_SCHEME can be set to any version of the naming algorithm listed in the manpage mentioned above. To see how udev picks a name from these variables, we have to check /run/current-system/sw/lib/systemd/network/99-default.link:

#  ...
[Match]
OriginalName=*

[Link]
NamePolicy=keep kernel database onboard slot path
MACAddressPolicy=persistent

The important setting here is NamePolicy: onboard, slot, and path correspond to the udev variables ID_NET_NAME_ONBOARD, ID_NET_NAME_SLOT, and ID_NET_NAME_PATH. The first policy that matches is used.

To give an example, here is how the interface on my affected system got changed:

$ NET_NAMING_SCHEME=v239 udevadm test-builtin net_id /sys/class/net/enp1s0
Load module index
Parsed configuration file /nix/store/6snycpaz9zrs5m7xz6dixl1nl0ngdrma-systemd-243/lib/systemd/network/99-default.link
Created link configuration context.
Using interface naming scheme 'v239'.
ID_NET_NAMING_SCHEME=v239
ID_NET_NAME_MAC=enx00224d8741dd
ID_OUI_FROM_DATABASE=MITAC INTERNATIONAL CORP.
ID_NET_NAME_PATH=enp1s0
Unload module index
Unloaded link configuration context.
$ NET_NAMING_SCHEME=v243 udevadm test-builtin net_id /sys/class/net/enp1s0
Load module index
Parsed configuration file /nix/store/6snycpaz9zrs5m7xz6dixl1nl0ngdrma-systemd-243/lib/systemd/network/99-default.link
Created link configuration context.
Using interface naming scheme 'v243'.
ID_NET_NAMING_SCHEME=v243
ID_NET_NAME_MAC=enx00224d8741dd
ID_OUI_FROM_DATABASE=MITAC INTERNATIONAL CORP.
ID_NET_NAME_ONBOARD=eno0
ID_NET_NAME_PATH=enp1s0
Unload module index
Unloaded link configuration context.

You can see that ID_NET_NAME_PATH stayed the same, but v243 added ID_NET_NAME_ONBOARD and looking at /run/current-system/sw/lib/systemd/network/99-default.link:

NamePolicy=keep kernel database onboard slot path

we can see that onboard is listed before path meaning ID_NET_NAME_ONBOARD is chosen before ID_NET_NAME_PATH.
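
The selection logic described above can be sketched as a small shell function. This is a simplification under stated assumptions, not udev's actual implementation: pick_name is a hypothetical helper, and it only handles the onboard, slot, path and mac policies. It skips keep, kernel and database because those do not come from the net_id output.

```shell
# pick_name POLICY...
# Hypothetical helper: reads udev properties (KEY=VALUE lines) on stdin and
# prints the value of the first ID_NET_NAME_<POLICY> property that is set,
# trying the policies in the order given. Lines that are not matching
# properties (such as udevadm log messages) are ignored.
pick_name() {
    props=$(cat)
    for policy in "$@"; do
        case "$policy" in
            onboard|slot|path|mac) ;;
            *) continue ;;  # keep, kernel, database: not modelled here
        esac
        key="ID_NET_NAME_$(printf '%s' "$policy" | tr '[:lower:]' '[:upper:]')"
        value=$(printf '%s\n' "$props" | sed -n "s/^${key}=//p" | head -n 1)
        if [ -n "$value" ]; then
            printf '%s\n' "$value"
            return 0
        fi
    done
    return 1
}

# Usage, assuming a udevadm that understands NET_NAMING_SCHEME:
#   NET_NAMING_SCHEME=v243 udevadm test-builtin net_id /sys/class/net/enp1s0 \
#       | pick_name keep kernel database onboard slot path
```

Fed the two outputs shown above, this prints enp1s0 for the v239 output and eno0 for the v243 output, matching the rename seen in the journal.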

@wlhlm
Contributor Author

wlhlm commented Oct 13, 2019

The problem with the procedure outlined above is that it only works since systemd 240. You would first have to upgrade systemd from 239 in order to see whether interface names change, which in turn means upgrading to NixOS 19.09 using nixos-rebuild switch, and that is not the best idea for distro upgrades (@vcunat agrees). I'm not sure what the solution here is.

@vcunat
Member

vcunat commented Oct 14, 2019

Well, if we really cared about it, we could theoretically have one release (19.09) that forces the previous naming by default. The main problem I see with that: it's relatively late, so I'm afraid changing back may also cause issues for some people. A compromise approach could be to make it easily configurable and suggest setting the older scheme manually, even if just for one boot to do this procedure.

@michaelpj
Contributor

If they keep the old scheme around indefinitely, can't we just set the old one conditional on stateVersion?

@flokli
Contributor

flokli commented Oct 14, 2019

@michaelpj People already might have switched their system and dealt with the changes, so applying any changes to 19.09 in that regard will change their system behaviour again.

I think the proper way to address this is to mention it more prominently in the 19.09 release notes, and to make sure major systemd changes are mentioned. I'm not talking about copying in all of their release notes, but adding pointers to possibly more invasive changes.

@vcunat
Member

vcunat commented Oct 20, 2019

What about this formulation? #71456

worldofpeace pushed a commit to vcunat/nixpkgs that referenced this issue Oct 21, 2019
worldofpeace pushed a commit that referenced this issue Oct 21, 2019
@wlhlm
Contributor Author

wlhlm commented Oct 21, 2019

What about this formulation? #71456

I'm fine with that. Thank you.

peti pushed a commit that referenced this issue Oct 21, 2019
@nh2
Contributor

nh2 commented Apr 10, 2020

For the record, we got bitten by another interface rename (investigation kindly helped by @flokli):

https://gist.github.com/nh2/71854c40a1a1a7c15bc8a8105e854f88

We found that our older 4.14.89 kernel classified 1 of the 2 NICs of our HP server as ONBOARD, but the newer 5.4.27 kernel classified both as ONBOARD, resulting in the interface enp2s0 newly being named eno0.

@elmarsto

This issue (or one like it) just recurred for me, after a kernel version bump. It appears that enp5s0 became eth0 with no warning.

Suggested fix:

  • Allow referring to network interfaces (in configuration.nix) by MAC address. Something along the lines of:
networking.interfacesByMac."AA:BB:CC:DD:EE:FF" = {
  ipv4.addresses = [....];
};

It would also need to be possible to refer to interfaces by MAC address in e.g. networking.firewall.interfaces, and everywhere else (networkmanager.unmanaged, etc.).

This would be substantially more stable for config, given how mutable the Linux kernel Ethernet interface naming scheme has become.

@flokli
Contributor

flokli commented Sep 11, 2021

The way to do this is documented in nixos/doc/manual/configuration/renaming-interfaces.xml - You can either use networking.usePredictableInterfaceNames = false, create something in systemd.network.links, or use a plain udev rule.


7 participants