qubes-network-uplink.service randomly not initialized on cloned standalones #7284
Comments
Which kernel version do you use? |
@marmarek |
I also have started experiencing this exact same issue after a recent update. I am with kernel For me, running |
@na-- yeah thanks. will try to play with different kernel versions. |
This might be caused by a change in systemd 250, which Arch Linux uses but not Debian or Fedora stable, in the naming of network interfaces. |
Indeed, that's possible. In theory the startup script looks for the interface by MAC address, but there is a fallback to |
What are the logs of the |
New name is enX0 btw |
@marmarek, is
and this is the result of
|
Same problem for me with an ArchLinux HVM and an ArchLinux AppVM. Below for the HVM:
with the below kernel
In the next post I'll start the analysis, but first here is a workaround (of course, you should customize $GW):
[user@archlinux ~]$ cat /rw/config/rc.local
#! /bin/bash
# This script will be executed at every VM startup, you can place your own
# custom commands here. This includes overriding some configuration in /etc,
# starting services etc.
# Example for overriding the whole CUPS configuration:
# rm -rf /etc/cups
# ln -s /rw/config/cups /etc/cups
# systemctl --no-block restart cups
LOG=/tmp/rc-local-$(date -I).log
GW="10.138.3.247"
ip -4 -br a | tee -a $LOG
CT=0
while [ "$CT" -lt "5" ]; do
ping -c 1 $GW && break
echo "No ping #$CT, restart uplink" | tee -a $LOG
systemctl restart [email protected]
sleep 2
CT=$(( CT + 1 ))
done
ip -4 -br a | tee -a $LOG
The result log is
|
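The retry loop in the workaround above can be factored into a small generic helper. This is only a sketch: `retry_until` and its arguments are hypothetical names, not part of Qubes, and the check and fix commands are supplied by the caller.

```shell
#!/bin/bash
# Hypothetical helper (not part of Qubes): run CHECK up to MAX times,
# running FIX and sleeping DELAY seconds after each failed check.
# Usage: retry_until MAX DELAY CHECK FIX
retry_until() {
    local max="$1" delay="$2" check="$3" fix="$4"
    local attempt=0
    while [ "$attempt" -lt "$max" ]; do
        if "$check"; then
            return 0
        fi
        echo "check failed (attempt $attempt), running fix" >&2
        "$fix" || :              # a failing fix should not abort the loop
        sleep "$delay"
        attempt=$(( attempt + 1 ))
    done
    "$check"                     # final verdict after the last fix
}
```

In rc.local, CHECK could be a function that pings the gateway once and FIX a function that restarts the uplink service, for example `retry_until 5 2 check_gw restart_uplink` (both function names are assumptions).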
How does Qubes-OS choose the network interface?
Net interfaces status with the uplink problem:
[user@archlinux ~]$ ip -o -br link
lo UNKNOWN 00:00:00:00:00:00 <LOOPBACK,UP,LOWER_UP>
enX0 DOWN 00:16:3e:5e:6c:00 <BROADCAST,MULTICAST>
[user@archlinux ~]$ ip -4 -br a
lo UNKNOWN 127.0.0.1/8
Systemd status:
[user@archlinux ~]$ sudo systemctl list-units --failed
UNIT LOAD ACTIVE SUB DESCRIPTION
● qubes-network-uplink.service loaded failed failed Qubes network uplink wait
LOAD = Reflects whether the unit definition was properly loaded.
ACTIVE = The high-level unit activation state, i.e. generalization of SUB.
SUB = The low-level unit activation state, values depend on unit type.
1 loaded units listed.
Above, the uplink service chose the wrong net interface (eth0; it should be enX0). How does the uplink service choose the net interface?
[user@archlinux ~]$ sudo systemctl cat qubes-network-uplink.service
# /usr/lib/systemd/system/qubes-network-uplink.service
[Unit]
Description=Qubes network uplink wait
Before=network.target
[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/lib/qubes/init/network-uplink-wait.sh
[Install]
WantedBy=multi-user.target
The above service launches the uplink service for the found interface (eth0 or enX0) from the How does the
[user@archlinux ~]$ grep -B9 -A16 get_qubes_managed_iface /usr/lib/qubes/init/functions
get_iface_from_mac() {
local mac="$1"
local iface=
if [ "x$mac" != "x" ]; then
iface="$(ip -o link | grep -i "$mac" | awk '{print $2}' | cut -d ':' -f1)"
fi
echo "$iface"
}
get_qubes_managed_iface() {
local mac
local qubes_iface
mac="$(qubesdb-read /qubes-mac 2> /dev/null)"
if [ -z "$mac" ]; then
# no qubes-managed network interface
return
fi
# Load the module explicitly here, to avoid waiting for udev doing that
[ -e /sys/module/xen_netfront ] || modprobe xen-netfront || :
qubes_iface="$(get_iface_from_mac "$mac")"
if [ "x$qubes_iface" != "x" ]; then
echo "$qubes_iface"
elif [ -e /sys/class/net/eth0 ]; then
echo eth0
fi
}
So it finds the net interface from But what was the |
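For comparison with `get_iface_from_mac` above, the same MAC-to-name lookup can be sketched by reading sysfs directly instead of parsing `ip -o link` output. This is a minimal sketch, not the Qubes implementation: the function name `iface_from_mac_sysfs` and its optional second argument (a directory laid out like /sys/class/net) are assumptions made for illustration. It is subject to the same timing problem as the original: it returns whatever name the interface has at the moment of the call.

```shell
#!/bin/bash
# Hypothetical helper (not part of Qubes): print the name of the interface
# whose MAC address matches $1, by reading <dir>/<iface>/address.
# $2 defaults to /sys/class/net and is overridable for testing.
iface_from_mac_sysfs() {
    local mac="${1,,}"                       # compare in lower case (bash 4+)
    local sysfs_net="${2:-/sys/class/net}"
    local dev addr
    [ -n "$mac" ] || return 1
    for dev in "$sysfs_net"/*; do
        [ -r "$dev/address" ] || continue    # skip entries without a MAC file
        addr="$(cat "$dev/address")"
        if [ "${addr,,}" = "$mac" ]; then
            basename "$dev"
            return 0
        fi
    done
    return 1                                 # no interface has this MAC
}
```

Called as `iface_from_mac_sysfs "$(qubesdb-read /qubes-mac)"`, it prints the matching name or returns non-zero when nothing matches, so the caller can tell "not found" apart from an empty name.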
I added traces to understand the net interface choice:
qlog() {
echo "$(date +%s.%N) get_qubes_managed_iface : $*" >> /tmp/f.log || :
}
get_qubes_managed_iface() {
local mac
local qubes_iface
mac="$(qubesdb-read /qubes-mac 2> /dev/null)"
if [ -z "$mac" ]; then
# no qubes-managed network interface
return
fi
qlog "mac from qubesdb is $mac"
# Load the module explicitly here, to avoid waiting for udev doing that
qlog "is xen_netfront loaded ? "
ls -d /sys/module/xen* >> /tmp/f.log
[ -e /sys/module/xen_netfront ] && qlog "xen-netfront already loaded"
[ -e /sys/module/xen_netfront ] || qlog "xen-netfront not yet loaded"
[ -e /sys/module/xen_netfront ] || modprobe xen-netfront || :
qlog "/sys/class/net interfaces"
ls /sys/class/net/ >> /tmp/f.log
qlog "ip link interfaces"
ip -o -br link >> /tmp/f.log
qlog "ip addr interfaces"
ip -4 -br a >> /tmp/f.log
qubes_iface="$(get_iface_from_mac "$mac")"
if [ "x$qubes_iface" != "x" ]; then
qlog "iface name from qubes_iface ($qubes_iface)"
echo "$qubes_iface"
elif [ -e /sys/class/net/enX0 ]; then
qlog "iface name from /sys/class/net/enX0"
echo enX0
elif [ -e /sys/class/net/eth0 ]; then
qlog "iface name from /sys/class/net/eth0"
echo eth0
fi
qlog "exit"
}
The traces are in
Case 1: the xen module is not yet loaded; the function loads it but gets the not-yet-renamed interface (so eth0). (Note: I commented the
Case 2: the xen module is loaded but the net interface is not yet renamed to enX0
Case 3: the xen module is loaded and the net interface is already renamed to enX0
Note: my traces slow the So it's a race problem as @na-- said. I see these ideas for solving the problem:
Do you agree with this analysis? Do you see other ideas for solving the problem? Do you see how to implement these ideas? |
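One conceivable way to sidestep the eth0/enX0 race described above is to keep resolving the interface name until the answer stops changing. The sketch below is a heuristic under stated assumptions, not a fix proposed in this thread: `wait_stable_iface` is a hypothetical name, and the resolver passed to it stands in for something like `get_qubes_managed_iface`. If the rename happens later than the polling window, it still returns the old name.

```shell
#!/bin/bash
# Hypothetical sketch (not the Qubes implementation): call RESOLVER
# repeatedly and print its result once it has returned the same non-empty
# name NEEDED times in a row. Gives up after MAX polls.
# Usage: wait_stable_iface RESOLVER [NEEDED] [DELAY] [MAX]
wait_stable_iface() {
    local resolver="$1" needed="${2:-3}" delay="${3:-0.2}" max="${4:-50}"
    local last="" cur="" same=0 i
    for (( i = 0; i < max; i++ )); do
        cur="$("$resolver")" || cur=""
        if [ -n "$cur" ] && [ "$cur" = "$last" ]; then
            same=$(( same + 1 ))     # same name as the previous poll
        elif [ -n "$cur" ]; then
            same=1                   # a new name: restart the count
        else
            same=0                   # nothing found yet
        fi
        last="$cur"
        if [ "$same" -ge "$needed" ]; then
            echo "$cur"
            return 0
        fi
        sleep "$delay"
    done
    return 1
}
```

With the defaults this waits at most about ten seconds; a resolver that answers eth0 for the first polls and enX0 afterwards yields enX0.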
That is the only choice that makes sense, but what should it be |
I tested
which didn't solve the issue. For I continued to search in the systemd man pages and the qubes commit history. I found the qubes-core-agent-linux dd8de79 commit which set the current uplink services and udev rules but also explains how it works (see: pulled in by udev based on vif device existence). So a new idea: I will test |
The issue is that Here are the
I also tried changing the udev rules to Unless there's a benefit of using
|
necessary for #7342 too |
There are a couple of issues with this renaming:
1. When enabled, the interface name cannot be predicted until it actually happens. This breaks waiting for the device to appear in qubes-network-uplink.service.
2. Setting SYSTEMD_WANTS on a device that gets renamed seems to not work (is the variable bound to the old device name?). This breaks dynamic network attach (see 99-qubes-network.rules).
So, disable it completely for Xen devices, at least for now. This may pose some issues (or rather - rollback fix attempt) for VMs with both physical devices and Xen netfront device(s), but this is an extremely rare case that nobody complained about before.
Fixes QubesOS/qubes-issues#7284
Automated announcement from builder-github The package
|
I've encountered this problem on Qubes R4.2, Standalone based on Debian 12 Minimal.
Running, as suggested above:
Fixed the issue after restart. Can this issue be reopened please? |
Qubes OS release
4.1
but interface enX0 is physically in the down state.
Usually 1-2 VM reboots help.