LAN port disappears after reboot on NanoPi R4S #6342

Closed
idaanx opened this issue Apr 26, 2023 · 23 comments
Labels
External bug 🐞 For bugs which are not caused by DietPi. Kernel related 🧬 NanoPi R4S Solution available 🥂 Definite solution has been done
Milestone

Comments

@idaanx

idaanx commented Apr 26, 2023

Creating a bug report/issue

Required Information

  • DietPi version | 8.16.2 master
  • Distro version | bullseye
  • Kernel version | Linux 5.15.93-rockchip64 #23.02.2 SMP PREEMPT Fri Feb 17 23:48:36 UTC 2023 aarch64 GNU/Linux
  • SBC model | NanoPi R4S (aarch64)
  • Power supply used | 5V 2.4A Anker
  • SD card used | 16GB class 6 (also tested on SanDisk Ultra 128 GB)

Steps to reproduce

  1. ssh root@dietpi
  2. reboot

Expected behaviour

  • NanoPi reboots
  • PWR: red LED on
  • SYS: green LED heartbeat
  • LAN: green LED turns on = eth0 active
  • WAN: green LED turns on = eth1 active

Actual behaviour

  • NanoPi reboots
  • PWR: red LED on
  • SYS: green LED heartbeat
  • LAN: LED stays off = eth0 inactive
  • WAN: LED stays off = eth1 inactive

A power reset gets it back running again.

Extra details

This SBC doesn't have an HDMI output, but I got in through a serial connection after rebooting and ran the ip a command.

Only eth0 is shown but has the MAC address of eth1/WAN.

2: eth0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
    link/ether ##:##:##:##:14:c4 brd ff:ff:ff:ff:ff:ff

This is how it looks on a normal boot.

2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether ##:##:##:##:6a:6c brd ff:ff:ff:ff:ff:ff
    altname enp1s0
    inet 192.168.1.133/24 brd 192.168.1.255 scope global dynamic eth0
       valid_lft 43134sec preferred_lft 43134sec
3: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether ##:##:##:##:14:c4 brd ff:ff:ff:ff:ff:ff

My R4S should have a special chip which has the 2 MAC addresses stored in it; there is also a version without this, see https://wiki.friendlyelec.com/wiki/index.php/NanoPi_R4S#Differences_Between_R4S_Standard_Version_.26_R4S_Enterprise_Version. Not sure if this has something to do with this issue.

A cold boot also does not get an IP when connecting via the WAN port; not sure if that is intended when not using the device as a router. Using the LAN port, rebooting and quickly switching the connection to the WAN port gets it connected again, although this is not very useful.
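
To tell a cold boot from a soft reboot at a glance, the `ip a` output shown above can be reduced to one line per Ethernet interface. A minimal sketch, assuming the kernel's `ethN` naming seen in this report; `list_eth` is a hypothetical helper name, and it only parses text piped in from `ip -o link show`:

```shell
#!/bin/sh
# Print "name MAC" for every ethN interface found in `ip -o link` output on stdin.
# list_eth is a hypothetical helper, not part of DietPi or iproute2.
list_eth() {
    awk '{
        name = $2
        sub(/:$/, "", name)                  # "eth0:" -> "eth0"
        if (name !~ /^eth[0-9]+$/) next      # skip lo, wlan0, ...
        mac = "?"
        for (i = 1; i < NF; i++)
            if ($i == "link/ether") mac = $(i + 1)
        print name, mac
    }'
}
```

Usage: `ip -o link show | list_eth`. On an affected reboot this would be expected to print a single `eth0` line carrying the WAN port's MAC address, matching the report above.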

@MichaIng
Owner

MichaIng commented Apr 26, 2023

I can confirm this. I just booted my NanoPi R4S and it shows:

2: eth0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 7e:80:d1:05:fb:65 brd ff:ff:ff:ff:ff:ff
3: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 80:1f:12:fc:75:48 brd ff:ff:ff:ff:ff:ff

Then I did an APT upgrade:

bash/testing 5.2.15-2+b2 arm64 [upgradable from: 5.2.15-2+b1]
curl/testing 7.88.1-9 arm64 [upgradable from: 7.88.1-8]
libcurl4/testing 7.88.1-9 arm64 [upgradable from: 7.88.1-8]
tzdata/testing 2023c-3 all [upgradable from: 2023c-2]

And a DietPi update, and after reboot, eth1 is gone; actually eth0 is gone and hence eth1 became eth0.

I first thought the change in /etc/udev/rules.d/dietpi-eth-leds.rules (in dev branch) may be responsible, but even removing it completely does not solve it. Also you are on master branch, I assume? Also here downgrading to master does not solve it.

So if the DietPi update is not responsible, only the 4 APT packages are left, but those all look completely unrelated.
EDIT: Downgrading all those packages does not help.

EDIT2: Now testing power cycle: Indeed both Ethernet devices are there again. Comparing kernel errors:

root@NanoPiR4S:~# dmesg -l 0,1,2,3
[    3.155277] rk_gmac-dwmac fe300000.ethernet: cannot get clock clk_mac_speed

After reboot:

root@NanoPiR4S:~# dmesg -l 0,1,2,3
[    2.368851] mmc1: tuning execution failed: -5
[    2.369250] mmc1: error -5 whilst initialising SD card
[    2.502565] mmc1: tuning execution failed: -5
[    2.657043] rockchip-pcie f8000000.pcie: PCIe link training gen1 timeout!
[    3.551691] rk_gmac-dwmac fe300000.ethernet: cannot get clock clk_mac_speed

Interestingly, no Ethernet-related error was added, but it boots differently after a soft reset.
Okay, so the DietPi and package upgrades are unrelated; this seems to be an issue with the bootloader not resetting the SBC properly on reboot. Interesting that this has never been observed before.
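
Since kernel timestamps differ on every boot, the two `dmesg` listings above are easier to compare once the timestamps are stripped. A minimal sketch, where `strip_ts` is a hypothetical helper and the file paths in the usage note are assumptions:

```shell
#!/bin/sh
# Remove the leading "[    2.368851] " timestamp from dmesg lines read on stdin,
# so that logs from two different boots can be compared with diff.
strip_ts() {
    sed 's/^\[ *[0-9]*\.[0-9]*] //'
}
```

Usage: after a power cycle, run `dmesg -l 0,1,2,3 | strip_ts | sort > /root/cold.txt`; after a soft reboot, write the same output to `/root/warm.txt`; then `diff /root/cold.txt /root/warm.txt` shows only the messages that differ between the two boots.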

@MichaIng MichaIng added Kernel related 🧬 External bug 🐞 For bugs which are not caused by DietPi. labels Apr 26, 2023
@MichaIng
Owner

Okay a kernel issue: Upgrading to "edge" solves it:

apt install linux-{image,dtb}-edge-rockchip64

Should become "current" soon.

@MichaIng MichaIng added the Workaround available 🆗 Workaround is available/has been implemented, but a definite solution should be found when possible. label Apr 26, 2023
@idaanx
Author

idaanx commented Apr 26, 2023

apt install linux-{image,dtb}-edge-rockchip64

This did not fix it for me, still the same. This was on a fresh DietPi install, and here's my output for the kernel update:

root@DietPi:~# apt install linux-{image,dtb}-edge-rockchip64
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following NEW packages will be installed:
  linux-dtb-edge-rockchip64 linux-image-edge-rockchip64
0 upgraded, 2 newly installed, 0 to remove and 0 not upgraded.
Need to get 53.4 MB of archives.
After this operation, 97.2 MB of additional disk space will be used.
Get:1 https://xogium.performanceservers.nl/apt bullseye/main arm64 linux-dtb-edge-rockchip64 arm64 23.02.2 [305 kB]
Get:2 https://armbian.hosthatch.com/apt bullseye/main arm64 linux-image-edge-rockchip64 arm64 23.02.2 [53.1 MB]
Fetched 53.4 MB in 2s (26.4 MB/s)
debconf: delaying package configuration, since apt-utils is not installed
Selecting previously unselected package linux-dtb-edge-rockchip64.
(Reading database ... 17730 files and directories currently installed.)
Preparing to unpack .../linux-dtb-edge-rockchip64_23.02.2_arm64.deb ...
Unpacking linux-dtb-edge-rockchip64 (23.02.2) ...
Selecting previously unselected package linux-image-edge-rockchip64.
Preparing to unpack .../linux-image-edge-rockchip64_23.02.2_arm64.deb ...
Unpacking linux-image-edge-rockchip64 (23.02.2) ...
Setting up linux-dtb-edge-rockchip64 (23.02.2) ...
Setting up linux-image-edge-rockchip64 (23.02.2) ...
Removing obsolete initramfs images
update-initramfs: Generating /boot/initrd.img-6.1.11-rockchip64
W: Possible missing firmware /lib/firmware/rtl_nic/rtl8125a-3.fw for module r8169
W: Possible missing firmware /lib/firmware/rtl_nic/rtl8107e-2.fw for module r8169
W: Possible missing firmware /lib/firmware/rtl_nic/rtl8168fp-3.fw for module r8169
W: Possible missing firmware /lib/firmware/rtl_nic/rtl8168g-3.fw for module r8169
W: Possible missing firmware /lib/firmware/rtl_nic/rtl8168g-2.fw for module r8169
W: Possible missing firmware /lib/firmware/rtl_nic/rtl8106e-2.fw for module r8169
W: Possible missing firmware /lib/firmware/rtl_nic/rtl8106e-1.fw for module r8169
W: Possible missing firmware /lib/firmware/rtl_nic/rtl8411-2.fw for module r8169
W: Possible missing firmware /lib/firmware/rtl_nic/rtl8411-1.fw for module r8169
W: Possible missing firmware /lib/firmware/rtl_nic/rtl8402-1.fw for module r8169
W: Possible missing firmware /lib/firmware/rtl_nic/rtl8168f-2.fw for module r8169
W: Possible missing firmware /lib/firmware/rtl_nic/rtl8168f-1.fw for module r8169
W: Possible missing firmware /lib/firmware/rtl_nic/rtl8105e-1.fw for module r8169
W: Possible missing firmware /lib/firmware/rtl_nic/rtl8168e-3.fw for module r8169
W: Possible missing firmware /lib/firmware/rtl_nic/rtl8168e-2.fw for module r8169
W: Possible missing firmware /lib/firmware/rtl_nic/rtl8168e-1.fw for module r8169
W: Possible missing firmware /lib/firmware/rtl_nic/rtl8168d-2.fw for module r8169
W: Possible missing firmware /lib/firmware/rtl_nic/rtl8168d-1.fw for module r8169
update-initramfs: Converting to U-Boot format
Image Name:   uInitrd
Created:      Wed Apr 26 23:02:13 2023
Image Type:   AArch64 Linux RAMDisk Image (gzip compressed)
Data Size:    10677838 Bytes = 10427.58 KiB = 10.18 MiB
Load Address: 00000000
Entry Point:  00000000
'/boot/uInitrd' -> 'uInitrd-6.1.11-rockchip64'
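
The `W: Possible missing firmware` lines above name files the r8169 module could request. Whether any of them matters for the R4S NIC is not established in this thread, but the list can be extracted and checked against the filesystem. A minimal sketch; `missing_fw` is a hypothetical helper that only parses warning text from stdin:

```shell
#!/bin/sh
# Extract the firmware paths from update-initramfs "Possible missing firmware"
# warnings read on stdin, one path per line.
missing_fw() {
    sed -n 's/^W: Possible missing firmware \(.*\) for module .*$/\1/p'
}
```

Usage: `update-initramfs -u 2>&1 | missing_fw | while read -r f; do [ -e "$f" ] || echo "absent: $f"; done` lists the files that are still absent, for example after installing a firmware package.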

@MichaIng
Owner

MichaIng commented Apr 26, 2023

Strange, it worked for me for quite a bunch of reboots. I turned off the device for a while, and now that it is back on, the same issue indeed occurs with the edge kernel. I also see the same kernel errors again. No idea how it is possible that it went well for the whole session after the kernel upgrade (the kernel errors were also gone after reboot).

EDIT: After playing a little around, it now works fine again, both Ethernet devices there after (soft) reboot. As if the temperature or something like that had an influence. CPU temperature was now at 33 °C.

@MichaIng MichaIng removed the Workaround available 🆗 Workaround is available/has been implemented, but a definite solution should be found when possible. label Apr 26, 2023
@MichaIng MichaIng modified the milestones: v8.17, v8.18 May 6, 2023
@MichaIng MichaIng modified the milestones: v8.18, v8.19 Jun 3, 2023
@MichaIng MichaIng modified the milestones: v8.19, v8.20 Jul 2, 2023
@3735943886

It appears that the issue is related to an Armbian kernel bug, which was reported over a year ago but remains unresolved. For reference, you can find the details of the problem at https://armbian.atlassian.net/browse/AR-1176

@MichaIng MichaIng modified the milestones: v8.20, v8.21 Jul 29, 2023
@MichaIng MichaIng modified the milestones: v8.21, v8.22 Aug 26, 2023
@MichaIng
Owner

The dietpi-update will apply a major kernel upgrade for this SBC. But given the open Armbian issue, it likely won't fix this issue. Does the suggested solution work?

apt install armbian-firmware-full

It installs additional firmware, and it seems one of them is required for one of the R4S onboard NICs.

@idaanx
Author

idaanx commented Aug 29, 2023

The v8.21.1 update did NOT fix the main issue of this thread, nor did installing the full firmware. But I've encountered some other issues related to the update.

The first issue is probably somehow related. As mentioned before my model has a chip with fixed MAC addresses for the 2 NICs, but since the update to v8.21.1 I've got a new random MAC for eth0 (eth1 has the correct MAC) after each power cycle.

The other issue is related to the 4 LEDs on the device: the settings for LED control only show 3 LEDs (LAN, WAN and Power), and they are set to none by default. This results in a solid red light for Power, a solid green light for SYS, and LAN/WAN always off. On previous versions this was a solid red light for Power, heartbeat for SYS, and network throughput for LAN/WAN.

I have encountered this on my main installation AND on a fresh SD card with v8.16.2, which updated itself on the first login to v8.21.1. The first boot on v8.16.2 did NOT have the issue with the LEDs nor the random MAC address.

@MichaIng
Owner

Hmm, I did not notice the broken LAN/WAN LEDs. Will have a look into it. Probably the nodes changed their names in /sys/class/leds.

There is the ethaddr and/or eth1addr kernel command-line parameter you could add via /boot/dietpiEnv.txt, but not sure if those work, as they seem to be no official Linux parameters.

There are also ways to set it in userland via /etc/network/interfaces or ip command. See here: #6565
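
For the `ip` command route mentioned above, a minimal sketch that validates a MAC address and prints the commands instead of running them. `mac_cmds` is a hypothetical helper and the address in the usage note is a placeholder; many drivers only accept a new address while the link is down, which is why the down/up steps are included here as an assumption:

```shell
#!/bin/sh
# Print (do not run) the commands that would set a MAC address on an interface.
# Usage: mac_cmds IFACE MAC
mac_cmds() {
    iface=$1
    mac=$2
    # Accept only six colon-separated hex octets.
    if ! printf '%s\n' "$mac" | grep -Eq '^([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$'; then
        echo "invalid MAC: $mac" >&2
        return 1
    fi
    echo "ip link set dev $iface down"
    echo "ip link set dev $iface address $mac"
    echo "ip link set dev $iface up"
}
```

Usage: `mac_cmds eth0 02:11:22:33:44:55` prints the three commands; piping the output to `sh` applies them. The change lasts only until the next reboot, and taking the interface down cuts any SSH session running over it, so a serial console is the safer place to try this.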

MichaIng added a commit that referenced this issue Aug 30, 2023
- NanoPi R4S | Resolved a v8.20 regression where the Ethernet LEDs did not react correctly after the kernel upgrade. Many thanks to @idaanx for reporting this issue: #6342 (comment)
@MichaIng
Owner

Okay, the LED node names indeed have changed. I fixed the WAN and LAN LEDs: bdde5bd

The 3rd one is falsely labelled. It is named as if it was the red PWR LED, but it actually toggles the green SYS LED, while the PWR LED has no node anymore. However, you can set the SYS LED back to heartbeat or activity or so. Not sure if we should override the kernel default?

When I find time I'll see if I find a device tree patch to fix the SYS LED label and re-add a PWR LED node (if it was ever there?).


About the Ethernet port disappearing: It still works fine here with a warmed-up device. I'll power it off again for a while and see if I can reproduce it again after it has cooled down, like previously.

What I forgot about the firmware fix, you likely need to rebuild the initramfs as well for it to take effect:

update-initramfs -u
reboot

@MichaIng
Owner

MichaIng commented Aug 30, 2023

Okay, installing the full firmware and updating the initramfs indeed does not solve it. I guess Igor just fell into the same trap: it works well after the SBC has heated up a little. Kernel errors have changed a little with the new kernel. Aside from some unrelated thermal zone errors:

[    2.264993] mmc1: tuning execution failed: -5
[    2.265437] mmc1: error -5 whilst initialising SD card
[    2.399187] mmc1: tuning execution failed: -5
[    2.549841] rockchip-pcie f8000000.pcie: PCIe link training gen1 timeout!

The Ethernet error has gone. One reboot earlier I additionally got:

[    1.762179] phy phy-ff7c0000.phy.8: phy poweron failed --> -110
[    1.762847] dwc3 fe800000.usb: error -ETIMEDOUT: failed to initialize core

After doing a benchmark and a minute stress test, oh and with overclocking overlay (which does now work!) to 2 GHz, it now works, with a steady 33 °C CPU temperature. It takes a while until the case has heated up as well to keep this steady. And again all above error messages are gone as well. What a strange hardware 😄.

@idaanx
Can you actually replicate that heating up the device solves the issue? I can imagine, however, that the temperature actually needed is very board-specific. Probably some resistor involved in warm reboot timings unintentionally reacts a little too much to temperature.
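
To test the temperature hypothesis over many reboots, the CPU temperature and the number of Ethernet devices could be logged once per boot. A minimal sketch; `millic_to_c` and `log_boot_state` are hypothetical helpers, and the thermal zone index and the log path are assumptions that may differ on a given image:

```shell
#!/bin/sh
# Convert a sysfs temperature reading (millidegrees Celsius) to degrees.
millic_to_c() {
    awk -v t="$1" 'BEGIN { printf "%.1f\n", t / 1000 }'
}

# Append one line per call: timestamp, CPU temperature, number of ethN devices.
# Optional argument: log file path (default is an assumed location).
log_boot_state() {
    temp=$(millic_to_c "$(cat /sys/class/thermal/thermal_zone0/temp)")
    eths=$(ls /sys/class/net | grep -c '^eth[0-9]' || true)
    echo "$(date '+%F %T') temp=${temp}C eth_count=${eths}" >> "${1:-/var/log/r4s-boot-state.log}"
}
```

Calling `log_boot_state` early after each boot and then comparing lines with `eth_count=1` against lines with `eth_count=2` would show whether the failures cluster at lower temperatures.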

@MichaIng MichaIng modified the milestones: v8.22, v8.23 Sep 23, 2023
@MichaIng MichaIng modified the milestones: v8.23, v8.24 Oct 21, 2023
@MichaIng MichaIng modified the milestones: v8.24, v8.25 Nov 19, 2023
@MichaIng MichaIng modified the milestones: v8.25, v9.0 Dec 20, 2023
@idaanx
Author

idaanx commented Jan 19, 2024

I haven't used the Nanopi R4S for a while because of the issues. The last few weeks I've tried again to see if things have changed, using various distros / kernels.

Dietpi 8.25.1 (kernel 6.1.63) still the same issues

  • LAN interface disappearing on reboot and new random mac address (on cold boots).
  • WAN interface not working / no connection.

Armbian 32.11.2 (kernel 6.1.68 or 6.6.2)

  • LAN interface works after reboot, different mac address but the same every time.
  • WAN works and also has the correct mac address.

Armbian 32.11.2 (kernel 5.15.93)

  • LAN interface works after reboot.
  • both interfaces have the correct mac address.

FriendlyARM Debian Bullseye Core (kernel 6.1.53)

  • LAN interface works after reboot.
  • both interfaces have the correct mac address but are flipped.

It works in Armbian, so shouldn't it work in DietPi too? Maybe it's not a kernel issue but the assignment of the MAC addresses that goes wrong. Like I said before, my model has a chip with hardcoded MAC addresses (see standard vs. enterprise version). Which one do you have to test?

Hope this helps to solve this issue; if you need me to test something, let me know.

@MichaIng
Owner

Probably it has been fixed recently between Linux 6.1.63 and 6.1.68. Armbian does not upload new kernel releases to the APT repo quickly.

I'm currently building the latest kernel package: https://github.com/MichaIng/DietPi/actions/runs/7586686466
They can be found here, once done, would be great if you could test it: https://dietpi.com/downloads/binaries/testing/
I'll also test it later.

As for the MAC address: this might be unrelated. One thing to test: update the bootloader via the dietpi-config advanced options. And did you try to add ethaddr=xx:xx:xx:xx:xx:xx to /boot/dietpiEnv.txt? However, if it is not in /boot/armbianEnv.txt, there might be another reason. It is also possible that NetworkManager sets/overrides it. However, let's first see whether the new kernel solves it already.

@MichaIng MichaIng modified the milestones: v9.0, v9.1 Jan 20, 2024
@idaanx
Author

idaanx commented Jan 22, 2024

The new kernel (6.6.12) has fixed the issue of eth0 disappearing. The MAC address is still randomised, and adding it to the dietpiEnv.txt file didn't work. What worked was adding it to the /etc/network/interfaces file; not sure if this will survive an update, or how to add it to /etc/network/interfaces.d. Maybe an option could be added to dietpi-config?

Also the WAN/eth1 still does not work, how can I get this working?

@MichaIng
Owner

Oh, great to hear. I wasn't aware that the "current" kernel is Linux 6.6 already. Probably we should install this kernel on all R4S systems with next DietPi version. Just tested it here. Sadly the RTL8811CU driver changed and seems to have issues now. It still works, but SSH connection seems choppy:

[    9.169459] rtw_8821cu 7-1:1.0: failed to download firmware
[    9.174334] rtw_8821cu 7-1:1.0: leave idle state failed
[    9.192846] rtw_8821cu 7-1:1.0: failed to leave ips state
[    9.193388] rtw_8821cu 7-1:1.0: failed to leave idle state
[   19.167717] rtw_8821cu 7-1:1.0: failed to download firmware
[   19.177081] rtw_8821cu 7-1:1.0: leave idle state failed
[   19.189835] rtw_8821cu 7-1:1.0: failed to send h2c command
[   19.223586] rtw_8821cu 7-1:1.0: failed to leave ips state
[   19.224191] rtw_8821cu 7-1:1.0: failed to leave idle state
[   29.607875] rtw_8821cu 7-1:1.0: failed to download firmware
[   29.617212] rtw_8821cu 7-1:1.0: leave idle state failed
[   29.629962] rtw_8821cu 7-1:1.0: failed to send h2c command
[   29.666301] rtw_8821cu 7-1:1.0: failed to leave ips state
[   29.666865] rtw_8821cu 7-1:1.0: failed to leave idle state
[   39.598219] rtw_8821cu 7-1:1.0: failed to download firmware
[   39.607580] rtw_8821cu 7-1:1.0: leave idle state failed
[   39.620462] rtw_8821cu 7-1:1.0: failed to send h2c command
[   39.651499] rtw_8821cu 7-1:1.0: failed to leave ips state
[   39.652125] rtw_8821cu 7-1:1.0: failed to leave idle state

I just built the new firmware package as well: https://dietpi.com/downloads/binaries/testing/
And yes, this solved it 👍.

To enable the 2nd Ethernet interface, e.g.:

cat << '_EOF_' > /etc/network/interfaces.d/eth1.conf
allow-hotplug eth1
iface eth1 inet static
address 192.168.1.2/24
_EOF_
ifup eth1

Just an example with a static IP assigned and no gateway. This actually suits the LAN-side interface better, while for the WAN interface, one usually wants DHCP?

ifdown eth1
cat << '_EOF_' > /etc/network/interfaces.d/eth1.conf
allow-hotplug eth1
iface eth1 inet dhcp
_EOF_
ifup eth1

Something like that. dietpi-config only configures the eth0 interface.

@idaanx
Author

idaanx commented Jan 22, 2024

It's been a few weeks (just before New Year's) since I tested Armbian, and at the time the default kernel was still 6.1.x. The 6.6.x kernel was a manual install, but since they are both LTS it doesn't really matter as long as it works.

Haven't noticed any choppiness with the ssh connection during testing on my end.

Got eth1 working and added another for eth0 to not override the mac address through dietpi-config.

allow-hotplug eth0
iface eth0 inet dhcp
hwaddress ether xx:xx:xx:xx:xx:xx

I have my DHCP handing out the static IPs so no need for it in Dietpi.

@MichaIng
Owner

Haven't noticed any choppiness with the ssh connection during testing on my end.

It was only the firmware missing for the new driver for my particular WiFi chip. Solved with the new firmware package.

and added another for eth0

In case /etc/network/interfaces contains eth0 as well, ifup might not like having two definitions, I think. However, this is indeed the right way, so in that case you might need to remove the Ethernet block from /etc/network/interfaces. A dietpi-config setting to change/set the MAC address indeed makes sense.
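
Whether an interface ended up with more than one definition can be checked before running ifup. A minimal sketch that counts `iface` stanzas per interface name across the given files; `dup_ifaces` is a hypothetical helper, and how ifup actually treats duplicates is left open here, as it is in the comment above:

```shell
#!/bin/sh
# Print interface names that have more than one "iface" stanza across the
# files given as arguments. Commented-out lines are ignored, since their
# first field is not exactly "iface".
dup_ifaces() {
    awk '$1 == "iface" { n[$2]++ } END { for (i in n) if (n[i] > 1) print i }' "$@" | sort
}
```

Usage: `dup_ifaces /etc/network/interfaces /etc/network/interfaces.d/*`. Empty output means every interface is defined at most once.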

@MichaIng MichaIng added Solution available 🥂 Definite solution has been done and removed Priority 🔆 Investigating 🤔 labels Jan 22, 2024
MichaIng added a commit that referenced this issue Feb 20, 2024
- NanoPi R4S | Resolved an issue where Ethernet adapter of the "LAN" port could disappear after a soft reboot. Many thanks to @idaanx for reporting this issue: #6342
@MichaIng MichaIng modified the milestones: v9.1, v9.2 Feb 20, 2024
@MichaIng
Owner

New R4S images are shipped with the new kernel, and next DietPi update will install it as well: 149a797


3 participants