DNS gets corrupted after a short time #8221

Open
1 task done
gskjold opened this issue May 17, 2023 · 43 comments
Labels
Area: BT&Wifi BT & Wifi related issues Status: Awaiting triage Issue is waiting for triage Type: Bug 🐛 All bugs

Comments

@gskjold

gskjold commented May 17, 2023

Board

Custom board with ESP32-S2 mini

Device Description

Custom board (https://amsleser.no) for reading smart meters

Hardware Configuration

GPIO10 = Voltage divider
GPIO13 = LED
GPIO14 = LED
GPIO16 = Serial RX

The following GPIO have been grounded to improve GND routing:
11, 12, 21, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 45, 46

Version

v2.0.9

IDE Name

PlatformIO

Operating System

Linux (Ubuntu)

Flash frequency

40MHz

PSRAM enabled

yes

Upload speed

460800

Description

It connects to the WiFi and reports the correct DNS server for a little while. In my case the DNS address changes from 192.168.28.1 to 32.1.70.66 within 10 minutes of starting the code, but from what I hear from users of the project where this was brought to my attention (https://github.com/UtilitechAS/amsreader-firmware), the time varies.

This could be related to network configuration, but I am out of ideas on how to debug this.

Sketch

https://github.com/gskjold/esp32-wifi-problem


#include <WiFi.h>
#define SSID "SSID"
#define PSK "PSK"
unsigned long lastSysout = 0;

void setup() {
	Serial.begin(115200);
	Serial.setDebugOutput(true);
	WiFi.mode(WIFI_STA);
	WiFi.begin(SSID, PSK);
}
void loop() {
	unsigned long now = millis();
	if(now-lastSysout > 30000) {
		IPAddress dns1 = WiFi.dnsIP(0);
		Serial.printf("DNS: %s\n", dns1.toString().c_str());
		lastSysout = now;
	}
	yield();
}


Debug Message

ESP-ROM:esp32s2-rc4-20191025
Build:Oct 25 2019
rst:0x1 (POWERON),boot:0x8 (SPI_FAST_FLASH_BOOT)
SPIWP:0xee
mode:DIO, clock div:1
load:0x3ffe6100,len:0x11c
load:0x4004c000,len:0xa28
load:0x40050000,len:0x2740
entry 0x4004c174
[   519][D][WiFiGeneric.cpp:1035] _eventCallback(): Arduino Event: 0 - WIFI_READY
[   558][V][WiFiGeneric.cpp:340] _arduino_event_cb(): STA Started
[   559][D][WiFiGeneric.cpp:1035] _eventCallback(): Arduino Event: 2 - STA_START
[   561][V][WiFiGeneric.cpp:97] set_esp_interface_ip(): Configuring Station static IP: 0.0.0.0, MASK: 0.0.0.0, GW: 0.0.0.0
[   634][V][WiFiGeneric.cpp:355] _arduino_event_cb(): STA Connected: SSID: no23-mob, BSSID: 82:ac:b9:61:3f:7c, Channel: 1, Auth: WPA2_PSK
[   635][D][WiFiGeneric.cpp:1035] _eventCallback(): Arduino Event: 4 - STA_CONNECTED
[   658][V][WiFiGeneric.cpp:369] _arduino_event_cb(): STA Got New IP:192.168.28.100
[   659][D][WiFiGeneric.cpp:1035] _eventCallback(): Arduino Event: 7 - STA_GOT_IP
[   662][D][WiFiGeneric.cpp:1098] _eventCallback(): STA IP: 192.168.28.100, MASK: 255.255.255.0, GW: 192.168.28.1
DNS: 192.168.28.1
DNS: 192.168.28.1
DNS: 192.168.28.1
DNS: 192.168.28.1
DNS: 192.168.28.1
DNS: 192.168.28.1
DNS: 192.168.28.1
DNS: 32.1.70.66
DNS: 32.1.70.66
DNS: 32.1.70.66
DNS: 32.1.70.66
DNS: 32.1.70.66
DNS: 32.1.70.66
DNS: 32.1.70.66

Other Steps to Reproduce

Insert your SSID and PSK in the code, boot, and wait at least 10 minutes (maybe more).

I have checked existing issues, online documentation and the Troubleshooting Guide

  • I confirm I have checked existing issues, online documentation and Troubleshooting guide.
@bertmelis
Contributor

dnsIP != localIP != gatewayIP

@gskjold
Author

gskjold commented May 18, 2023

Could you elaborate on this? The DNS server returned by DHCP is the same as the gateway, but the device IP is .100.

@SuGlider
Collaborator

I think it has to do with the DHCP server in the WiFi router.
Maybe the DHCP lease renewal process for a very short lease period causes the issue...

Is it possible that the DHCP lease expires and some other AP/router responds to the ESP32-S2 STA with new DNS information?

@gskjold
Author

gskjold commented May 18, 2023

No other dhcpv4 exists on this network, but it does have v6. Could that be it? The lease is 1hr though. I have three customers reporting the same issue, so it is not a problem specific to my network.

@SuGlider
Collaborator

No other dhcpv4 exists on this network, but it does have v6. Could that be it?

Not sure... you can try it by turning off the v6... just to make sure it is not the cause.


I only have a dhcpv4 here.
I have tested it for about 30 minutes and DNS IP has not changed.
I'll leave it working here for a couple hours and check if it changes.

I'll let you know.

@gskjold
Author

gskjold commented May 18, 2023

I have two VLANs on my router, one for IoT and one for everything else. The IoT network has its own subnet and WiFi. If I switch to the IoT WiFi, the problem does not appear (just double checked). But if I put it on the "normal" WiFi, it happens within 10 minutes. The only difference as far as I can tell is the IPv6 provided by radvd on my pfSense router. There is also some difference in DTIM on the WiFi, but I don't think that is the problem.

I just changed the router config to turn off v6, and the DNS on the ESP instantly changed to 0.0.0.0. I reset the device and left it running for 15 minutes. Still no change in DNS, so it looks like we are on the right track. I have to leave for work, but can leave the device running through the day.

I know that at least one of my customers has the same ISP as me, which means he also has native IPv6 on his network. He uses the ISP's standard router, which means we have different brands of router and AP.

If there is anything I can do to increase debug logging from the ESP to see what it receives, please let me know.

@SuGlider
Collaborator

I only have a dhcpv4 here. I have tested it for about 30 minutes and DNS IP has not changed. I'll leave it working here for a couple hours and check if it changes.

I'll let you know.

I have tested it for almost 8 hrs. It keeps the ESP32 DNS IP always the same.
I've a single AP WiFi IPv4 router here.
ESP32 DNS IP address is assigned by the DHCP Server in the AP (WiFi Router).

If I change it in the WiFi router, it changes in the ESP32 after DHCP leasing period expiration (renewed in the re-leasing).

@SuGlider
Collaborator

If there is anything I can do to increase debug logging from the ESP to see what it receives, please let me know.

One way to increase the debug logging is to enable verbose logging at the IDF level.
ESP32 Arduino ships all IDF libraries precompiled with ERROR-level logging only.
To change that, it is necessary either to rebuild all IDF libraries or to use Arduino as an IDF component and rebuild the whole project (including the IDF and Arduino source code) into a new final binary.

I think that using Arduino as IDF component may be the easiest way to go.
IDF lets you change anything using make menuconfig option.

@SuGlider SuGlider added Area: BT&Wifi BT & Wifi related issues Type: Question Only question and removed Status: Awaiting triage Issue is waiting for triage labels May 18, 2023
@SuGlider SuGlider self-assigned this May 18, 2023
@gskjold
Author

gskjold commented May 18, 2023

I think that using Arduino as IDF component may be the easiest way to go.
IDF lets you change anything using make menuconfig option.

Thanks, I'll try this when I get home. I haven't used IDF that much, but I think I'll find my way around. I'll report back what I find

@TD-er
Contributor

TD-er commented May 18, 2023

I have also seen this happening lately, where the DNS suddenly becomes garbage.
My router only supplies DNS1 in the DHCP, so the other one should be 'unset'.
However at some point the unset one is some random value, causing all kinds of issues due to timeouts.

Not sure how long this issue has been present, but I have only fairly recently become aware of these values getting corrupted.

@SuGlider
Collaborator

SuGlider commented May 18, 2023

This is very interesting, @TD-er.
I'll run some tests on both ESP32 DNS IP addresses and try to reproduce the issue.

There may be some problem in the LwIP layer when a new DHCP lease request happens...

@TD-er
Contributor

TD-er commented May 18, 2023

Maybe it is something as simple as a new IP-address being generated from some uninitialized uint32_t, or trying to parse a DHCP reply which is expected to be longer.

@SuGlider
Collaborator

SuGlider commented May 18, 2023

The Arduino layer doesn't deal with DHCP client transactions (STA mode). Those are handled only at the IDF level, through LwIP.
Arduino controls the IDF DHCP server application when in AP mode.

@SuGlider
Collaborator

My router only supplies DNS1 in the DHCP, so the other one should be 'unset'.

I can't leave the DNS2 IP address unset or as 0.0.0.0 in my router. I must set both DNS addresses to valid IPs.
I set the lease time to just 1 minute and changed the DNS IP addresses in the router configuration, and I can see the change reflected in the ESP32 every minute. Therefore, IDF LwIP does its job correctly. I see both DNS addresses in the ESP32 serial monitor changing to the values I set in the WiFi router.

I'm starting to think that this issue is related to how the WiFi router or the AP works... maybe not related to the ESP32.

@TD-er
Contributor

TD-er commented May 18, 2023

I'm using a Fritzbox router here.

Can you also try to toggle WiFi mode to off and STA mode again?
When switching WiFi mode, the DNS entries get cleared.
This is a separate issue and particularly annoying when switching from WiFi to LAN for example. But maybe it can help to reproduce the issue here.
My DHCP gets locally updated every 10 minutes if I'm not mistaken. So maybe updating it every minute may make it harder to reproduce?

@SuGlider
Collaborator

@TD-er - I have tested it by toggling OFF <--> STA every minute and displaying the DNS addresses.
It works fine. In OFF mode, the ESP32 reports DNS 0.0.0.0 for both addresses. When connected in STA mode, it displays both DNS addresses assigned by the DHCP server (my WiFi router).
If I change any DNS server IP in the WiFi router, it reflects fine into the ESP32...

No issue here.

@SuGlider
Collaborator

SuGlider commented May 18, 2023

@TD-er @gskjold - I may have found a workaround that may work for your projects.

It is possible, with Arduino 2.0.9, to set a static DNS IP address that will never change during sketch execution and will not be subject to the DHCP server of the LAN.
To do so, wait for the WiFi connection using DHCP (for the local IP, GW and mask + DNS) and then change the DNS IP to whatever is necessary. It won't change after the end of the lease period.

Example:

#include <WiFi.h>

#define SSID "SSID"  // placeholder: your network name
#define PSW "PSK"    // placeholder: your network password

void setup() {
  WiFi.begin(SSID, PSW);
  if (WiFi.waitForConnectResult() == WL_CONNECTED) {  // default timeout is 60,000ms = 1 min
    // set the static DNS IP addresses
    IPAddress DNS1 = WiFi.gatewayIP();       // it is common to use the Gateway IP as DNS Server IP Address
    IPAddress DNS2 = IPAddress(8, 8, 8, 8);  // Google DNS server
    WiFi.config(WiFi.localIP(), WiFi.gatewayIP(), WiFi.subnetMask(), DNS1, DNS2);
  } else {
    log_e("WiFi Connection Error!");
  }
}

void loop() {}

@gskjold
Author

gskjold commented May 18, 2023

When I came home from work this afternoon I messed up my test and restarted the device... But I have left it running since, and it has kept the correct IPv4 DNS for the last three hours.

Just re-enabled IPv6 and RADVD in my router, and the DNS on the ESP got corrupted immediately after. It seems to me like there is something in the IPv6 router advertisement that the ESP picks up and uses to overwrite the IPv4 DNS.

Regarding the workaround, forcing a fixed DNS on my customers is not my favorite pick I'm afraid, but thanks for the tip.

@TD-er
Contributor

TD-er commented May 18, 2023

I forgot to mention I also have IPv6 here in my network (full dual stack IPv6 & IPv4)

@gskjold
Author

gskjold commented May 18, 2023

OK, so I have some new information. Because I have used platform = https://github.com/tasmota/platform-espressif32/releases/download/2023.05.00/platform-espressif32.zip in some of my projects, this became the default for platform = espressif32 in all projects, so this is what I have been running all this time. Tricked by PlatformIO sorcery...

Combined with my current radvd.conf, the problem can be reproduced, but I guess the Tasmota builds are not your problem.

interface igb1 {
	AdvSendAdvert on;
	MinRtrAdvInterval 5;
	MaxRtrAdvInterval 30;
	AdvDefaultLifetime 90;
	AdvLinkMTU 1500;
	AdvDefaultPreference medium;
	prefix 2001:4642:518a::/64 {
		DeprecatePrefix on;
		AdvOnLink on;
		AdvAutonomous on;
		AdvValidLifetime 86400;
		AdvPreferredLifetime 14400;
	};
	route ::/0 {
		AdvRoutePreference medium;
		RemoveRoute on;
	};
	RDNSS 2001:4642:518a:0:20d:b9ff:fe49:cefd {
		AdvRDNSSLifetime 90;
	};
	DNSSL home.no23.cc  {
		AdvDNSSLLifetime 90;
	};
};

Having a low MinRtrAdvInterval and MaxRtrAdvInterval makes the problem occur faster

@SuGlider
Collaborator

@Jason2866 is the maintainer of the Tasmota project. He may be able to help with it.
@Jason2866 - Please check the comment right above this one.

@gskjold
Author

gskjold commented May 18, 2023

Also note that 32.1.70.66 == 2001:4642
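
To illustrate the observation above: 0x20 0x01 0x46 0x42 are the first four bytes of the RDNSS address from the radvd.conf (2001:4642:518a:0:20d:b9ff:fe49:cefd), and written as decimal octets they read 32.1.70.66. Below is a minimal C++ sketch of that arithmetic, assuming the IPv6 address is simply read back as if it were an IPv4 address. The names `firstFourBytesAsIPv4` and `kRdnss` are made up for this illustration and are not part of lwip or Arduino.

```cpp
#include <array>
#include <cstdint>
#include <string>

// Hypothetical helper: formats the first four bytes of a 16-byte IPv6
// address as a dotted-decimal IPv4 string.
std::string firstFourBytesAsIPv4(const std::array<std::uint8_t, 16>& ipv6) {
    std::string out;
    for (int i = 0; i < 4; ++i) {
        if (i > 0) out += '.';
        out += std::to_string(ipv6[i]);
    }
    return out;
}

// RDNSS address from the radvd.conf above, in network byte order:
// 2001:4642:518a:0:20d:b9ff:fe49:cefd
const std::array<std::uint8_t, 16> kRdnss = {
    0x20, 0x01, 0x46, 0x42, 0x51, 0x8a, 0x00, 0x00,
    0x02, 0x0d, 0xb9, 0xff, 0xfe, 0x49, 0xce, 0xfd};
```

Calling `firstFourBytesAsIPv4(kRdnss)` yields the string "32.1.70.66", the same value that shows up in the serial log. This only demonstrates the byte correspondence; it does not pinpoint where in the stack the mix-up happens.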

@Jason2866
Collaborator

Jason2866 commented May 18, 2023

Pin the platform version, as described in the readme here https://github.com/platformio/platform-espressif32
Just using platform = espressif32 means that ANY installed version may be used.
If the last installed was the Tasmota version, it will be used for every project without a pinned version.

@Jason2866
Collaborator

Jason2866 commented May 18, 2023

Btw. we do not have issues with DNS in project Tasmota. We have some DNS handling in Tasmota code...

@Jason2866
Collaborator

I was unclear. The DNS part is not working correctly. We have added a workaround in Tasmota code to solve the issue.

@gskjold
Author

gskjold commented May 19, 2023

Does that mean there will be a new version of tasmota/platform-espressif32 that solves this issue? I'm currently on 2023-05-00, but have also tested 2023-05-01, which has the same problem.

@Jason2866
Collaborator

Jason2866 commented May 19, 2023

No, we will not dive into the very deep rabbit hole of the lwip source code. We are happy that we found the cause of the weird issues we had when using Ethernet and WiFi together (DHCP timeout, Ethernet got no IP). They are caused by the current lwip code used in IDF 4.4.4, and were solved by using an older lwip commit (without knowing which part of the code causes the issue). The DNS issue is in all lwip versions.
Tasmota Platform version 2023-05-01 updates only these IDF components: LittleFS, Cam and DSP.
No code change in Arduino.

The lwip Ethernet / WiFi "get no IP" issue depends on the LAN and WiFi hardware used.
I can't provide a software-only setup to reproduce it, so I did NOT open an issue in the IDF repo for it.
I can just say that the older lwip code works 100%. The newer lwip code ALWAYS fails in this scenario.
Anyway, this issue is not related to the OP. I just want to say that understanding lwip code is anything but easy. Making changes needs a lot of testing.

@gskjold
Author

gskjold commented May 19, 2023

Thank you for the input. I will find some way around it I guess

@Jason2866 Jason2866 added Type: Bug 🐛 All bugs and removed Type: Question Only question labels May 19, 2023
@gskjold
Author

gskjold commented May 19, 2023

When the DNS Server IP goes to 32.1.70.66, does the DNS service stop working?

Yes, this is how it was discovered. Any HTTPClient call to a URL using a hostname returns -1.

I have updated my test code to better display this.

[  1761][D][WiFiGeneric.cpp:1098] _eventCallback(): STA IP: 192.168.28.100, MASK: 255.255.255.0, GW: 192.168.28.1
DNS: 192.168.28.1
hub.amsleser.no -> 13.53.201.152 (1)
DNS: 192.168.28.1
hub.amsleser.no -> 13.53.201.152 (1)
DNS: 192.168.28.1
hub.amsleser.no -> 13.53.201.152 (1)
DNS: 192.168.28.1
hub.amsleser.no -> 13.53.201.152 (1)
DNS: 192.168.28.1
hub.amsleser.no -> 13.53.201.152 (1)
DNS: 192.168.28.1
hub.amsleser.no -> 13.53.201.152 (1)
DNS: 192.168.28.1
hub.amsleser.no -> 13.53.201.152 (1)
DNS: 32.1.70.66
[ 15517][E][WiFiGeneric.cpp:1595] hostByName(): DNS Failed for hub.amsleser.no
hub.amsleser.no -> 0.0.0.0 (0)
DNS: 32.1.70.66
[ 22517][E][WiFiGeneric.cpp:1595] hostByName(): DNS Failed for hub.amsleser.no
hub.amsleser.no -> 0.0.0.0 (0)
DNS: 32.1.70.66

If I run WiFi.enableIpV6(), resolving hostnames works again.

@TD-er
Contributor

TD-er commented May 19, 2023

@gskjold With what platform? The Tasmota version?

@gskjold
Author

gskjold commented May 19, 2023

@mrengineer7777
Collaborator

@gskjold Tasmota included support for IPv6 a few months ago. It also uses an older version of lwip, as Jason noted. IPv6 has not been merged into this repository. Does the DHCP issue happen with our v2.0.9?

I'm currently using a modified version of a tasmota 2.0.7 branch.

@gskjold
Author

gskjold commented May 21, 2023

When switching to platform = [email protected], which uses 2.0.8, the issue is not there. As pointed out by @Jason2866 above, this issue is caused by lwip in esp-idf 4.4.4 and not by the Arduino code itself, so the Arduino version is irrelevant: any build based on IDF 4.4.4 will have the issue. So technically I guess this issue belongs to the IDF repo now.

@mrengineer7777
Collaborator

2.0.7, 2.0.8, 2.0.9 are all based on IDF 4.4.4.

@gskjold
Author

gskjold commented May 21, 2023

Then I'll take a wild stab and say that the build flags that enable IPv6 are not set in vanilla arduino-esp32, but are set in Tasmota. From what I can remember from digging into the lwip IPv6 code, there are a few build flags that need to be enabled to reach the code I reference here: #8221 (comment)

@Jason2866
Collaborator

Jason2866 commented May 21, 2023

@gskjold Yes, more IPv6 options are enabled in the Tasmota Arduino Lib Builder than in the original Espressif Arduino Lib Builder.
So enabling IPv6 "fully" probably introduces the problem, since doing so changes the lwip code that gets built.

@mrengineer7777
Collaborator

Possibly related #8672

@TD-er
Contributor

TD-er commented Dec 6, 2023

Possibly related #8672

Not sure, as the DNS may also get corrupted/erased whenever the WiFi is turned off.
Also, a standard DHCP reply doesn't seem to have DNS records in it, so I guess a second DHCP packet/request is needed to receive them?

@cziter15
Contributor

cziter15 commented Feb 2, 2025

Is the issue still present?

@SuGlider
Collaborator

SuGlider commented Feb 13, 2025

Is the issue still present?

Not sure. It requires testing.
This is an old issue, and the networking layer has been refactored since then.
It now supports IPv4 and IPv6 DNS natively, so this issue should be solved in the latest Arduino Core version, 3.1.2.

@gskjold - Could you please revalidate this issue using the latest ESP32 Arduino Core released version?

@SuGlider SuGlider moved this from Todo to Under investigation in Arduino ESP32 Core Project Roadmap Feb 13, 2025
Projects
Status: Under investigation
Development

No branches or pull requests

9 participants