Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NTP sync returns random day and time for the year 2036 #70

Closed
Uup115 opened this issue Jul 31, 2018 · 22 comments
Closed

NTP sync returns random day and time for the year 2036 #70

Uup115 opened this issue Jul 31, 2018 · 22 comments

Comments

@Uup115
Copy link

Uup115 commented Jul 31, 2018

Hello,

First of all... Great library!

I began using your library last week. Sometimes the NTP sync will return a date in the year 2036. Any thoughts as to why?

Best Regards,
Aaron

@rayvt
Copy link

rayvt commented Sep 2, 2018

Yes, agreed, great library.

I also am sometimes getting bogus dates in 2036. For example:
00:28:16 02/07/2036 Winter Time.
Unix timestamp: 2085956896

I suspect there is an NTP server that is sending bad timestamps. Normal NTP will work okay because it will discard outliers. It is a problem here, because we do not cross-check or validate.

I am looking to see if I can come up with a small/simple way to avoid this.

@foxygnu
Copy link

foxygnu commented Sep 25, 2018

Hello,

Same problem here. But i can't figure when that appends. The bug was not present some time in past ...

Anyway :

NTP server set to ntp.local
NTP time zone set to: 1
--Set daylight saving ON
Starting UDP
UDP port: 123
-- Receive NTP Response
Summer Time
Sync frequency set low
Successful NTP sync at 21:18:42 25/09/2018
Got NTP time: 21:18:42 25/09/2018
Short sync interval set to 15
Long sync interval set to 1800
Time sync started
Starting UDP
UDP port: 123
-- Receive NTP Response
Winter Time
Sync frequency set low
Successful NTP sync at 07:28:16 07/02/2036
Got NTP time: 07:28:16 07/02/2036
NTP client initialized

That's always the same date & time ... strange !

And => https://github.com/esp8266/Arduino/blob/master/tools/sdk/lwip/src/core/sntp.c

Read this :

/**

  • SNTP packet format (without optional fields)
  • Timestamps are coded as 64 bits:
    • 32 bits seconds since Jan 01, 1970, 00:00
    • 32 bits seconds fraction (0-padded)
  • For future use, if the MSB in the seconds part is set, seconds are based
  • on Feb 07, 2036, 06:28:16.
    */

See ? 07/02/2036 06:28:16 (+1h for timezone) ? :)

I think there is a problem of parsing sntp or an incompatibility between the esp sdk and the ntpclientlib ...

What is your interpretation of this ?

PS: i use a local NTP for debug purpose and i can assure you this is not a ntp server problem here.

@darkcheater
Copy link

I have the same problem..

@JosefKuchar
Copy link

Yes it's 7:28 for me and it's pretty irritating

@JosefKuchar JosefKuchar mentioned this issue Oct 27, 2018
@MarcFinns
Copy link

I have it when WiFi coverase is poor

@TomsCircuits
Copy link
Contributor

I do have the same problem. Research shows that the time/date 07:28:16 07/02/2036 is the NTP roll-over date. The time stamp in NTP is given in seconds, starting from 1.1.1900 (UNIX starts 1.1.1970).

If you start with 0xFFFFFFFF as an NTP time stamp and subtract the 70 years of offset in seconds, and then convert this to time/date you get 07-Feb-36 06:28:15 UTC. So the packet must have contained a time stamp of 0xFFFFFFFF or possibly 0. The library does not check for this at the moment (and it shouldn't have to).

@TomsCircuits
Copy link
Contributor

OK, I think this becomes clearer. I added a print statement for the NTP timestamp after decoding of the UDP packet. Here is the result:

Connected to XXXXXXXX
Got IP: 192.168.0.40
Connected: yes
Secs: 3749634515
Secs: 0
Got NTP time: 07:28:16 07/02/2036
Time Sync error: NTP server not reachable
Secs: 3749634535
Got NTP time: 15:08:55 27/10/2018
Secs: 3749634541
Got NTP time: 15:09:01 27/10/2018
Secs: 3749634547
Got NTP time: 15:09:07 27/10/2018

Looks like I do get a valid packet first. The second packet seems to be corrupt. And then the server isn't there anymore...

@TomsCircuits
Copy link
Contributor

I think I am starting to understand. There is a method called getTime. This method requests a NTP message from an NTP server and returns the new time.

But if there is a timeout, it raises a noResponse ntpEvent and then it returns 0; This 0 then gets interpreted as 02/07/2036.

I wonder, could this be fixed by always checking for noResponse?...

@TomsCircuits
Copy link
Contributor

I just found the following in RFC 958 / Network Time Protocol:

In some cases a particular timestamp may not be available, such as
when the protocol first starts up. In these cases the 64-bit field
is set to zero, indicating the value is invalid or undefined.

So I'd say that even if you do get a response, a 0-timestamp can occur and needs to be handled.

@TomsCircuits
Copy link
Contributor

I left a modified version of the example from the lib running, and it seems I wasn't quite right. Here it goes wrong:

...
Server: 129.70.132.32 Secs: 3749646701 Got NTP time: 18:31:41 27/10/2018
Server: 129.70.132.32 Secs: 0 Got NTP time: 07:28:16 07/02/2036
Time Sync error: NTP server not reachable
Time set now to: 07:28:16 07/02/2036
Server: 129.70.132.32 Secs: 3749646728 Got NTP time: 18:32:08 27/10/2018
Server: 129.70.132.32 Secs: 3749646734 Got NTP time: 18:32:14 27/10/2018
...

But here it doesn't:

...
Server: 193.225.190.6 Secs: 3749644573 Got NTP time: 17:56:13 27/10/2018
Server: 193.225.190.6 Secs: 3749644579 Got NTP time: 17:56:19 27/10/2018
Time Sync error: NTP server not reachable
Time set now to: 17:56:19 27/10/2018
Server: 193.225.190.6 Secs: 3749644600 Got NTP time: 17:56:40 27/10/2018
Server: 193.225.190.6 Secs: 3749644606 Got NTP time: 17:56:46 27/10/2018
...

In the first example, the server sends a 0 timestamp and I get the year 2036 reading. In the second example, the server times out and the time remains intact. Perhaps it isn't that difficult to fix this thing after all.

@TomsCircuits
Copy link
Contributor

OK, I think I have a fix and I am currently testing it. It isn't easy, because I had no 0-timestamp over several hours.

@mcspr
Copy link
Contributor

mcspr commented Oct 28, 2018

@TomsCircuits You might have easier time testing with dummy server returning bad timestamp.
For example, using python:

import socket, struct
server = socket.socket(AF_INET, SOCK_DGRAM)
server.bind(('0.0.0.0', 12345)) # runs server on all addresses and uses port 12345
while True:
    data, addr = server.recvfrom(1024)
    print(data) # this can be unpacked using struct.unpack
    server.sendto(addr, data)

Just needs changing default port NtpClient uses (or run as privileged user to use 123)
Mirroring sent packet gives desired result of having 0 in the response and having 2036 as synced time. Or send whatever to see how client reacts.

About unpacking the data...
Referencing
https://tools.ietf.org/html/rfc5905#page-19
https://docs.python.org/3/library/struct.html#format-characters
it is probably this?
struct.unpack('H H 3I 4Q', data)
4 timestamps in the end, 4 unsigned long longs (64bit)

struct.unpack('!H H 11I', data) (edit: byte order. and might as well cut unused fraction part)

edit: Also see https://tools.ietf.org/html/rfc5905#page-76

A kiss-o'-death packet is an NTP header with leap 0x3 (NOSYNC) and
* stratum 16 (MAXSTRAT). It tells the receiver that something
* drastic has happened, as revealed by the kiss code in the refid
* field. The NTP header fields may or may not be correct.

@JosefKuchar
Copy link

@TomsCircuits I have "better luck" with poor signal

@TomsCircuits
Copy link
Contributor

@mcspr
Thanks for giving me a head-start. I kept my test running over night and didn't get a single sync error. Doh!
So it was a good idea to create a "fake" NTP server. I don't speak python, so it is probably no beauty. But it does the trick. Perhaps it is useful for others too, so here it is:

import socket, struct

server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(('0.0.0.0', 123)) # runs server on all addresses and uses port 123

base_time = 3749642240 # 17:17:20 27/10/2018
more_time = 0
error_cnt = 0

while True:
	#receive an NTP packet and unpack it
	data, addr = server.recvfrom(1024)
	B0, B1, B2, B3, T1, T1f, T2, T2f, T3, T3f, T4, T4f = struct.unpack('!12I', data)

	print "RX message:", hex(B0), B1, B2, hex(B3), T1, T1f, T2, T2f, T3, T3f, T4, T4f

	T4 = base_time + more_time # calculate new time
	more_time = more_time + 10
	
	error_cnt = error_cnt + 1 # create errors in a loop
	if error_cnt > 6:
		error_cnt = 0
		
	# set timestamp to invalid
	if error_cnt == 3:
		T4 = 0
	
	#repackage NTP data
	data = struct.pack('!12I', B0, B1, B2, B3, T1, T1f, T2, T2f, T3, T3f, T4, T4f)

	# send 5 out of 6 times
	if error_cnt < 6:
		B0, B1, B2, B3, T1, T1f, T2, T2f, T3, T3f, T4, T4f = struct.unpack('!12I', data)
		print "TX message:", hex(B0), B1, B2, hex(B3), T1, T1f, T2, T2f, T3, T3f, T4, T4f
		server.sendto(data, addr)
	print ""

And here is an example output showing received and transmitted data including errors (timestamp left zero in 3rd packet, no reply in 6th):

RX message: 0xe30006ecL 0 0 0x314e3134 0 0 0 0 0 0 0 0
TX message: 0xe30006ecL 0 0 0x314e3134 0 0 0 0 0 0 3749642240 0

RX message: 0xe30006ecL 0 0 0x314e3134 0 0 0 0 0 0 0 0
TX message: 0xe30006ecL 0 0 0x314e3134 0 0 0 0 0 0 3749642250 0

RX message: 0xe30006ecL 0 0 0x314e3134 0 0 0 0 0 0 0 0
TX message: 0xe30006ecL 0 0 0x314e3134 0 0 0 0 0 0 0 0 <<<<<<<<<<<<Timestamp is zero

RX message: 0xe30006ecL 0 0 0x314e3134 0 0 0 0 0 0 0 0
TX message: 0xe30006ecL 0 0 0x314e3134 0 0 0 0 0 0 3749642270 0

RX message: 0xe30006ecL 0 0 0x314e3134 0 0 0 0 0 0 0 0
TX message: 0xe30006ecL 0 0 0x314e3134 0 0 0 0 0 0 3749642280 0

RX message: 0xe30006ecL 0 0 0x314e3134 0 0 0 0 0 0 0 0 <<<<<<<<<<<< no response!

RX message: 0xe30006ecL 0 0 0x314e3134 0 0 0 0 0 0 0 0
TX message: 0xe30006ecL 0 0 0x314e3134 0 0 0 0 0 0 3749642300 0

On the client side I get something like this:

Starting UDP
-- Receive NTP Response
Secs: 3749642240 
Summer Time
Sync frequency set low
Successful NTP sync at 17:17:20 27/10/2018
Got NTP time: 17:17:20 27/10/2018
Starting UDP
-- Receive NTP Response
Secs: 3749642250 
Summer Time
Sync frequency set low
Successful NTP sync at 17:17:30 27/10/2018
Got NTP time: 17:17:30 27/10/2018
Starting UDP
-- Receive NTP Response
Secs: 0            <<<<<<<<<<<<<<<<<<<<<<<<<< Timestamp is zero
--Timestamp is Zero
-- No valid NTP data :-(
Time Sync error: NTP server not reachable
Time is now: 17:17:42 27/10/2018  <<<<<<<<<<<<<< But the time is not 2036 anymore
Starting UDP
-- Receive NTP Response
Secs: 3749642270 
Summer Time
Sync frequency set low
Successful NTP sync at 17:17:50 27/10/2018
Got NTP time: 17:17:50 27/10/2018
Starting UDP
-- Receive NTP Response
Secs: 3749642280 
Summer Time
Sync frequency set low
Successful NTP sync at 17:18:00 27/10/2018
Got NTP time: 17:18:00 27/10/2018
Starting UDP
-- No NTP Response :-(          <<<<<<<<<<<<<<<<<<<<<< No response
Time Sync error: NTP server not reachable
Time is now: 17:18:13 27/10/2018        <<<<<<<<<<<<<<< as before: no corrupt time
Starting UDP
-- Receive NTP Response
Secs: 3749642300 
Summer Time
Sync frequency set low
Successful NTP sync at 17:18:20 27/10/2018
Got NTP time: 17:18:20 27/10/2018

@TomsCircuits
Copy link
Contributor

So, I think there is an improvement. It is perhaps not perfect. One problem is that apparently each time things like timezone, etc. are changed, the lib starts an extra sync. And if that goes wrong, then the false data will actually be used. In most applications, I am guessing that this is no real issue.

At least, in that case the sync interval goes back to short and so the wrong date will hopefully not be used for longer periods.

A side effect is that if the timestamp is wrong during startup, the time is now set to 1/1/1970.

@gmag11
Copy link
Owner

gmag11 commented Jan 4, 2019

Hi. I've had this project somehow stopped during a long time. Now, during some days I'm trying to evolve this lib to a more smart way to synchronize. I've included AsyncUDP version on develop branch. So, request is no more blocking while wainting for response. Please test it and check if there are any 2036 problem again.

@bill-orange
Copy link

bill-orange commented Apr 1, 2019

I was not able to compile with the development version. I confess to not testing with the example. My code was running with the release version. When I tested with the development version, I got:

error: ambiguous overload for 'operator!=' (operand types are 'IPAddress' and 'int')

f (ipaddr != NULL && ntpServerIPAddress != 0)

                                          ^

What could I have done wrong? I installed ESPAsyncUDP-master to get this far. I am using LwIP vers 2 (low memory) could that be it?

@gmag11
Copy link
Owner

gmag11 commented Apr 20, 2019

Please try latest version 3.0.0-beta

@gmag11
Copy link
Owner

gmag11 commented Oct 1, 2019

As there are no further comments I close this issue. Feel free to reopen it if required.

@gmag11 gmag11 closed this as completed Oct 1, 2019
@bill-orange
Copy link

bill-orange commented Nov 4, 2019 via email

@lightning003
Copy link

@bill-orange ... yes I am also not able to force reproduce the scenario [hence del the comment], it occurred for me with other libraries as well and seems like they have same/similar issue, guess I will sync two times or verify with other ntp server to be sure - I don't see any other way,
Thanks for quick response, I will check your code as well - how it performs for me.

@dev-board-tech
Copy link

So, after 7 years nothing has been done, I'm still getting 2036 most of the time from my SRM, even if the SRM UI show the correct time.
Some suggested to point my internal devices to time.nist, but IOT's are not belonging on the internet, they need a proxy to hide them from the internet.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests