Should the reference_timestamp & reference_id be set in server only mode? #1718

Closed
o087D opened this issue Nov 23, 2024 · 15 comments · Fixed by #1823

Comments

@o087D
Contributor

o087D commented Nov 23, 2024

I am attempting to configure ntpd-rs as a server for a system where the system time is being conditioned by gpsd.

My configuration is:

source = []

[observability]
log-level = "info"
observation-path = "./observe"

[[server]]
listen = "[::]:123"

[synchronization]
local-stratum = 1

The response packets from ntpd making a request are parsed by Wireshark as:

Reference ID: Unidentified reference source 'XNON'
Reference Timestamp: Jan  1, 1970 00:00:00.000000000 UTC

And the error I am getting from ntpdate is:

192.168.45.100: Server dropped: Server has gone too long without sync

(I know ntpdate is deprecated, but I believe it should still work?) The offending area in ntpdate is:

		if ((server->org.l_ui - server->reftime.l_ui)
		    >= NTP_MAXAGE) {
			if (debug)
				printf("%s: Server dropped: Server has gone too long without sync\n", 
				       ntoa(&server->srcadr));
			continue;	/* too long without sync */
		}

Where NTP_MAXAGE is set to 1 day. I do not think this check is present in ntpd itself.
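
To illustrate why a zero reference timestamp trips this check, here is a minimal Rust sketch of the same comparison. It assumes the `l_ui` fields are 32-bit unsigned seconds and that C's unsigned subtraction wraps; the function and constant names are illustrative and not taken from ntpdate or ntpd-rs.

```rust
/// One day in seconds. The issue text states that ntpdate's NTP_MAXAGE is 1 day.
const NTP_MAXAGE_SECS: u32 = 86_400;

/// Mirrors the quoted C condition on the integer-seconds parts of the
/// originate and reference timestamps. `wrapping_sub` models the wrapping
/// behaviour of unsigned subtraction in C (an assumption about the field type).
fn dropped_as_stale(org_secs: u32, ref_secs: u32) -> bool {
    org_secs.wrapping_sub(ref_secs) >= NTP_MAXAGE_SECS
}
```

With a reference timestamp of 0, the difference is the full originate seconds value, which is far larger than one day, so the server is dropped. A reference time slightly later than the originate time also fails, because the subtraction wraps around to a very large number.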

According to RFC5905:

   Reference Timestamp: Time when the system clock was last set or
   corrected, in NTP timestamp format.

Digging around, I think the response builder (https://github.com/pendulum-project/ntpd-rs/blob/main/ntp-proto/src/packet/mod.rs#L233) is setting this to the default, and there is no other mechanism to configure it.

Would you be amenable to a stopgap PR which set the reference_timestamp to the receive_timestamp? (i.e. in the absence of an exact time the clock was last updated we can go with something very recent, rather than the epoch)

I would also like to add a configuration option to allow the reference ID word to be set in the [[server]] section of the configuration via a string.

Finally, if the server is running with established sources the reference_timestamp could be set to the actual time of the last completed poll event - but I have not looked into the mechanics of that and expect there to be some cross-thread shenanigans required.

If these are of use I can find some time to do the work.

@davidv1992
Member

We don't use the reference timestamp in ntpd-rs, as putting an accurate timestamp in that field leaks information that may make certain types of attacks significantly easier. I did not realize that this field was actually used by any implementation, as the ntpv4 standard does not.

What you describe as a stopgap PR actually seems to me like a very good solution, also long term. In the very long term the field will disappear anyway, as it is currently slated to be removed in ntpv5.

@davidv1992
Member

As for adding a configuration option for setting the reference, I think we would be open to that, though for technical reasons we should probably not have it be part of the server section, but rather the synchronization section. We may also want to have a look at how we want this to behave exactly (should it just be an initial value, or a complete override). Feel free to open a feature request issue for this.

@rnijveld
Member

Just as an aside: you really should try and get away from ntpdate. It very much is deprecated, most modern Linux distributions no longer ship it at all (and haven't for a couple of years). It doesn't protect your system against weird time jumps at all. Even just switching to something like systemd-timesyncd would help a lot.

@o087D
Contributor Author

o087D commented Dec 21, 2024

> Just as an aside: you really should try and get away from ntpdate. It very much is deprecated, most modern Linux distributions no longer ship it at all (and haven't for a couple of years). It doesn't protect your system against weird time jumps at all. Even just switching to something like systemd-timesyncd would help a lot.

Ideally yes - hence the interest in ntpd-rs - but many of the products I work with are unlikely to get complete system updates, so compliance with the RFC is the only thing I can go on (in the hope that their ntp solution is compliant, that is - the likes of Yocto and Busybox based builds are prevalent).

In any case, the "Reference ID" is not nearly as simple as it looked on first pass - and an incomplete implementation is much worse than a technically correct solution, so this might take a little more time than I had expected!

@hart-NTP

hart-NTP commented Jan 1, 2025

ntpdate is indeed ancient and mostly unmaintained, but it is interesting in a historical context. I suspect if you spelunk the NTP v3 (or even earlier) code you'll see it was used at least as ntpdate does as a sanity check. It may also have been used in earlier source selection logic.

Jamming the receive timestamp in seems like the wrong answer to me. That's claiming the clock was last adjusted to its source the same moment the request came in. Yes, it fixes the ntpdate issue, but that's the only good thing I can say about it.

If ntpd-rs in server mode really has no knowledge of when its clock was last synchronized, why not put in a time in the recent past and zero the fractional part? Wouldn't that satisfy old ntpdate code as well as provide enough data minimization to address the concern of easing some attack(s)? For example, the transmit time truncated to 64 seconds by masking against ~63 (or 0xffffffc0 if that suits).
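
As a rough sketch of the masking suggested here, the following Rust function clears the low 6 bits of a 32-bit seconds value, which rounds it down to a multiple of 64 seconds. The function name is hypothetical, and treating the seconds field as a `u32` is an assumption; this is not ntpd-rs code.

```rust
/// Round an NTP seconds value down to a multiple of 64 seconds by clearing
/// its low 6 bits. `!63u32` is the same mask as 0xffffffc0.
fn truncate_to_64s(secs: u32) -> u32 {
    secs & !63u32
}
```

The result is never later than the input and is at most 63 whole seconds earlier. Zeroing the fractional part separately would add just under one more second to that lag.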

@hart-NTP

hart-NTP commented Jan 1, 2025

It looks like NIST NTP servers are already doing something similar, truncating the reference timestamp 8 bits left of the decimal point:

server 129.6.15.27, port 123
stratum 1, precision -29, leap 00, trust 000
refid [NIST], root delay 0.000244, root dispersion 0.000488
reference time:      eb1f1e00.00000000  Wed, Jan  1 2025  1:50:56.000
originate timestamp: eb1f1e60.9c51f0f0  Wed, Jan  1 2025  1:52:32.610
transmit timestamp:  eb1f1e60.9af85943  Wed, Jan  1 2025  1:52:32.605
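
The hex values in this output can be decoded with a short Rust sketch like the one below. It assumes the `seconds.fraction` hex layout shown above and timestamps in NTP era 0; the helper names are illustrative only.

```rust
/// Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
const NTP_UNIX_OFFSET: u64 = 2_208_988_800;

/// Parse a "ssssssss.ffffffff" hex pair into (seconds, fraction).
/// Returns None if the text does not have that shape.
fn parse_hex_timestamp(text: &str) -> Option<(u32, u32)> {
    let (secs, frac) = text.split_once('.')?;
    Some((
        u32::from_str_radix(secs, 16).ok()?,
        u32::from_str_radix(frac, 16).ok()?,
    ))
}

/// Convert era-0 NTP seconds to Unix seconds.
/// Returns None for values before the Unix epoch.
fn ntp_to_unix(secs: u32) -> Option<u64> {
    u64::from(secs).checked_sub(NTP_UNIX_OFFSET)
}
```

Applied to the sample, the reference time `eb1f1e00.00000000` has a zero fraction and decodes to Unix time 1735696256 (Jan 1 2025 01:50:56 UTC), which is 96 seconds before the transmit time. Both the low 7 and the low 8 bits of its seconds field are zero, so this one sample does not show on its own how many bits are being cleared.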

@avijc

avijc commented Jan 1, 2025

Actually it looks like NIST zeroes the last 7 bits of the seconds field (and the fractional part), not 8 bits. Therefore the last two hex digits of the seconds part are either 00 or 80, giving a granularity of 128 seconds for the reference timestamp.

I'll leave the exact implementation details to you, but I would much prefer that the reference timestamp contain some reasonably sane value, perhaps one that is truncated as suggested above. I'm one of the server operators in the NTP pool, and I can imagine that some of those pool clients use all kinds of odd and outdated software to sync their time, including ntpdate. For reference, the ntpdate in CentOS 7 requires the "reference time" value in NTP responses. Yes, CentOS 7 went EOL half a year ago, but that did not suddenly wipe out all the existing C7 installations off the planet. The story could be similar in various embedded systems that were installed a decade ago or so.

@o087D
Contributor Author

o087D commented Jan 8, 2025

I will follow their lead and zero the last 7 bits.

I started on the reference_id but very quickly discovered a "can of worms" and only got as far as documenting what the functionality should be rather than implementing any of it.

I have to allocate a bit more time to the problem, but I will get the reference_timestamp ready first as that doesn't involve md5 hashes of ipv6 addresses...

@mdavids

mdavids commented Jan 10, 2025

FWIW: there is more software, besides ntpdate, that looks at the Reference Timestamp:

https://github.com/beevik/ntp/blob/main/ntp.go#L396

@stevesommarsntp

RFC5905(NTPv4) appendix mentions: "... reference time not later than the transmit time." [I've seen this happen.]
Stale NTP responses may be dropped by careful clients.

Some current NTP servers set the root dispersion to a fixed value. The reference time may help identify dubious servers among those.

NIST servers vary. Their FPGA-based implementations set reference time to (int) transmit time.
Google servers set reference time = transmit time.

I have no objections to NTP clients setting the reference time to 0 or something else in their Mode 3 requests.
Mode 4 responses should follow the spirit of RFC5905. Reference time should always be <= Transmit time and should represent the last time the clock was checked or updated.
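
A careful client could express the "reference time <= transmit time" rule roughly as in the Rust sketch below. It treats each timestamp as a 64-bit value (32 bits of seconds, 32 bits of fraction) and ignores era rollover; the enum and function names are hypothetical and do not come from any existing client.

```rust
/// Outcome of a simple plausibility check on a server's reference timestamp.
#[derive(Debug, PartialEq)]
enum ReferenceCheck {
    /// Reference time is set and not later than the transmit time.
    Plausible,
    /// Reference time is later than the transmit time.
    InFuture,
    /// Reference time is all zeroes.
    Unset,
}

/// Compare 64-bit NTP timestamps (seconds in the upper 32 bits, fraction in
/// the lower 32 bits). Era rollover is ignored in this sketch.
fn check_reference(reference: u64, transmit: u64) -> ReferenceCheck {
    if reference == 0 {
        ReferenceCheck::Unset
    } else if reference > transmit {
        ReferenceCheck::InFuture
    } else {
        ReferenceCheck::Plausible
    }
}
```

What a client does with each outcome (drop the response, log it, or ignore the field) is a policy choice that this sketch leaves open.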

@rnijveld
Member

FYI: We've decided to implement this as the truncated receive timestamp (zeroing the entire fractional second portion and the 7 least significant bits as suggested above) for now. This means that the reference timestamp emitted is somewhere between 0 and 128 seconds from the receive timestamp (and a tiny bit further from the transmit timestamp).

While that is not the actual time that ntpd-rs last synchronized, it at least prevents us from leaking any information about the synchronization process while at the same time not tripping any older software that still uses the reference timestamp field, which we feel is the most important thing here.
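
The truncation described above can be sketched in Rust as a single mask over a 64-bit NTP timestamp. This is an illustration of the arithmetic only, with hypothetical names and a `u64` representation chosen for the example; it is not the actual ntpd-rs implementation.

```rust
/// Keeps the upper 25 bits of the 32-bit seconds field and clears the low
/// 7 seconds bits plus all 32 fraction bits.
const REFERENCE_MASK: u64 = 0xFFFF_FF80_0000_0000;

/// Derive a coarse reference timestamp from the receive timestamp.
fn reference_from_receive(receive: u64) -> u64 {
    receive & REFERENCE_MASK
}

/// How far the derived reference timestamp lags the receive timestamp,
/// in seconds (the 64-bit difference divided by 2^32).
fn lag_seconds(receive: u64) -> f64 {
    let lag = receive - reference_from_receive(receive);
    lag as f64 / 4_294_967_296.0
}
```

Because only low bits are cleared, the result is never later than the receive timestamp, and the lag is always at least 0 and strictly less than 128 seconds, matching the range stated above.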

We still think that putting the actual synchronization timestamp in there is troublesome for several reasons:

  • It doesn't work well when running as a stratum 1 server: when we're using the system clock as our source of time (for example when a PTP daemon is synchronizing the clock) we don't have any information about when or how the clock is updated at all.
  • Sending information about the synchronization timestamp if the synchronization happened through NTP gives away information about when packets are expected to be received. That might allow attackers to interrupt or even intercept NTP packets and respond with invalid timestamps at the right time.
  • In NTPv4 the reference timestamp is no longer used meaningfully anyway, and in NTPv5 the field is completely removed.

I hope the implementation we have right now is satisfying enough for everyone, but let us know if you still have any comments or ideas.

@stevesommarsntp

stevesommarsntp commented Feb 15, 2025 via email

@o087D
Contributor Author

o087D commented Feb 15, 2025

Hi @stevesommarsntp

I think fundamentally the problem is that the standard doesn't define an "unknown" state. So by setting it to 0 clients are fully compliant in considering the server as one of "not locked" or "unknown" - or even last locked at the epoch...

The ntpdate tool and the Go ntp client library linked above would both ignore any ntpd-rs server prior to the patch. And technically they were correct in doing so (the best type of correct IMO). Furthermore, any currently functioning client could be 'correctly' modified to ignore ntpd-rs servers too.

So the question is - what is the least incorrect answer?

Perhaps this change could be something that is disabled via the configuration file?

Ideally we would be able to get the actual reference time from the kernel...

(5d964ca)

@stevesommarsntp

stevesommarsntp commented Feb 17, 2025 via email

@o087D
Contributor Author

o087D commented Feb 17, 2025

(This is one of those issues where the more you dig the worse it gets! I think there is more that could be done to make ntpd-rs 'correct' - but I am not certain how much of it is useful.)

I am not sure how an NTP client could know more than the server can?

Essentially the NTP server is re-distributing the system time; we don't know when the system time was last set, just what it is. In my case we are using a tool like gpsd to condition the system clock - but there is no communication between the time setting and the time distribution parts of the overall system.

I believe when ntpd-rs is used as a server and as a client there is still separation - an improvement could be made in that instance to record when the syscalls to set time are made.

As far as I know there is no mechanism to get the 'last updated' information from the kernel - I had a quick look and the data doesn't seem to be recorded. I could be missing something elsewhere.

The standard only states:

   Reference Timestamp: Time when the system clock was last set or
   corrected, in NTP timestamp format.

Technically, for an unsynchronized system, that could be the system boot time (the time was last set as part of the boot sequence) - but that does not seem to be what they actually intended.

An unsynchronized server should be setting the LEAP indicator to 'unknown (clock unsynchronized)' - which would then remove that server via the selection algorithm. It should also set the Stratum to 16 (unsynchronized).
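
For reference, the leap indicator occupies the top 2 bits of the first header byte in RFC 5905, followed by 3 bits of version and 3 bits of mode. The Rust sketch below packs that byte and shows how a client might recognise the unsynchronized markers mentioned here. The function names are hypothetical, and real selection logic checks considerably more than this.

```rust
/// Leap indicator value 3: unknown (clock unsynchronized).
const LEAP_UNSYNCHRONIZED: u8 = 3;
/// Stratum value 16: unsynchronized.
const STRATUM_UNSYNCHRONIZED: u8 = 16;

/// Pack the first header byte: 2 bits leap, 3 bits version, 3 bits mode.
fn first_header_byte(leap: u8, version: u8, mode: u8) -> u8 {
    ((leap & 0b11) << 6) | ((version & 0b111) << 3) | (mode & 0b111)
}

/// True when either the leap indicator or the stratum marks the server as
/// unsynchronized.
fn looks_unsynchronized(first_byte: u8, stratum: u8) -> bool {
    (first_byte >> 6) == LEAP_UNSYNCHRONIZED || stratum == STRATUM_UNSYNCHRONIZED
}
```

An NTPv4 server response (mode 4) with leap indicator 3 has a first byte of 0xE4, while a synchronized one with leap indicator 0 has 0x24.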

There is no definition in the RFC for a system which is synchronized but at an unknown time.

I would argue that setting the reference time to 0 (Jan 1, 1970 00:00:00.000000000 UTC) is definitely incorrect as the system was not running at that time.

Perhaps further work could be undertaken to correctly set the reference time when operating in stratum>=2, and then use the bit rounding approach when operating in stratum==1.

This will be closer to correct in most instances?

I think correctly setting the reference time in stratum 1 would either require a local interface so the primary time source can indicate synchronization status or a change to the kernel to record and access the last time the syscall to set time was made.
