Don't trust reported last seen times #1

Conversation

teor2345

Here are some changes:

  • fixes for my leap second changes
  • extra property tests we could do
  • fixes for clippy

I've commented out the extra property tests because they depend on ZcashFoundation#2203.

jvff and others added 20 commits May 25, 2021 23:43
Returning `impl IntoIterator` means that the caller will always be
forced to call `.into_iter()`, and returning `impl Iterator` still
allows them to call `.into_iter()` because it becomes the identity
function.
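A minimal sketch of that difference, using a hypothetical function rather than Zebra's API: callers can consume an `impl Iterator` directly, and `.into_iter()` still works because every `Iterator` also implements `IntoIterator` as the identity.

fn peer_ids() -> impl Iterator<Item = u32> {
    vec![1, 2, 3].into_iter()
}

fn main() {
    // Callers can consume the iterator directly...
    assert_eq!(peer_ids().count(), 3);

    // ...and `.into_iter()` is still available, as the identity function.
    assert_eq!(peer_ids().into_iter().count(), 3);
}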
Due to clock skew, the peers could end up at the front of the
reconnection queue or far at the back. The solution to this is to offset
the reported times by the difference between the most recent reported
sighting (on the remote clock) and the current time (on the local clock).
Times in the past don't have any security implications, so there's no
point in trying to apply the offset to them as well.
If any of the times gossiped by a peer are in the future, apply the
necessary offset to all the times gossiped by that peer. This ensures
that all gossiped peers from a malicious peer are moved further back in
the queue.
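A minimal sketch of that offset logic, assuming a plain list of `chrono` times rather than Zebra's actual `validate_addrs` signature:

use chrono::{DateTime, Duration, Utc};

// Sketch only: if the newest gossiped `last_seen` time is in the future,
// shift all of that peer's times back by the same amount.
fn offset_future_times(last_seen_times: &mut [DateTime<Utc>], now: DateTime<Utc>) {
    let newest = match last_seen_times.iter().max() {
        Some(newest) => *newest,
        None => return,
    };

    if newest > now {
        // The estimated difference between the remote clock and the local clock.
        let offset: Duration = newest - now;

        // Apply the same offset to every time gossiped by this peer, so a
        // malicious peer can't push its addresses to the front of the queue.
        for time in last_seen_times.iter_mut() {
            *time = *time - offset;
        }
    }
    // Times already in the past are left unchanged.
}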

Co-authored-by: teor <[email protected]>
If an overflow occurs, the reported `last_seen` times are either very
wrong or malicious, so reject all addresses gossiped by that peer.
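One way that rejection could look, sketched with `chrono`'s checked arithmetic (illustrative names, not Zebra's actual code):

use chrono::{DateTime, Duration, Utc};

// If subtracting the offset overflows for any address, return `None` so the
// caller can discard everything gossiped by that peer.
fn apply_offset_or_reject(
    last_seen_times: &[DateTime<Utc>],
    offset: Duration,
) -> Option<Vec<DateTime<Utc>>> {
    last_seen_times
        .iter()
        .map(|time| time.checked_sub_signed(offset))
        .collect()
}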
Use some mock gossiped peers that all have `last_seen` times in the
future and check that they all have a specific offset applied to them.
Use some mock gossiped peers that all have `last_seen` times in the
past, and check that the `validate_addrs` function doesn't change their
`last_seen` times.
Use some mock gossiped peers where some have `last_seen` times in the
past and some have times in the future. Check that all the peers have
an offset applied to them by the `validate_addrs` function.

This tests if the offset is applied to all peers that a malicious peer
gossiped to us.
If the most recent `last_seen` time reported by a peer is exactly the
limit, the offset doesn't need to be applied because no times are in the
future.
Provides a strategy for generating arbitrary `MetaAddr` instances that
are created as if they have been gossiped by another peer.
Given a generated list of gossiped peers, ensure that after running the
`validate_addrs` function none of the resulting peers have a `last_seen`
time that's after the specified limit.
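A sketch of what that strategy and property could look like, using `proptest` with a simplified stand-in struct and a toy `validate_addrs` instead of Zebra's real `MetaAddr` code:

use proptest::prelude::*;

// A simplified stand-in for a gossiped `MetaAddr`: just a `last_seen` time
// in seconds since the epoch.
#[derive(Clone, Copy, Debug, PartialEq)]
struct GossipedPeer {
    last_seen_secs: i64,
}

// Strategy for peers "as gossiped by another peer": the reported time can be
// anywhere in the serializable `u32` range.
fn gossiped_peer() -> impl Strategy<Value = GossipedPeer> {
    (0..=u32::MAX as i64).prop_map(|last_seen_secs| GossipedPeer { last_seen_secs })
}

// Toy validation: shift all times back so none exceed the limit.
fn validate_addrs(peers: &[GossipedPeer], limit_secs: i64) -> Vec<GossipedPeer> {
    let newest = peers
        .iter()
        .map(|peer| peer.last_seen_secs)
        .max()
        .unwrap_or(limit_secs);
    let offset = (newest - limit_secs).max(0);

    peers
        .iter()
        .map(|peer| GossipedPeer {
            last_seen_secs: peer.last_seen_secs - offset,
        })
        .collect()
}

proptest! {
    #[test]
    fn validated_times_never_exceed_limit(
        peers in proptest::collection::vec(gossiped_peer(), 1..16),
        limit_secs in 0..=u32::MAX as i64,
    ) {
        for peer in validate_addrs(&peers, limit_secs) {
            prop_assert!(peer.last_seen_secs <= limit_secs);
        }
    }
}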
Make it clear why all peers have the time offset applied to them.

Co-authored-by: teor <[email protected]>
- Make the security impact clearer and in a separate section.
- Instead of listing an assumption as almost a side-note, describe it
  clearly inside a `Panics` section.
- A sentence was previously very awkward and bordering on incorrect. It
  was updated to be simpler, clearer, and more precise.

Co-authored-by: teor <[email protected]>
1. Validated MetaAddrs serialize without errors
2. Serialized bytes deserialize into the same MetaAddr
//
// Compare timestamps, allowing an extra second, to account for `chrono` leap seconds:
// See https://docs.rs/chrono/0.4.19/chrono/naive/struct.NaiveTime.html#leap-second-handling
prop_assert!(peer.get_last_seen().timestamp() <= last_seen_limit.timestamp() + 1,
Owner

I'm not sure I understand this change. Why should the timestamps be compared instead of the time itself?

Author
@teor2345 teor2345 May 27, 2021

Sorry for the confusion here.

I made a follow-up comment about this change, but it seems like it got lost? (Maybe it was internet trouble.)

The chrono library stores dates to nanosecond precision. During a leap second, the nanosecond field goes up to nearly 2 billion (chrono represents a leap second as nanoseconds in the range 1,000,000,000 to 1,999,999,999). We need to account for these extra nanoseconds in our calculations - in particular, we don't want Zebra to panic during a leap second. (Delaying things by a second should mostly be fine.)

So I made the random MetaAddr and datetime_full proptest times include nanoseconds in ZcashFoundation#2195.

But I actually don't think adjusting the tests is very robust. So I added commit bc90054 to normalise the time instead. (We only get nanoseconds from OS times. The serialized formats are all in seconds.)
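(A sketch of one way to normalise a `chrono` time to whole seconds by dropping the nanosecond field; this is an assumption about the approach, not the contents of commit bc90054.)

use chrono::{DateTime, TimeZone, Utc};

// Rebuild the time from its whole-second timestamp, discarding nanoseconds
// (which can exceed 1_000_000_000 during a leap second).
fn truncate_to_seconds(time: DateTime<Utc>) -> DateTime<Utc> {
    Utc.timestamp(time.timestamp(), 0)
}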

Comment on lines 31 to 72
use zebra_chain::serialization::{ZcashDeserialize, ZcashSerialize};

// Check that malicious peers can't make Zebra send bad times to other peers
// (after Zebra's standard sanitization)
let sanitized_peer = peer.sanitize();

// Check that sanitization doesn't put times in the future
prop_assert!(
    sanitized_peer.get_last_seen().timestamp() <= last_seen_limit.timestamp() + 1,
    "sanitized peer timestamp {} was greater than limit {}, original timestamp: {}",
    sanitized_peer.get_last_seen().timestamp(),
    last_seen_limit.timestamp(),
    peer.get_last_seen().timestamp(),
);

// Check that malicious peers can't make Zebra's serialization fail
let addr_bytes = peer.zcash_serialize_to_vec();
prop_assert!(
    addr_bytes.is_ok(),
    "unexpected serialization error: {:?}, original timestamp: {}, sanitized timestamp: {}",
    addr_bytes,
    peer.get_last_seen().timestamp(),
    sanitized_peer.get_last_seen().timestamp(),
);

// Assume other implementations deserialize like Zebra
let deserialized_peer = MetaAddr::zcash_deserialize(addr_bytes.unwrap().as_slice());
prop_assert!(
    deserialized_peer.is_ok(),
    "unexpected deserialization error: {:?}, original timestamp: {}, sanitized timestamp: {}",
    deserialized_peer,
    peer.get_last_seen().timestamp(),
    sanitized_peer.get_last_seen().timestamp(),
);
let deserialized_peer = deserialized_peer.unwrap();

// Check that serialization hasn't modified the address
// (like the sanitized round-trip test)
prop_assert_eq!(sanitized_peer, deserialized_peer);

// Check that sanitization, serialization, and deserialization don't
// put times in the future
prop_assert!(
    deserialized_peer.get_last_seen().timestamp() <= last_seen_limit.timestamp() + 1,
    "deserialized peer timestamp {} was greater than limit {}, original timestamp: {}, sanitized timestamp: {}",
    deserialized_peer.get_last_seen().timestamp(),
    last_seen_limit.timestamp(),
    peer.get_last_seen().timestamp(),
    sanitized_peer.get_last_seen().timestamp(),
);
Owner

I think I'm also missing something about these tests. Should each of them be separate #[proptest]s?

Also, should some of them be done in separate PRs? I mean, I don't understand how serialization and sanitization are related to the code to ensure the time limit. My intuition is that there should be a proptest to ensure that any arbitrary MetaAddr is serializable correctly and can be properly sanitized?

Author

I'm still working through all of this myself.

Here's where I've got to today:

If we want to serialize the last seen time correctly, it needs to fit in a u32. But DateTime<Utc> returns seconds as an i64. (And uses roughly 56 bits internally.) So it might not uphold this invariant.

> My intuition is that there should be a proptest to ensure that any arbitrary MetaAddr is serializable correctly and can be properly sanitized?

There is now. I wrote those tests in ZcashFoundation#2203 and they just got committed as 9f8b4f8 and 5cdcc52.

Sanitization truncates to the nearest 30 minutes. Since timestamp 0 is already at a 30 minute mark, it just happens to uphold the u32 range invariant. (Despite DateTime<Utc> not requiring that invariant.)
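A sketch of that truncation (illustrative only, not Zebra's actual sanitize() code): rounding down to a 30-minute boundary keeps non-negative timestamps non-negative.

use chrono::{DateTime, TimeZone, Utc};

// Round a time down to the previous 30-minute boundary. Since the Unix epoch
// is itself on a 30-minute boundary, a non-negative input stays non-negative.
fn truncate_to_half_hour(time: DateTime<Utc>) -> DateTime<Utc> {
    const INTERVAL: i64 = 30 * 60;
    let secs = time.timestamp();
    Utc.timestamp(secs - secs.rem_euclid(INTERVAL), 0)
}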

> Also, should some of them be done in separate PRs? I mean, I don't understand how serialization and sanitization are related to the code to ensure the time limit.

The validation we're adding subtracts arbitrary times. This can move the last seen time outside the u32 range. So the MetaAddr appears to be valid, until we go to sanitize and serialize it. Then the entire connection gets closed due to the serialization error. (Which is a denial of service risk.)

I used these proptests to discover that error. (And similar errors with leap seconds.)

I think the best way to fix these bugs is to create a DateTime32(u32) type. (Which is part of ticket ZcashFoundation#2171.) Using a 32-bit internal representation will make sure we can't go outside that range.

Otherwise we will have to validate 32-bit datetimes every time we use or modify them. (In general, we try to use representations that automatically ensure our constraints.)
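A sketch of what such a DateTime32 newtype could look like (illustrative only; the actual type proposed in ZcashFoundation#2171 may differ):

use std::convert::TryFrom;

use chrono::{DateTime, TimeZone, Utc};

/// 32-bit seconds since the Unix epoch, so the value always fits the
/// network serialization format by construction.
#[derive(Copy, Clone, Debug, Eq, PartialEq, Ord, PartialOrd)]
struct DateTime32(u32);

impl DateTime32 {
    fn to_chrono(self) -> DateTime<Utc> {
        Utc.timestamp(self.0.into(), 0)
    }
}

impl TryFrom<DateTime<Utc>> for DateTime32 {
    type Error = std::num::TryFromIntError;

    /// Fails for times before 1970 or after the u32 range, instead of
    /// silently producing an unserializable value.
    fn try_from(time: DateTime<Utc>) -> Result<Self, Self::Error> {
        u32::try_from(time.timestamp()).map(DateTime32)
    }
}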

Author

I'll go work on DateTime32 now.

Author
@teor2345 teor2345 May 27, 2021

I created DateTime32 and made MetaAddr use it in ZcashFoundation#2210.

For the CandidateSet validation, here's how we could move forward (a rough sketch follows the list):

  • take the last seen limit as DateTime32 (no nanoseconds or leap seconds, better proptest coverage)
    • panic if the current time is outside of DateTime32 (by 2038 the protocol should have been updated)
  • do all the calculations in chrono::DateTime (better API, no overflows or underflows because it's ~56 bits)
  • convert the times to DateTime32 for each address, returning an error if they're out of range
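A rough sketch of that flow, with illustrative names and plain u32 seconds standing in for DateTime32 (not Zebra's actual API):

use std::convert::TryFrom;

use chrono::{DateTime, TimeZone, Utc};

// Do the comparison and any adjustment in `chrono`, then convert each result
// back to 32-bit seconds, rejecting out-of-range times.
fn validate_last_seen(
    reported: DateTime<Utc>,
    last_seen_limit_secs: u32,
) -> Result<u32, std::num::TryFromIntError> {
    // chrono has roughly 56 usable bits, so this arithmetic can't overflow.
    let limit: DateTime<Utc> = Utc.timestamp(last_seen_limit_secs.into(), 0);

    // Example adjustment: never keep a time later than the limit.
    let adjusted = reported.min(limit);

    // Narrow back to 32-bit seconds for each address, returning an error if
    // the adjusted time is out of range (for example, before 1970).
    u32::try_from(adjusted.timestamp())
}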

This fix modifies both the tests and the implementation.
@jvff jvff force-pushed the dont-trust-reported-last-seen-times branch from b19043c to 15a8ff0 on May 31, 2021 at 14:51
@teor2345
Author

This is obsoleted by other PRs.

@teor2345 teor2345 closed this May 31, 2021