Rework PeerList, improve its API, and fix a bug in `swap_primary` #397

romac · 2020-06-30T11:07:32Z

Removes the newly assigned primary from the witnesses.
Ensures the primary is not part of any of the peer lists (witnesses, full_nodes, faulty_nodes).
Fixes a bug in swap_primary (now replace_faulty_primary) where we would iterate the first element of witnesses over and over.
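
For illustration, the three points above can be sketched as follows. This is a minimal sketch, not the code from this PR: the `Peers` struct, the `u32` stand-in for `PeerId`, and the `Option` return type are assumptions made to keep the example self-contained.

```rust
use std::collections::BTreeSet;

// Hypothetical stand-in for the crate's real peer identifier type.
type PeerId = u32;

// Simplified, non-generic peer bookkeeping (name and shape are assumptions).
struct Peers {
    primary: PeerId,
    witnesses: BTreeSet<PeerId>,
    faulty_nodes: BTreeSet<PeerId>,
}

impl Peers {
    /// Marks the current primary as faulty and promotes a witness in its place.
    /// Returns the new primary, or `None` if no witness is left.
    fn replace_faulty_primary(&mut self) -> Option<PeerId> {
        // Pick the first witness in sorted order.
        let new_primary = self.witnesses.iter().next().copied()?;
        // Remove it from the witnesses, so the next call picks a different
        // peer instead of seeing the same first element again.
        self.witnesses.remove(&new_primary);
        // Swap it in and record the old primary as faulty, which also keeps
        // the primary out of the witness list.
        let old_primary = std::mem::replace(&mut self.primary, new_primary);
        self.faulty_nodes.insert(old_primary);
        Some(new_primary)
    }
}
```

Because the promoted peer is removed from `witnesses`, repeated calls walk through the remaining witnesses and eventually return `None`.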
Rework and generalize the PeerList for easier testing and safer API

This commit changes the `PeerList` type to be parametrized by the type of values associated with a `PeerId`. This enables easier testing of the peer list without having to construct light client `Instance`s.

This commit additionally introduces a safer API by taking into account the invariant associated with a PeerList, which lets us return unconditionally the value associated with the primary peer.
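
A rough sketch of how such an invariant can make an unconditional accessor possible. The constructor, the `add_witness` and `witness_ids` helpers, and the `u32` stand-in for `PeerId` are assumptions for illustration and not necessarily the API introduced here.

```rust
use std::collections::{BTreeSet, HashMap};

// Hypothetical stand-in for the crate's real peer identifier type.
type PeerId = u32;

pub struct PeerList<T> {
    values: HashMap<PeerId, T>,
    primary: PeerId,
    witnesses: BTreeSet<PeerId>,
}

impl<T> PeerList<T> {
    /// Hypothetical constructor: storing the primary's value up front
    /// establishes the invariant that `values` always contains `primary`.
    pub fn new(primary: PeerId, value: T) -> Self {
        let mut values = HashMap::new();
        values.insert(primary, value);
        Self { values, primary, witnesses: BTreeSet::new() }
    }

    /// Adds a witness, ignoring attempts to register the primary as one.
    pub fn add_witness(&mut self, id: PeerId, value: T) {
        if id != self.primary {
            self.values.insert(id, value);
            self.witnesses.insert(id);
        }
    }

    /// No `Option` needed: the invariant guarantees the lookup succeeds.
    pub fn primary(&self) -> &T {
        self.values
            .get(&self.primary)
            .expect("invariant violated: primary has no associated value")
    }

    /// Ids of the current witnesses, in sorted order.
    pub fn witness_ids(&self) -> &BTreeSet<PeerId> {
        &self.witnesses
    }
}
```

Since `T` is unconstrained, a test can use something as small as `PeerList<&str>` or `PeerList<u32>` instead of building a full light client instance.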

In turn, this simplifies the verification loop in the supervisor, which is now expressed more simply as a recursive function whose correctness is more obvious than the imperative one, at least to me.
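
The shape of such a recursive loop might look like the sketch below. It is an assumption-laden illustration, not the supervisor code from this PR: `Supervisor` is reduced to peer bookkeeping, and the `check` callback is a hypothetical stand-in for verifying against a peer.

```rust
use std::collections::BTreeSet;

// Hypothetical stand-in for the crate's real peer identifier type.
type PeerId = u32;

struct Supervisor {
    primary: PeerId,
    witnesses: BTreeSet<PeerId>,
    faulty_nodes: BTreeSet<PeerId>,
}

impl Supervisor {
    /// Promotes the first witness to primary and marks the old primary faulty.
    fn replace_faulty_primary(&mut self) -> Option<PeerId> {
        let new_primary = self.witnesses.iter().next().copied()?;
        self.witnesses.remove(&new_primary);
        let old_primary = std::mem::replace(&mut self.primary, new_primary);
        self.faulty_nodes.insert(old_primary);
        Some(new_primary)
    }

    /// Tries the primary; on failure swaps in a witness and recurses.
    /// Each recursive call consumes one witness, so the recursion terminates.
    fn verify(&mut self, check: &dyn Fn(PeerId) -> bool) -> Result<PeerId, String> {
        if check(self.primary) {
            return Ok(self.primary);
        }
        match self.replace_faulty_primary() {
            Some(_) => self.verify(check),
            None => Err("no witnesses left".to_string()),
        }
    }
}
```

The termination argument is visible in the code: the only recursive call happens after `witnesses` has shrunk by one element.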

Referenced an issue explaining the need for the change
Updated all relevant documentation in docs
Updated all code comments where relevant
Wrote tests
Updated CHANGES.md

romac · 2020-06-30T11:08:05Z

Thanks @OStevan for prompting this little refactor and reviewing it.

OStevan · 2020-06-30T11:24:19Z

@romac thanks for following up on this.

xla

As far as I can tell this looks correct. Module level tests would help asserting the right behaviour.

🕶 😼 🖍 🅿️

romac · 2020-06-30T11:42:25Z

> Module level tests would help asserting the right behaviour.

Yep, working on those :) Thanks for the review!

liamsi

👏 LGTM

Will revisit after tests were added.

romac · 2020-07-01T16:30:58Z

Adding tests prompted me to refactor the PeerList definition and API in cbf9753.

xla

This is a great refactoring and I agree it makes the core of the verification indeed easier to follow and comprehend. My main concern is the very constrained T on the PeerList. Could it not give stronger guarantees if Instance would be a trait and PeerList expects that as its type parameter?

xla · 2020-07-01T16:34:50Z

light-client/src/peer_list.rs

-    witnesses: HashSet<PeerId>,
-    full_nodes: HashSet<PeerId>,
-    faulty_nodes: HashSet<PeerId>,
+    witnesses: BTreeSet<PeerId>,


Maybe I missed some context in a commit, but what's the motivation to move to a BTreeSet? Asking purely out of curiosity.

This is to provide a stable ordering on the peer ids, to make the whole process (more) deterministic.

Good question by the way, I failed to mention that in the commit.
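
A small sketch of the ordering property being relied on. The `iteration_order` helper is hypothetical; it only shows that a `BTreeSet` iterates in sorted order regardless of insertion order, whereas the iteration order of a `HashSet` is unspecified.

```rust
use std::collections::BTreeSet;

/// Collects ids into a `BTreeSet` and returns them in iteration order.
fn iteration_order(ids: &[u32]) -> Vec<u32> {
    let set: BTreeSet<u32> = ids.iter().copied().collect();
    // A BTreeSet always iterates in ascending order of its elements.
    set.into_iter().collect()
}
```

Calling `iteration_order(&[3, 1, 2])` and `iteration_order(&[2, 3, 1])` yields the same sequence, so "the first witness" is a well-defined notion across runs.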

xla · 2020-07-01T16:35:33Z

light-client/src/peer_list.rs

 }

-impl PeerList {
+impl<T> PeerList<T> {


Are there any constraints we want to have on T that would help make the construction of the PeerList even safer?

I don't see any constraints which would make it safer, but perhaps I am not fully understanding/seeing what you mean by safer in this context?

I think what I'm getting at is partially what I said in #397 (review) - given this is the PeerList of the light client crate, it might not be all that beneficial to generalise entirely over T, but rather to constrain it to a trait (capabilities) that indicates that whatever is held as a peer value can be used as an Instance, or whatever other abstraction has the right surface for other components to rely on and integrate with. It is in the same direction as making the Handle a trait; I suspect these changes would make it easier to build up the object graph in tests and safer to construct. It's mostly food for thought and can be iterated on over time.
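
To make the suggestion concrete, a trait-bounded variant could look roughly like this. Everything here is hypothetical: the `Verifier` trait, its `latest_height` method, `BoundedPeerList`, and `MockInstance` are invented for the sketch and do not exist in the crate.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for the crate's real peer identifier type.
type PeerId = u32;

/// Hypothetical capability trait describing what a peer value must offer.
pub trait Verifier {
    fn latest_height(&self) -> u64;
}

/// Peer list that only accepts values implementing `Verifier`.
pub struct BoundedPeerList<T: Verifier> {
    values: HashMap<PeerId, T>,
    primary: PeerId,
}

impl<T: Verifier> BoundedPeerList<T> {
    pub fn new(primary: PeerId, value: T) -> Self {
        let mut values = HashMap::new();
        values.insert(primary, value);
        Self { values, primary }
    }

    /// The bound lets the list itself call into the peer value.
    pub fn primary_height(&self) -> u64 {
        self.values[&self.primary].latest_height()
    }
}

/// Tests would supply a lightweight mock instead of a real instance.
pub struct MockInstance(pub u64);

impl Verifier for MockInstance {
    fn latest_height(&self) -> u64 {
        self.0
    }
}
```

The bound documents intent in the type signature, at the cost of requiring a mock implementation in tests.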

xla · 2020-07-01T16:36:49Z

light-client/src/peer_list.rs

-pub struct PeerList {
-    instances: HashMap<PeerId, Instance>,
+pub struct PeerList<T> {
+    values: HashMap<PeerId, T>,


Why the quite generic rename here?

Since the peer list can now hold any datatype rather than just light client instances, I thought values was more appropriate than instances, but I am open to suggestions.

xla · 2020-07-01T16:40:59Z

light-client/src/peer_list.rs

-    faulty_nodes: HashSet<PeerId>,
+    witnesses: BTreeSet<PeerId>,
+    full_nodes: BTreeSet<PeerId>,
+    faulty_nodes: BTreeSet<PeerId>,


It seems wasteful to see the exact same fields as in PeerList. Why does the builder not hold on to a fresh PeerList and return it on build, instead of constructing it?

I am more or less following the internal representation used by the derive_builder crate, and didn't give it much more thought. But I agree that it's a bit wasteful and that what you suggest would work as well. I don't feel strongly either way.
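
One way the suggested alternative could look, as a sketch only. It assumes the builder takes the primary up front so that the wrapped list is valid from the start; the method names and the `u32` stand-in for `PeerId` are hypothetical.

```rust
use std::collections::{BTreeSet, HashMap};

// Hypothetical stand-in for the crate's real peer identifier type.
type PeerId = u32;

pub struct PeerList<T> {
    values: HashMap<PeerId, T>,
    primary: PeerId,
    witnesses: BTreeSet<PeerId>,
}

/// Builder that owns a list under construction instead of mirroring its fields.
pub struct PeerListBuilder<T> {
    list: PeerList<T>,
}

impl<T> PeerListBuilder<T> {
    /// Requires the primary immediately, so the wrapped list is never invalid.
    pub fn new(primary: PeerId, value: T) -> Self {
        let mut values = HashMap::new();
        values.insert(primary, value);
        let list = PeerList { values, primary, witnesses: BTreeSet::new() };
        Self { list }
    }

    /// Adds a witness; the primary is never accepted as a witness.
    pub fn witness(mut self, id: PeerId, value: T) -> Self {
        if id != self.list.primary {
            self.list.values.insert(id, value);
            self.list.witnesses.insert(id);
        }
        self
    }

    /// Hands over the finished list without copying any fields.
    pub fn build(self) -> PeerList<T> {
        self.list
    }
}
```

The trade-off is that the primary can no longer be set late in the chain, which a field-mirroring builder with an optional primary would allow.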

xla · 2020-07-01T16:42:02Z

light-client/src/peer_list.rs

+        }
+    }
+
+    fn a() -> PeerId {


You could name these according to their use case further down.

Previous comment was not posted in the right place

romac · 2020-07-01T17:16:50Z

@xla Thanks for the review! :) See my answers inline.

> This is a great refactoring and I agree it makes the core of the verification indeed easier to follow and comprehend. My main concern is the very constrained T on the PeerList. Could it not give stronger guarantees if Instance would be a trait and PeerList expects that as its type parameter?

I don't really see the value of having Instance defined as a trait and constraining T over that, since the behavior of PeerList does not depend in any way on the choice of T. My general approach to building data structures is to keep them as generic as possible, and only constrain on types if absolutely required by the definition of the struct, or on a per-method basis. Admittedly, this is a mindset I inherited from working in Haskell, where it really pays off to have very generic data structures with fully unconstrained type parameters, but it seems to be mostly shared by the Rust community as well: rust-lang/rust-clippy#1689.
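
As a sketch of the "constrain per method, not per struct" approach described above: the struct and its core methods stay unconstrained, and a bound appears only on the `impl` block that needs it. The `describe_primary` method and the `u32` stand-in for `PeerId` are hypothetical.

```rust
use std::collections::HashMap;
use std::fmt::Debug;

// Hypothetical stand-in for the crate's real peer identifier type.
type PeerId = u32;

// No bound on `T` in the struct definition.
pub struct PeerList<T> {
    values: HashMap<PeerId, T>,
    primary: PeerId,
}

// Core API: works for any `T`.
impl<T> PeerList<T> {
    pub fn new(primary: PeerId, value: T) -> Self {
        let mut values = HashMap::new();
        values.insert(primary, value);
        Self { values, primary }
    }

    pub fn primary(&self) -> &T {
        &self.values[&self.primary]
    }
}

// Extra API: only available when `T` happens to implement `Debug`.
impl<T: Debug> PeerList<T> {
    pub fn describe_primary(&self) -> String {
        format!("{}: {:?}", self.primary, self.primary())
    }
}
```

Callers with a non-`Debug` value type can still use `new` and `primary`; only `describe_primary` is unavailable to them.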

Of course, if there is actual (type) safety to be gained by having a bound on T (that I am at the moment not seeing, but feel free to expand on that), I'd be happy to revisit that decision in that case.

xla

Thought you might approach it like that and it's a common way esp. for people coming from Haskell. It's a sound way of building data structures and I'm merely challenging from the perspective of domain coherence, as we are not building general container data structures but rather specific ones, i.e. the peer list of what the light client understands as a peer.

In any case my comments are just food for thought. All my review comments are sufficiently addressed.

➿ 🐬 🎈 🗝

romac · 2020-07-01T17:26:08Z

> As we are not building general container data structures but rather specific ones, i.e. the peer list of what the light client understands as a peer.

I agree, and that's why the peer list was initially specialized to Instance, but in this case having it be generic over the value associated with peers helped a lot to keep the tests concise as well, without e.g. having to introduce an Instance trait.

I still wonder what you had in mind w.r.t. safety, because I don't see any constraints on T that would improve on the current implementation, i.e. PeerList<Instance> is functionally equivalent to PeerList<T> for any T. Would you mind expanding on what you had in mind, because I might very well be missing something?

liamsi · 2020-07-01T18:45:56Z

light-client/src/peer_list.rs

+        let _ = peer_list.replace_faulty_witness(d());
+        unreachable!();
+    }
+}


Thanks for adding tests 👍

liamsi · 2020-07-01T18:50:05Z

> I still wonder what you had in mind w.r.t. safety,

My understanding is that this isn't necessarily about safety per se, only that a completely generic T might be too general where an actual constraining trait could make sense. On the other hand, no particular behaviour (or capabilities) seems to be required of T, so it can be that generic.

liamsi

👏 LGTM 👍

xla · 2020-07-02T07:57:54Z

> My understanding is that this isn't necessarily about safety per se, only that a completely generic T might be too general where an actual constraining trait could make sense. On the other hand, no particular behaviour (or capabilities) seems to be required of T, so it can be that generic.

Spot on, sorry @romac - what I was getting at is not so much safety but communication of intent.

romac added 4 commits June 30, 2020 11:57

Simplify handling of faulty witnesses

0f84ef5

Mark primary as faulty when swapping it for a witness

cb81002

Remove new primary from witness list when swapping it in

30902bd

Enforce that the primary node is not part of any peer list

8826a03

romac added the light-client label (Issues/features which involve the light client) Jun 30, 2020

romac requested review from brapse and liamsi June 30, 2020 11:07

xla previously approved these changes Jun 30, 2020

romac marked this pull request as draft June 30, 2020 11:42

liamsi previously approved these changes Jun 30, 2020

OStevan mentioned this pull request Jun 30, 2020

Add list of spare and faulty nodes to PeerList #349

Merged

5 tasks

Fix PeerList::witnesses() to only return witnesses

e371271

romac dismissed stale reviews from liamsi and xla via e371271 June 30, 2020 12:39

romac marked this pull request as ready for review July 1, 2020 13:35

Merge branch 'master' into romain/simplify-witness-swap

769de5a

romac requested review from xla and liamsi July 1, 2020 13:37

romac changed the title ~~Simplify swap of primary/witness~~ Rework PeerList, improve its API, and fix a bug in swap_primary Jul 1, 2020

romac mentioned this pull request Jul 1, 2020

Multi-peer conformance tests #371

Merged

5 tasks

xla suggested changes Jul 1, 2020

xla approved these changes Jul 1, 2020

liamsi reviewed Jul 1, 2020

liamsi approved these changes Jul 1, 2020

brapse merged commit 9e6c446 into master Jul 2, 2020

brapse deleted the romain/simplify-witness-swap branch July 2, 2020 07:50
