Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use libp2p peer events to track peer #1468

Merged
merged 7 commits into from
Aug 10, 2020
Merged

use libp2p peer events to track peer #1468

merged 7 commits into from
Aug 10, 2020

Conversation

arnetheduck
Copy link
Member

this resolves some peer counting issues that were happening because the
lifetime future in PeerInfo was unreliable (multiple PeerInfo instances
existed per peer)

In addition, this solves another race condition: when connecting to a
peer and later dialling that protocol, it is not certain that the same
connection will be used if there's a concurrent incoming peer connection
ongoing - better not make too many assumptions about who sent statuses
when.

@arnetheduck
Copy link
Member Author

depends on vacp2p/nim-libp2p#320

if result == nil:
# TODO: We should register this peer in the pool!
result = Peer.init(node, PeerInfo.init(peerId))
node.peers.withValue(peerId, peer) do:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no benefit of using withValue here. It will only lead to slightly increased code size because it's a template.

A more efficient implementation of the operation needed here is possible right now, but it's slightly awkward:

template mgetOrPutLazy*[A, B](t: Table[A, B], key: A, val: B): var B =
  type R = B

  proc setter(loc: var R): var R =
    if loc == default(R):
      loc = val
    loc

  setter(mgetOrPut(t, key, default(R)))

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Relevant PR in Stew:
status-im/nim-stew#52

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm mostly interested in have one pattern to use across the board without having to think too much about nuance - withValue comes close to offering a particular set of properties I'm often interested in:

  • no exceptions - it's a table, it's expected that the value is missing sometimes, not exceptional - I don't want to pay a hefty exception tax (specially since nim is mostly exception-unsafe)
  • doesn't require double lookups
  • doesn't require a valid value up-front if it's missing from table
  • maintains these tradeoffs for different types

mgetorputlazy looks pretty complex - there's a closure function, a template, defaults etc - doesn't really look like it'll be that much less bloat.

I dislike withValue for several reasons:

  • requires var - no read-only version
  • is indeed a template - though if it was well made, it wouldn't incur any overhead over an optimal version - I guess it's not well made, judging from your comment
  • introduces a pointer, which is annoying / breaks some safety features such as escapes

mgetorputlazy doesn't seem to hit the sweet spot of a single tool with which I can forget about the other tools (I really don't want to use a different tool just because here I happen to be dealing with a ref whose nil value is free and therefore yadayada - I'm too old to remember so many details)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, one more thing about withValue:

  • the template doesn't have return value - thus it can't be used in expressions let x = y.withValue(xx, yy): do: yy[].field; do: ..

@@ -537,13 +541,10 @@ template send*[M](r: SingleChunkResponse[M], val: auto): untyped =
doAssert UntypedResponse(r).writtenChunks == 0
sendResponseChunkObj(UntypedResponse(r), val)

proc performProtocolHandshakes*(peer: Peer) {.async.} =
var subProtocolsHandshakes = newSeqOfCap[Future[void]](allProtocols.len)
proc performProtocolHandshakes*(peer: Peer, incoming: bool) {.async.} =
for protocol in allProtocols:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code here was prepared for a future where more protocols are added. If you deem this unnecessary, you need to also remove the for loop above.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it still supports more than one, just runs the initializers serially in case one fails, which is easier to reason about

Copy link
Member Author

@arnetheduck arnetheduck Aug 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

basically, when libp2p perform operations, it mutates the state of any connection object they use (for example closing it or setting fields to different values - if they are run in concurrently, there's no locking meaning the state might change in the middle of an operation, causing async races which few understand - it's like threading, without the parallelism.

# makes the incoming flag unreliable / obsolete by the time we get to
# this point - instead of making assumptions, we'll just send a status
# message redundantly.
let
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, due of a rare race condition we choose to always violate the spec?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not so sure about rare, I've been encountering and reporting it for months now, and it causes peer pool miscounts - I'm not sure I understand your position - are you suggesting we should occasionally allow this to happen?

the spec mandates that the initiating connection sends a status message but does not place limits on the other end - it's free to do as it likes and send as many status messages as it wants (in theory), including this one.

Copy link
Contributor

@zah zah Aug 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, let's merge this now as it fixes a known issue, but I think we can do better in the long term, so some TODO comments and future directions will be appreciated.

this resolves some peer counting issues that were happening because the
lifetime future in PeerInfo was unreliable (multiple PeerInfo instances
existed per peer)

In addition, this solves another race condition: when connecting to a
peer and later dialling that protocol, it is not certain that the same
connection will be used if there's a concurrent incoming peer connection
ongoing - better not make too many assumptions about who sent statuses
when.
@arnetheduck arnetheduck merged commit 936440f into devel Aug 10, 2020
@arnetheduck arnetheduck deleted the events branch August 10, 2020 10:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants