Decentralized Signalling #618
Conversation
I'm thinking that whatever function does the connection, let's call it If the DC succeeds, it needs to cancel the signalling operation. The background/concurrent signalling operation needs to try and find an appropriate signaller. If there is an appropriate signaller, we just start using them straight away. If there isn't an appropriate signaller, and the direct connection didn't work, we can now try to find another node that is close to the target node and try the same thing with that node as the target. This is how we walk backwards (backtrack) from the target node towards a close-enough node to connect to first, and then forward-track from there towards the target node. You'd backtrack to a node that is close to the target, but also closer to your own node. However to even call
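To make the intended control flow concrete, here is a minimal TypeScript sketch of just the race-and-cancel part described above (not the backtracking walk). The `connectDirectly` and `connectViaSignaller` parameters are hypothetical stand-ins, not Polykey's actual API.

```ts
type Connection = { close(): Promise<void> };
type ConnectFn = (signal: AbortSignal) => Promise<Connection>;

/**
 * Race a direct connection attempt against a signalled (hole-punched) attempt.
 * Whichever succeeds first aborts the other; `Promise.any` only rejects if
 * both attempts fail, which is what would then trigger the backtracking step.
 */
async function raceConnection(
  connectDirectly: ConnectFn, // hypothetical direct-connection attempt
  connectViaSignaller: ConnectFn, // hypothetical signalled-connection attempt
): Promise<Connection> {
  const abortDirect = new AbortController();
  const abortSignalled = new AbortController();
  const direct = connectDirectly(abortDirect.signal).then((conn) => {
    // The DC succeeded, so cancel the signalling operation.
    abortSignalled.abort();
    return conn;
  });
  const signalled = connectViaSignaller(abortSignalled.signal).then((conn) => {
    // The signalled connection won, so cancel the direct attempt.
    abortDirect.abort();
    return conn;
  });
  return Promise.any([direct, signalled]);
}
```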
It seems You just have to use The
I've replaced
In reviewing If we add extra data to the
There are some suggestions that can help here:
Force-pushed from 34b15c5 to 81e9034.
A quick and dirty solution that @tegefaulkes suggested is for us to only maintain connections to our closest seed node. That way, we continue to be centralized, but it will allow us to scale the seed nodes for MatrixAI/Polykey-CLI#40.
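For illustration, here is a rough sketch of what "closest seed node" selection could look like, assuming node IDs are compared with the usual Kademlia-style XOR metric; the helper names are made up and are not Polykey's actual API.

```ts
// Keep a connection only to the seed node whose NodeId is XOR-closest to our
// own. NodeIds are treated as byte arrays; assumes equal-length IDs and at
// least one seed node.
type NodeId = Uint8Array;

function xorDistance(a: NodeId, b: NodeId): bigint {
  let distance = 0n;
  for (let i = 0; i < a.length; i++) {
    distance = (distance << 8n) | BigInt(a[i] ^ b[i]);
  }
  return distance;
}

function closestSeedNode(ownNodeId: NodeId, seedNodeIds: Array<NodeId>): NodeId {
  return seedNodeIds.reduce((closest, candidate) =>
    xorDistance(ownNodeId, candidate) < xorDistance(ownNodeId, closest)
      ? candidate
      : closest,
  );
}
```

The node would then only maintain a long-lived connection to the seed node returned here, which keeps the seed-node load predictable while still allowing the seed pool itself to scale.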
I noticed that
Now if It doesn't make sense to have that setting be in
But For In It is needed here because during remapping, there could be more nodes added to a bucket than the required bucket limit allows. In that case,
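The fragment above is about buckets exceeding their limit during remapping. As a hedged illustration only, here is a minimal sketch of a bucket-limit check; the `nodeBucketLimit` and `ErrorNodeGraphBucketLimit` names come from this PR's task list, but the class shape is assumed and is not Polykey's actual `NodeGraph`.

```ts
// Minimal sketch of a bucket-limit check with an in-memory bucket.
class ErrorNodeGraphBucketLimit extends Error {}

class Bucket<NodeId, NodeData> {
  protected entries = new Map<NodeId, NodeData>();

  constructor(protected nodeBucketLimit: number = 20) {}

  public setNode(nodeId: NodeId, data: NodeData): void {
    // Updating an existing entry never grows the bucket, so only reject
    // genuinely new nodes once the limit is reached.
    if (!this.entries.has(nodeId) && this.entries.size >= this.nodeBucketLimit) {
      throw new ErrorNodeGraphBucketLimit(
        `bucket is full (limit ${this.nodeBucketLimit})`,
      );
    }
    this.entries.set(nodeId, data);
  }
}
```

During a remap the caller could instead catch this error and evict the oldest entries rather than throwing, which seems to be the trade-off the truncated comment is weighing.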
There's even an exception for this already
I'm changing it to
In terms of bucket refresh, I think: That is, buckets are refreshed if no lookups have occurred. This condition cannot be dependent on the This is because This doesn't tell us anything about whether a bucket has had a lookup occur in the last hour. I haven't got around to checking
A proper condition on this would have to be maintained by the We could even make it completely in-memory data, rather than storing it in the DB. If we persist this information, this can go into the However if we do this, then Thoughts? @tegefaulkes
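As a sketch of the in-memory option floated here: keep a per-bucket record of the last lookup time and refresh only buckets that have gone quiet. Only the "refresh a bucket after an hour without lookups" rule comes from the discussion above; the names and shape are illustrative.

```ts
// In-memory tracking of when each bucket last had a lookup.
const refreshBucketDelay = 60 * 60 * 1000; // 1 hour in milliseconds

class BucketRefreshTracker {
  // bucketIndex -> timestamp (ms) of the last lookup that touched the bucket
  protected lastLookup: Map<number, number> = new Map();

  public recordLookup(bucketIndex: number, now: number = Date.now()): void {
    this.lastLookup.set(bucketIndex, now);
  }

  public bucketsNeedingRefresh(
    bucketCount: number,
    now: number = Date.now(),
  ): Array<number> {
    const stale: Array<number> = [];
    for (let i = 0; i < bucketCount; i++) {
      const last = this.lastLookup.get(i);
      // Buckets that have never seen a lookup are also due for a refresh.
      if (last === undefined || now - last >= refreshBucketDelay) {
        stale.push(i);
      }
    }
    return stale;
  }
}
```

Since this state is rebuilt on restart, persisting it alongside the rest of the node state would only matter if refresh scheduling needs to survive restarts, which is exactly the trade-off raised above.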
Looking at
Introduction of
It's possible they were not respecting the bucket limit.
Ok I can see a sleep being used in Furthermore, the exact time we update our So
Replacing
@amydevs the introduction of
I think based on this, even unsuccessful node lookups are valid bucket operations, in which case the bucket refresh delay should be refreshed. Refreshing buckets is already occurring in memory.
Force-pushed from 8a4b405 to 62c578d.
Yes, this is meant to be done by Step 2 of #537, but #584 only completed Step 1. This was done for the sake of time, as it was deemed that expanding the NodeGraph required more consideration and many changes.
@CMCDragonkai where we're using
I squashed as much as I could. I left these as
The second WIP commit could be dropped. It adds scaffolding.
Generally my NodeGraph changes should be put together. Make sure no spurious comments are left there; I had a bunch of stuff relating to
- small name and commentary updates to `NodeGraph` [ci skip]
- This includes creating a `connectionsQueue` utility class for coordinating rate limits and shared queues between the direct connection loop and the signalled connection loop. The two loops run concurrently while sharing found data between each other. When the connection is found, any pending connections are cancelled and awaited for clean up (see the sketch after this commit list). [ci skip]
- …Node` loop. Previously it used the timeout provided for the whole `findNode` operation. This means that a failing connection would take up the whole timeout. [ci skip]
- closest nodes timeout after 2 hours, furthest after 1 min. [ci skip]
- wip: fixing delay names [ci skip]
- Also added some basic multi-connection logic to `NodeConnectionManager` [ci skip]
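The `connectionsQueue` commit above describes coordinating rate limits and a shared queue between the direct connection loop and the signalled connection loop. Below is a stripped-down sketch of that idea under assumed semantics; it is not the PR's actual `connectionsQueue` implementation.

```ts
// A shared, rate-limited work queue that two concurrent loops can feed and
// drain. Either loop can `push` newly discovered targets for the other to
// consume, while `concurrencyLimit` caps in-flight connection attempts.
class ConnectionsQueue<T> {
  protected queue: Array<T> = [];
  protected active = 0;
  protected waiters: Array<() => void> = [];

  constructor(protected concurrencyLimit: number = 3) {}

  /** Add a discovered target; wakes one waiting consumer if any. */
  public push(item: T): void {
    this.queue.push(item);
    this.waiters.shift()?.();
  }

  /** Wait until an item and a rate-limit slot are both available. */
  public async pop(): Promise<T> {
    while (this.queue.length === 0 || this.active >= this.concurrencyLimit) {
      await new Promise<void>((resolve) => this.waiters.push(resolve));
    }
    this.active++;
    return this.queue.shift()!;
  }

  /** Release the rate-limit slot once a connection attempt settles. */
  public done(): void {
    this.active--;
    this.waiters.shift()?.();
  }
}
```

Once the wanted connection is established, the loops simply stop popping and abort their in-flight attempts, which matches the "pending connections are cancelled and awaited for clean up" behaviour described in the commit.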
Force-pushed from 4b2e2fb to 6eb23ac.
Force-pushed from 6eb23ac to 37ef800.
Description
This introduces decentralized signalling which allows the PK nodes to:
Issues Fixed
- `MDNS` integration and `NodeGraph` structure expansion #537 - especially because the MDNS nodes aren't part of the node graph atm - so they represent a different source of nodes, so ICE connections may need to consider these separately

Tasks
- `NodeGraph.getOldestNode` is replaced with `NodeGraph.getBucket` with additional optional `limit` parameter - update all tests
- `NodeGraph.setNode` now respects the `NodeGraph.nodeBucketLimit` and will throw `ErrorNodeGraphBucketLimit` if it goes over the limit when a new node is being set.
- `NodeGraph.nodeBucketLimit` is no longer hardcoded, it can be set during creation/construction.
- `NodeGraph` as these shouldn't be needed after behaviour is verified.
- `lastUpdated` parameter to `NodeGraph.setNode`, allowing one to pass down last updated time from a higher context, and the ability to get rid of sleeps by manually setting the time. It defaults to `utils.getUnixtime()`.
- `utils.sleep` usage in `NodeGraph.test.ts` and now all tests execute much faster.
- `dataDir` from `jest.config.js` which was being used by `NodeManager.test.ts` but this is vestigial code, since we are supposed to always be creating `dataDir` for each test module.
- `NodeManager` task handler IDs do not require hardcoded names, the function names do not disappear. Also when using esbuild make sure to use the `keepNames` option as true - https://esbuild.github.io/api/#keep-names (we have done this already in PK CLI)
- `NodeManager.test.ts` and ensured that you can `NodeManager.start` and `NodeManager.stop` immediately afterwards, so this means stop always cancels all node manager tasks.
- `NodeGraph` will now support multiple addresses - this provides new API to specifically deal with "contacts"
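Regarding the `keepNames` note in the task list above, here is a small example of an esbuild build script with `keepNames` enabled so that task handler IDs derived from function names survive minification. The entry point and output path are placeholders; only the `keepNames: true` option is the point being illustrated.

```ts
// build.mts - example esbuild script (ESM, so top-level await is allowed).
import * as esbuild from 'esbuild';

await esbuild.build({
  entryPoints: ['src/index.ts'], // placeholder entry point
  bundle: true,
  minify: true,
  keepNames: true, // preserve function/class names so name-derived IDs survive
  platform: 'node',
  outfile: 'dist/index.js', // placeholder output
});
```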
Remaining tasks:

- `NodeManager` `findNode` to take a `pingTimeoutTime` for setting smaller timeouts for each connection attempt
- `sticky` connection logic. This is a combo of deciding which nodes to keep long-running connections to. And to attempt re-connecting when a long-running connection drops. I need to work out some details for this. Potentially scale timeouts by the closeness metric.
- `MDNS` functionality.
- 5. Implement periodic updating the `connectedTime` in the `NodeGraph` for active nodes - deferred till later, will create an issue

Final checklist
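Relating to the `sticky` connection task above ("potentially scale timeouts by the closeness metric") and the earlier commit noting that the closest nodes time out after 2 hours and the furthest after 1 min, here is a rough sketch of closeness-scaled connection timeouts. The linear interpolation over the bucket index and the 256-bucket count are assumptions for illustration only.

```ts
// Scale connection keep-alive timeouts by closeness, interpolating between
// the endpoints mentioned in the commit history.
const closestTimeoutMs = 2 * 60 * 60 * 1000; // 2 hours for the closest bucket
const furthestTimeoutMs = 60 * 1000; // 1 minute for the furthest bucket
const bucketCount = 256; // assumed: one bucket per bit of the NodeId space

function connectionTimeout(bucketIndex: number): number {
  // Bucket 0 holds the closest nodes, bucket `bucketCount - 1` the furthest.
  const t = bucketIndex / (bucketCount - 1);
  return Math.round(
    closestTimeoutMs + (furthestTimeoutMs - closestTimeoutMs) * t,
  );
}
```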