-
Notifications
You must be signed in to change notification settings - Fork 636
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(network): Verify edge on separate thread #3145
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mfornet is this enough to address the issue or do we also need to fix the schedule that recomputes the graph?
@@ -67,7 +67,7 @@ async def main(): | |||
assert response.Handshake.target_peer_id.data == bytes( | |||
my_key_pair_nacl.verify_key) | |||
assert response.Handshake.listen_port == nodes[0].addr()[1] | |||
assert response.Handshake.version == handshake.Handshake.version | |||
# assert response.Handshake.version == handshake.Handshake.version |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually that line shouldn't be removed, but it will fail until #3016 is merged. Will add a todo
@@ -0,0 +1,115 @@ | |||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add this test to nightly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not yet, since it doesn't have a failure/successful condition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was using this test for debugging, the condition will be probably based on logs from delay-detector, will add in next PR, since we need too push this ASAP
I'm also fixing the schedule that recompute the graph, but this PR fixes our current issue. Routing table update is very cheap even with ~6K edges, it takes less than 100ms. |
@mfornet looks like it doesn't compile https://buildkite.com/nearprotocol/nearcore/builds/3027#71fc28f4-5299-4298-a4e3-be056abdc5f1 |
4045273
to
33b2269
Compare
Codecov Report
@@ Coverage Diff @@
## master #3145 +/- ##
=======================================
Coverage 87.34% 87.34%
=======================================
Files 212 212
Lines 41525 41525
=======================================
Hits 36270 36270
Misses 5255 5255 Continue to review full report at Codecov.
|
* fix(network): Verify edge on separate thread * Address comments from bowen * Fix compile error
* fix(network): Verify edge on separate thread * Address comments from bowen * Fix compile error
Sync message with large number of edges was blocking the PeerManagerActor. This was because verifying signature for each edge is a very expensive computation, for a message with 2K edges it takes ~6seconds. This computation was moved into a separate thread so the PeerManagerActor is not blocked.
TestPlan
Compile
neard
withdelay_detector_enabled
and runsaturate_routing_table.py
. Check that no messages block the PeerManagerActor for more than 50ms.