Rework broadcast logic #2741

roman-khimov · 2022-10-11T15:37:45Z

We have a number of queues for different purposes:
 * regular broadcast queue
 * direct p2p queue
 * high-priority queue

And two basic egress scenarios:
 * direct p2p messages (replies to requests in Server's handle* methods)
 * broadcasted messages

Low-priority broadcasted messages:
 * transaction inventories
 * block inventories
 * notary inventories
 * non-consensus extensibles

High-priority broadcasted messages:
 * consensus extensibles
 * getdata transaction requests from consensus process
 * getaddr requests

P2P messages are a bit more complicated: most of the time they use the p2p
queue, but extensible message requests/replies use the HP (high-priority) queue.

Server's handle* code is run from Peer's handleIncoming; every peer has such a
thread that handles incoming messages. When working with a peer it's
important to reply to its requests, and blocking this thread until we send
(queue) a reply is fine: if the peer is slow, we just won't get anything new
from it. The queue used is irrelevant wrt this issue.

Broadcasted messages are radically different: we want them to be delivered to
many peers, but we don't care which ones. If a message is delivered to 2/3 of
the peers we're fine, and if it's delivered to more of them, that's not an
issue either. But doing this fairly is not easy. The current code first tries
non-blocking sends, and if that doesn't yield enough deliveries it falls back
to blocking sends (with a timeout, since we can't wait indefinitely). It does
so sequentially, though: once a peer is chosen, the code waits for it (and
only it) until the timeout happens.

What can be done instead is an attempt to push the message to all of the peers
simultaneously (or close to that). If they all deliver, OK. If some of them
block, we can wait until _any_ of them pushes the message through (or until a
global timeout happens, we still can't wait forever). Once we have enough
deliveries we can cancel the pending sends, and it's again not an error if
these canceled threads still do their job.

This makes the system more dynamic and adds substantial processing overhead,
but this is networking code, and any such overhead is much lower than the
actual packet delivery time. It also spreads the load more fairly: if there
is any spare queue, it'll get the packet and release the broadcaster. On the
next broadcast iteration another peer is more likely to be chosen just because
it didn't get a message previously (and had some time to deliver already
queued messages).

It works perfectly in tests: under optimal networking conditions we have much
better block times, and TPS increases by 5-25% depending on the scenario.

I'd go as far as to say that it fixes the original problem of #2678, because
in that particular scenario the queues are empty in ~100% of the cases and
the new logic will likely lead to 100% fan-out (cancelation just won't happen
fast enough). But when the load grows and there is some waiting in the
queues, it will optimize out the slowest links.



codecov · 2022-10-11T15:55:07Z

Codecov Report

Merging #2741 (8b26d94) into master (0294e2e) will decrease coverage by 0.07%.
The diff coverage is 46.15%.

❗ Current head 8b26d94 differs from pull request most recent head e1d5f18. Consider uploading reports for the commit e1d5f18 to get more accurate results

@@            Coverage Diff             @@
##           master    #2741      +/-   ##
==========================================
- Coverage   85.41%   85.33%   -0.08%     
==========================================
  Files         324      324              
  Lines       40071    40064       -7     
==========================================
- Hits        34226    34188      -38     
- Misses       4494     4525      +31     
  Partials     1351     1351

Impacted Files	Coverage Δ
pkg/network/tcp_peer.go	`27.96% <20.00%> (-0.80%)`	⬇️
pkg/network/server.go	`73.05% <70.37%> (+0.06%)`	⬆️
pkg/network/payload/mptdata.go	`81.25% <0.00%> (-18.75%)`	⬇️
pkg/services/oracle/oracle.go	`72.99% <0.00%> (-14.60%)`	⬇️
pkg/services/oracle/request.go	`58.18% <0.00%> (-5.00%)`	⬇️
pkg/core/transaction/transaction.go	`85.18% <0.00%> (-1.49%)`	⬇️
pkg/consensus/consensus.go	`73.58% <0.00%> (+0.43%)`	⬆️
pkg/network/message_string.go	`83.87% <0.00%> (+12.90%)`	⬆️


AnnaShaleva

Nice optimisation!


roman-khimov added the network P2P layer label Oct 11, 2022

roman-khimov added this to the v0.99.5 milestone Oct 11, 2022

roman-khimov requested review from fyrchik and AnnaShaleva October 11, 2022 15:37

roman-khimov added 2 commits October 11, 2022 18:42

network: speculatively set GetAddrSent status

8b26d94

Otherwise we routinely get "unexpected addr received" error.

roman-khimov force-pushed the separate-broadcast-queue-handling branch from 03700f6 to 8b26d94 Compare October 11, 2022 15:42

AnnaShaleva reviewed Oct 12, 2022

pkg/network/peer.go (outdated, resolved)

pkg/network/peer.go (resolved)

pkg/network/peer.go (outdated, resolved)

pkg/network/server.go (resolved)

network: fix outdated Peer interface comments

e1d5f18

roman-khimov requested a review from AnnaShaleva October 12, 2022 07:16

AnnaShaleva approved these changes Oct 12, 2022


fyrchik approved these changes Oct 12, 2022


roman-khimov merged commit ec4983e into master Oct 12, 2022

roman-khimov deleted the separate-broadcast-queue-handling branch October 12, 2022 09:33

roman-khimov mentioned this pull request Oct 12, 2022

Configure message send logic in the network server #2678

Closed

roman-khimov mentioned this pull request Oct 21, 2022

I/O timeouts during bench test #2744

Closed
