-
Notifications
You must be signed in to change notification settings - Fork 296
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
server: Support concurrent getdata requests. #3203
server: Support concurrent getdata requests. #3203
Conversation
a501d6a
to
40c8ddc
Compare
numPendingGetDataReqs := len(sp.getDataQueue) | ||
if numPendingGetDataReqs+1 > maxConcurrentGetDataReqs { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My only note is that strictly speaking, this is racy because it could be the case that the check fails just as an item is popped off getDataQueue
, but the affordance on the buffer sizes are large enough that this shouldn't be an issue.
Also, this function (OnGetData
) is still on the same goroutine as the peer's inHandler
, so the len check and the channel send below, while not strictly atomic, are "directionally" safe (which wouldn't necessarily be true, so that's why I'm mentioning for the record).
The final code for handling an individual getData request seems simpler now as well (as opposed to those pushXX
funcs).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct.
I originally was using a separate queue with a mutex and sync.Cond
to signal updates where the mutex protected the length checks and addition of items to the queue itself, but I decided to go with this channel-based approach instead because the code is much easier to read and, as you mentioned, it won't be an issue because I intentionally set the buffer sizes large enough. They way far exceed anything a peer not trying to do malicious things will use in practice.
Moreover, in practice, the only way the limits would be hit without the peer being banned by the logic prior to this is if the peer is intentionally misbehaving and should be disconnected in that case anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the race could be avoided with a non-blocking write to the channel. if this falls into the default case, then the queue was full.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair point since it's buffered. I was originally worried about it blocking while it was busy waiting on the semaphore to send, but due to the buffer, that would only happen when the queue is full.
That said, the same situation would still apply to the overall number of items though, so I'm not sure it's worth changing it given I've already heavily tested the current logic.
40c8ddc
to
c975a84
Compare
Currently, each peer only supports one getdata request at a time and the inbound handler is blocked until all of the requested data is served. This means messages that would otherwise be fast to process, such as pings, can potentially be delayed for long periods of time which is clearly not ideal. Moreover, the existing behavior means that it is theoretically possible for a pair of peers to experience a situation where neither one can make progress because they're both waiting on each other to serve each other data which then would ultimately result in hitting idle timeouts and disconnection. In practice, the aforementioned live lock situation basically never happens currently due to a variety of other factors such as duplicate request filtering and not notifying peers about inventory they are known to have. However, the combination of factors that prevent the situation will not necessarily apply to newer data types such as those for mesh-based mixing. The aforementioned blocking behavior is a holdover from very early network code that relied on it once upon a time, but is no longer relevant. Thus, to improve overall throughput and address the root of the aforementioned concerns, this changes the semantics to serve getdata requests asynchronously so other inbound messages can be processed concurrently. This means the remote peer is then free to send other messages while waiting for the requested data to be served. Next, limits to the amount of concurrent getdata requests and the total maximum number of pending individual data item requests are now imposed in order to prevent peers from be able to consume copious amounts of memory and other malicious behavior this change would otherwise allow. The limits are enforced such that peers may now mix and match simultaneous getdata requests for varying amounts of data items so long as they do not exceed the maximum number of allowed simultaneous pending getdata messages or the maximum number of total overall pending individual data item requests. This approach of applying the limits on both dimensions offers more flexibility and the potential for future efficiency gains while still keeping memory usage bounded to reasonable limits and protecting against other malicious behavior.
c975a84
to
3aaaf3f
Compare
This requires #3201 and is rebased on #3202.
Currently, each peer only supports one
getdata
request at a time and the inbound handler is blocked until all of the requested data is served. This means messages that would otherwise be fast to process, such as pings, can potentially be delayed for long periods of time which is clearly not ideal.Moreover, the existing behavior means that it is theoretically possible for a pair of peers to experience a situation where neither one can make progress because they're both waiting on each other to serve each other data which then would ultimately result in hitting idle timeouts and disconnection.
In practice, the aforementioned live lock situation basically never happens currently due to a variety of other factors such as duplicate request filtering and not notifying peers about inventory they are known to have. However, the combination of factors that prevent the situation will not necessarily apply to newer data types such as those for mesh-based mixing.
The aforementioned blocking behavior is a holdover from very early network code that relied on it once upon a time, but is no longer relevant.
Thus, to improve overall throughput and address the root of the aforementioned concerns, this changes the semantics to serve
getdata
requests asynchronously so other inbound messages can be processed concurrently. This means the remote peer is then free to send other messages while waiting for the requested data to be served.Next, limits to the amount of concurrent
getdata
requests and the total maximum number of pending individual data item requests are now imposed in order to prevent peers from be able to consume copious amounts of memory and other malicious behavior this change would otherwise allow.The limits are enforced such that peers may now mix and match simultaneous
getdata
requests for varying amounts of data items so long as they do not exceed the maximum number of allowed simultaneous pendinggetdata
messages or the maximum number of total overall pending individual data item requests.This approach of applying the limits on both dimensions offers more flexibility and the potential for future efficiency gains while still keeping memory usage bounded to reasonable limits and protecting against other malicious behavior.