Skip to content

Commit

Permalink
RNTBDChannelPool design notes (#15315)
Browse files Browse the repository at this point in the history
* First draft of notes

* Addressing the comments.

* Some more

* Some more freshed of the notes.

* Refreshing it with comments
  • Loading branch information
kirankumarkolli authored Sep 21, 2020
1 parent 367cf83 commit 7faea58
Showing 1 changed file with 57 additions and 5 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -58,6 +58,58 @@

/**
* A {@link ChannelPool} implementation that enforces a maximum number of concurrent direct TCP Cosmos connections.
*
* RntbdClientChannelPool: Actors
* - acquire (RntbdServiceEndpoint): acquire a channel to use
* - release (RntbdServiceEndpoint): channel usage is complete and returning it back to pool
* - Channel.closeChannel() Future: Event handling notifying the channel termination to refresh bookkeeping
* - acquisitionTimeoutTimer: channel acquisition time-out handler
* - monitoring (through RntbdServiceEndpoint): get monitoring metrics
*
* Behaviors/Expectations:
* - Bounds:
* - max requests in-flight per channelPool: MAX_CHANNELS_PER_ENDPOINT * MAX_REQUESTS_ENDPOINT (NOT A GUARANTEE)
* - AvailableChannels.size() + AcquiredChannels.size() + (connections in connecting state, i.e., connecting.get()) <= MAX_CHANNELS_PER_ENDPOINT
* - PendingAcquisition queue default-size: Max(10_000, MAX_CHANNELS_PER_ENDPOINT * MAX_REQUESTS_ENDPOINT)
* - ChannelPool executor included event-loop task: MAX_CHANNELS_PER_ENDPOINT * MAX_REQUESTS_ENDPOINT + newInFlightAcquisitions (not yet in pendingAcquisitionQueue)
* - newInFlightAcquisitions: is expected to very very short. Hard-bound to ADMINSSON_CONTROL (upstream in RntbdServiceEndpoint)
* - NewChannel vs ReUseChannel:
* - NewChannels are serially created (reasonable current state, possible future change, upstream please DON'T TAKE any dependency)
* - Will re-use an existing channel when possible (with MAX_REQUESTS_ENDPOINT attempt not GUARANTEED)
* - Channel usage fairness: fairness is attempted but not guaranteed
* - When loadFactor is > 90%, fairness is attempted by selecting Channel with less concurrency
* - Otherwise no guarantees on fairness per channel with-in bounds of MAX_REQUESTS_ENDPOINT. I.e. some channel might have high request concurrency compared to others
* - Channel serving guarantees:
* - Ordered delivery is not guaranteed (by-design)
* - Fairness is attempted but not a guarantee
* - [UNRELATED TO CHANNEL-POOL] [CURRENT DESIGN]: RntbdServiceEndpoint.write releases Channel before its usage -> acquisition order and channel user order might differ.
* - AcquisitionTimeout: if not can't be served in an expected time, fails gracefully
* - Metrics: are approximations and might be in-consistent(by-design) as well
* - EventLoop
* - ChannelPool executor might be shared across ChannelPools or Channel
*
* Design Notes:
* - channelPool.eventLoop{@Link executor}: (executes on a single & same thread, serially)
* - Each channelPool gets an EventLoop (selection is round-robin)
* - Schedule only when it can be served immediately
* - Updates and reads that depend on "strong consistency" - like whether to create a new connection or not.
* - Updates to below data structures should be done only when inside eventLoop
* - {@Link acquiredChannels}
* - {@Link availableChannels}
* - AcquisitionTimeout handling:
* - A global single threaded scheduler
* - [***] Each channel independently schedules acquisitionTimeout handlers
* - touches {@Link pendingAcquisitions} might result in impacting the fairness
* - RntbdServiceEndpoint.write:
* - Promise<Channel> might AcquisitionTimeout
* - RntbdServiceEndpoint.writeWhenConnected
* - releaseToPool immediately -> unblocks next acquisition if-any
* - **Uses Channel even after release**, in channelEventLoop [Not a functional issue but to be noted]
* - Possible that acquisition order might differ the ChannelWrite order
* - MAX_REQUESTS_ENDPOINT: Truth managed by RntbdRequestManager in Channel.Pipeline
* - RequestManager only known when the Channel process them.
* - In-flight scheduled ones are unknown -> its a SOFT BOUND
*
*/
@JsonSerialize(using = RntbdClientChannelPool.JsonSerializer.class)
public final class RntbdClientChannelPool implements ChannelPool {
Expand Down Expand Up @@ -131,7 +183,7 @@ public final class RntbdClientChannelPool implements ChannelPool {
Comparator.comparingLong((task) -> task.originalPromise.getExpiryTimeInNanos()));

private final ScheduledFuture<?> pendingAcquisitionExpirationFuture;

/**
* Initializes a newly created {@link RntbdClientChannelPool} instance.
*
Expand Down Expand Up @@ -575,7 +627,7 @@ private void acquireChannel(final ChannelPromiseWithExpiryTime promise) {
}

// make sure to retrieve the actual channel count to avoid establishing more
// TCP connections than allowed.
// TCP connections than allowed.
final int channelCount = this.channels(false);

if (channelCount < this.maxChannels) {
Expand Down Expand Up @@ -1136,7 +1188,7 @@ private Channel pollChannel() {
return first; // because this.close -> this.close0 -> this.pollChannel
}

// Only return channels as servicable here if less than maxPendingRequests
// Only return channels as servicable here if less than maxPendingRequests
// are queued on them
if (this.isChannelServiceable(first)) {
return first;
Expand All @@ -1149,7 +1201,7 @@ private Channel pollChannel() {

if (next.isActive()) {

// Only return channels as servicable here if less than maxPendingRequests
// Only return channels as servicable here if less than maxPendingRequests
// are queued on them
if (this.isChannelServiceable(next)) {
return next;
Expand Down Expand Up @@ -1547,4 +1599,4 @@ public synchronized Throwable fillInStackTrace() {

// endregion

}
}

0 comments on commit 7faea58

Please sign in to comment.