kv: Streaming KV protocol #8360
I gave this a shot using two different approaches to test out how we might want to go about this in the future. Both approaches relied on server-side streaming in gRPC, using the following service description:

```proto
service Internal {
  rpc Batch (BatchRequest) returns (stream BatchResponse) {}
}
```

This made the server-side changes pretty straightforward, because each request was still run on its own goroutine with its own stack. This would not have been the case with client-side or bidirectional streaming, which would require an extra routing layer at the node/server/replica level.
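To make the shape of the server side concrete, here is a minimal sketch of what a handler for that service could look like. The `nodeServer` receiver and the `executeBatch` helper are hypothetical stand-ins for the node's existing request path, not the actual implementation:

```go
// Hypothetical server-side streaming handler for the Batch RPC above.
// Internal_BatchServer is the stream type gRPC would generate for a
// server-streaming method; nodeServer and executeBatch are stand-ins
// for the existing synchronous execution path.
func (n *nodeServer) Batch(ba *roachpb.BatchRequest, stream Internal_BatchServer) error {
	// Because this is server-side streaming, each request still runs on its
	// own goroutine with its own stack, so existing code can be reused.
	br, err := n.executeBatch(stream.Context(), ba)
	if err != nil {
		return err
	}
	// Intermediate responses (e.g. "proposed to raft") could be sent here
	// with additional stream.Send calls before the final response.
	return stream.Send(br)
}
```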
The changes to the client, and specifically to its `Sender` interface, were where the two approaches differed.

**Iterator model**

My first approach followed gRPC's lead and turned the `Sender` interface into an iterator-style pair of interfaces:

```go
type ResponseStream interface {
	Recv() (*BatchResponse, error)
}
```

and

```go
type Sender interface {
	Send(context.Context, roachpb.BatchRequest) ResponseStream
}
```
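For illustration, a caller under this model might look something like the following; treating `io.EOF` as the end-of-stream sentinel (as gRPC does) is an assumption, and `handleResponse` is a hypothetical callback:

```go
// Hypothetical consumption of a ResponseStream, mirroring gRPC's Recv loop.
stream := sender.Send(ctx, ba)
for {
	br, err := stream.Recv()
	if err == io.EOF {
		break // the server closed the stream; all responses received
	}
	if err != nil {
		return err
	}
	handleResponse(br) // e.g. evaluation result first, proposal result later
}
```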
Notably, gRPC's own generated streaming clients expose essentially this same `Recv`-based shape.

**Goroutine model**

My second approach was to use goroutines and channels in a more Go-like style. However, I still wanted to maintain a proper call stack, so I only used channels for responses and retained a `Send` method that returns a channel of responses:

```go
type Sender interface {
	Send(context.Context, roachpb.BatchRequest) <-chan *roachpb.BatchResponse
}
```
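A caller would then simply range over the returned channel; the assumption here is that the sender closes the channel after delivering the final response:

```go
// Hypothetical consumption of the response channel. The Send call itself
// keeps an ordinary call stack; only the responses arrive asynchronously.
for br := range sender.Send(ctx, ba) {
	handleResponse(br) // channel closed by the sender after the final response
}
```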
The nice part about this approach was that it required much less invasive code changes, except at the boundaries of …

**DistSender+BatchRequest Impedance Mismatch**

The biggest issue I ran into with either approach was in `DistSender` … With this in mind, it wasn't clear how …

This is the extent of my current progress pulling on this thread. I don't have loyalty to any one approach and would love to get input from others on ways to proceed.

Also, for context about why I'm interested in this: it's primarily related to transactional pipelining (this comment is probably most relevant). With some amount of response streaming, the current transactional pipelining proposal would not need to rely exclusively on …
It'd also do wonders for performance, given that gRPC scales significantly better for streaming interfaces than unary requests.
Yes, absolutely, although for these tests I was only using server-side streaming. My understanding is that the majority of the benefit of gRPC streaming is that it doesn't need to set up a new connection for each incoming request. With server-side streaming only (i.e. not bidirectional), I think we still fall back to a unique gRPC connection per request. That said, in the context of transactional pipelining, a single request/connection for a pair of (evaluation result, proposal result) responses is still a significant improvement over a separate request for each of the two responses.
I don't think that applies here. The raft transport benefits from using a long-lived stream because a single RPC call is equivalent to (open stream, send message, get response, close stream). I would not expect a stream per KV RPC to perform any better than the current model. (Moving to long-lived streams could give performance improvements, but then we'd have to do a lot of the bookkeeping that gRPC does to match up requests and responses ourselves.)
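To make that bookkeeping concrete, here is a rough sketch of the request/response matching a long-lived, multiplexed stream would force on us; all of the names here (`streamRequest`, `Internal_BatchStreamClient`, `multiplexedSender`) are invented for illustration:

```go
// Sketch of request/response matching over a long-lived stream. gRPC does the
// equivalent of this internally for every unary call; with our own multiplexed
// stream we would have to maintain the pending map ourselves.
type multiplexedSender struct {
	mu      sync.Mutex
	nextID  int64
	pending map[int64]chan *roachpb.BatchResponse
	stream  Internal_BatchStreamClient // hypothetical bidirectional stream
}

func (m *multiplexedSender) send(ctx context.Context, ba *roachpb.BatchRequest) (*roachpb.BatchResponse, error) {
	m.mu.Lock()
	m.nextID++
	id := m.nextID
	ch := make(chan *roachpb.BatchResponse, 1)
	m.pending[id] = ch
	m.mu.Unlock()

	// streamRequest is a hypothetical wrapper carrying the ID so a receive
	// loop reading from m.stream can route the response back to this caller.
	if err := m.stream.Send(&streamRequest{ID: id, Batch: ba}); err != nil {
		return nil, err
	}
	select {
	case br := <-ch:
		return br, nil
	case <-ctx.Done():
		return nil, ctx.Err()
	}
}
```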
I think this issue deserves to continue to exist because it has good notes. I believe the benefits of having multiple responses per RPC have only increased since txn pipelining.
To fix #7933, we are removing an abstraction-violating optimization that allowed us to detect when a command had been proposed to raft (and was therefore likely, but not guaranteed, to execute before any subsequently-proposed command). We could restore this optimization if we moved the kv `Batch` request to a streaming model and reported the status of the request as it proceeded through the system.

Reporting the status of a batch could also allow us to reduce time-related retries in `DistSender`, since an accepted batch indicates that we have sent the request to a node that is up and believes it is able to handle the request. Streaming `ScanResponse`s could also reduce the memory impact of large scans.

Jira issue: CRDB-6162
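As a rough illustration of the kind of status reporting this enables, a streaming response could carry a stage marker along these lines; the type and stage names are hypothetical and exist only to sketch the idea:

```go
// Hypothetical stages a streaming Batch could report as it moves through the
// system. None of these exist today; they only illustrate the idea.
type BatchStage int

const (
	// StageAccepted: a live node has accepted the request. Seeing this would
	// let DistSender suppress time-based retries.
	StageAccepted BatchStage = iota
	// StageProposed: the command has been proposed to raft, restoring the
	// ordering signal lost with the optimization removed for #7933.
	StageProposed
	// StageApplied: the command has applied; the final BatchResponse (or, for
	// large scans, a sequence of partial ScanResponses) follows.
	StageApplied
)
```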