ARROW-17407: [Doc][FlightRPC] Flight/gRPC best practices #13873
Conversation
@rok Can we also add best practices and debugging sections in the Go docs too? This can be done by either a README.md in the Would you prefer for me to add it to this branch?
Thanks for chiming in @zeroshade! I was going to ask you where to put the Go specific content :).
@rok Sounds great! Have a look at https://tip.golang.org/doc/comment also gives a good rundown of how the format of the comment gets translated into the formatting of the rendered docs on pkg.go.dev (TL;DR it's similar to just using markdown)
@zeroshade I've added the Go doc. Please check if it makes sense, it's my first time touching Go docs.
@lidavidm I'm now linking to c++ from python/java/R docs which is not ideal, but duplicating text is not either. Any thoughts?
IMO, I would think there should be a common area for the text since that is common across languages. Then, the c++ docs can include code snippets as appropriate which don't seem too relevant to the other languages. The "Arrow Flight RPC" section under "Specifications and Protocols" seems like a good place to me, since those descriptions seem to be more related to best practices of using the protocol overall.
The code snippet could potentially be moved to a cookbook type thing too?
What's wrong with linking to the C++ docs? The format/ page for Flight RPC is meant more for things all implementations agree on, I don't think these tips really go there
Also, linking to the C++ docs isn't applicable in all situations, they don't apply to Java for instance.
The same goes for Go. If we want to centralize things, we should think about where to do that. And then all the language docs can link there. (I think Sphinx has a 'tabs' extension we could use for the language-specific snippets too.)
I don't think there's anything wrong with it. I was just thinking through alternatives since Rok didn't seem to like it.
Maybe I misread that portion of the PR, but I think it references it at the moment? I don't know the details, so not sure if some of the descriptions need disentangling
The Java docs link to the C++ docs - but the C++ docs don't apply to Java. The bit about allocations doesn't apply to Java at all, for instance.
options that make the most sense to me are:
Maybe we should consider a separate tutorial section then? Even if they're language-specific? (Well, 'tutorials and guides' or something…) The Flight tutorials could probably be made to work with the same 'tab' structure so they can cover multiple languages at once (though that should be used sparingly)
Assuming this tab structure: Otherwise, I don't have strong opinions about "best practices" being in, or separate from, a tutorial section.
I didn't get to this today, should hopefully get to look at it this week
Thank you for starting this @rok! I've added a few suggestions.
docs/source/cpp/flight.rst
Outdated
When using the default gRPC transport, options can be passed to it via
:member:`arrow::flight::FlightClientOptions::generic_options`. For example:
This is less a best practice, more of a "howto" so I think this should go in the cookbook
docs/source/cpp/flight.rst
Outdated
This is what you are seeing now.
bounded thread pool -> Reject connections / requests when under load, and have
clients retry with backoff. This also gives an opportunity to retry with a
different node. Not everyone gets serviced but quality of service stays consistent.
we should add code to configure these modes
You mean like setting GRPC_ARG_MIN_RECONNECT_BACKOFF_MS to 200ms for bounded and 0ms for unbounded? Perhaps another setting I'm not aware of?
Added GRPC_ARG_MIN_RECONNECT_BACKOFF_MS example.
docs/source/cpp/flight.rst
Outdated
1. A stale connection can be closed using
:member:`arrow::flight::FlightClientOptions::stop_token`. This requires recording the
stop token at connection establishment time.
this needs code samples
(it would be great if all code samples linked to the cookbook)
Perhaps we could add samples for now and they would be moved out as cookbooks become available?
Also I don't see an issue if we duplicate atomic pieces of the cookbooks here.
I would much rather they be cookbook samples so that they actually get run in CI. It's probably good to still have a short snippet here, but I worry about those kinds of things bit-rotting without them actually getting exercised like the Cookbook does.
Yeah, agreed.
We can maybe merge this then add issues for moving each of these examples to the cookbook.
Looks pretty good to me on the Go side, though some of the comments aren't accurate or relevant to the Go library. I've made some suggestions.
go/arrow/flight/doc.go
Outdated
// https://issues.apache.org/jira/browse/ARROW-15764 proposes a caching
// optimisation for server side, but it was not yet implemented.
probably should mention that the caching optimization is specifically for C++ in that issue.
go/arrow/flight/doc.go
Outdated
// gRPC will spawn an unbounded number of threads for concurrent clients. Those
// threads are not necessarily cleaned up (cached thread pool in java parlance).
// glibc malloc clears some per thread state and the default tuning never clears
// caches in some workloads. But you can explicitly tell malloc to dump caches.
// See https://issues.apache.org/jira/browse/ARROW-16697 as an example.
This would actually be incorrect for the Go implementation since it uses goroutines which do not have a 1-to-1 relationship with OS threads. It wouldn't fit the "thread pool" description. We can probably remove this entire paragraph for the Go docs here.
Changed.
go/arrow/flight/doc.go
Outdated
// Closing unresponsive connections
//
// * A stale connection can be closed using CallOptions.stop_token. This requires
// recording the stop token at connection establishment time.
// * Use client timeout
// * There is a long standing ticket for a per-write/per-read timeout instead of a per
// call timeout (https://issues.apache.org/jira/browse/ARROW-6062), but this is not
// (easily) possible to implement with the blocking gRPC API. For now one can also do
// something like set up a background thread that calls cancel() on a timer and have
// the main thread reset the timer every time a write operation completes successfully
// (that means one needs to use to_batches() + write_batch and not write_table).
We should add some comment here about the ability to use the context.Context that is passed to requests to implement a timeout via https://pkg.go.dev/context#WithTimeout or the ability to cancel via https://pkg.go.dev/context#WithCancel
Added something, but I'm not sure I understand enough for it to make sense :).
Thanks for the review everyone! I've gone through your comments and hopefully used them productively.
I've addressed the feedback (hopefully all) and opened a cookbook issue (apache/arrow-cookbook#251).
Revision: f769248 Submitted crossbow builds: ursacomputing/crossbow @ actions-f012880c10
docs/source/cpp/flight.rst
Outdated
gRPC's default behavior allows one server to accept many connections from many
different clients, but if requests do a lot of work (as they may under Flight),
the server may not be able to keep up. Configuring the server to limit the number
of clients and reject requests more proactively, and configuring clients to retry
How do you configure the server to do this?
(IIRC, it's not possible without modifying Flight…)
I think this is referring to gRPC settings.
I would say either remove this text or have a code sample of how to set the server-side limit
docs/source/cpp/flight.rst
Outdated
--------------------------------

1. A stale connection can be closed using
:member:`arrow::flight::FlightClientOptions::stop_token`. This requires recording the
This is FlightCallOptions not ClientOptions
I think this is already FlightCallOptions, just the diff is off.
I still see ClientOptions here
Saw it here now too. Fixed.
Co-authored-by: David Li <[email protected]>
Thanks for the review @lidavidm ! I think I addressed everything and sorry for not being faster.
@github-actions crossbow submit preview-docs
Revision: 0fba44b Submitted crossbow builds: ursacomputing/crossbow @ actions-20a398e242
sorry, I'm fairly backed up, I'll try to get to this sometime this week
No hurry on my end @lidavidm! Sorry for the slow iterations.
Co-authored-by: David Li <[email protected]>
@lidavidm thanks for the review! I've merged your change proposals, let me know if I missed something.
Co-authored-by: Rok Mihevc <[email protected]>
Thanks!
I think in terms of followups:
@lidavidm shall I open cookbook github issues for those?
I think you did already but if not, yeah
Indeed: apache/arrow-cookbook#251
Benchmark runs are scheduled for baseline = e63a13a and contender = 2e72e0a. 2e72e0a is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
We want to provide best practices and debugging section in:
[cpp flight docs](https://arrow.apache.org/docs/cpp/flight.html)
[python flight docs](https://arrow.apache.org/docs/python/flight.html)
[java flight docs](https://arrow.apache.org/docs/java/flight.html)
[R flight docs](https://arrow.apache.org/docs/r/articles/flight.html)
Lead-authored-by: Rok <[email protected]>
Co-authored-by: Rok Mihevc <[email protected]>
Co-authored-by: David Li <[email protected]>
Signed-off-by: Rok Mihevc <[email protected]>