-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ARROW-14958: [C++][Python][FlightRPC] Implement Flight middleware for OpenTelemetry propagation #11920
Conversation
Example output, from the unit test: (The unit test will not normally print this, I just modified it to double-check the results.)
|
CC @cpcloud, this is "automatic" instrumentation for Flight/OpenTelemetry. AFAIK, this isn't generally possible in gRPC/C++. Server interceptors have no way to pass data from the interceptor to the RPC handler. (OpenCensus support is achieved by hardcoding it into the library.) Also, IIRC thread locals (i.e. the OTel Context) are not viable because the library makes no guarantee about whether RPC handlers are run on the same thread as interceptors or not. Flight works around this because the server interceptors don't use the gRPC interceptor framework; instead, the Flight RPC handlers hardcode calls to the interceptors before handing control to the Flight application. Hence, a Span started in a Flight interceptor will be active during the application's RPC handler. The OpenTelemetry/gRPC example just hardcodes a call to OpenTelemetry within the RPC handler and does not try to implement more general instrumentation. |
0b14ba8
to
c72a5c7
Compare
Quickly bump the version since it changes a few APIs we'll use (most notably for #11920). #11963 will also need updating, but the conda-forge packages need to be updated first. This does not include the fix needed for #12408, that will require another version bump. Closes #12516 from lidavidm/arrow-15789 Authored-by: David Li <[email protected]> Signed-off-by: David Li <[email protected]>
c72a5c7
to
c19fe58
Compare
c19fe58
to
9b23aa0
Compare
@lidavidm Does this PR need reviving or should it be closed? |
I'll clean this up at some point. It mostly needs a suitable reviewer. |
16818dc
to
38eec85
Compare
It seems this needs rebasing and fixing conflicts :-( |
a7c8757
to
dd6a25a
Compare
auto context = otel::context::RuntimeContext::GetCurrent(); | ||
auto propagator = | ||
otel::context::propagation::GlobalTextMapPropagator::GetGlobalPropagator(); | ||
propagator->Inject(carrier, context); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't this make a copy of carrier
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
||
void SendingHeaders(AddCallHeaders* outgoing_headers) override { | ||
for (const auto& pair : carrier_.context_) { | ||
outgoing_headers->AddHeader(pair.first, pair.second); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm... if OTel is adding arbitrary key-value pairs, should these really be propagated as HTTP headers, or am I missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's designed to be transported in headers, it's just that the API doesn't tell you what they are. They're well defined in the spec (formats listed here: https://github.com/open-telemetry/opentelemetry-specification/blob/e2c2472985b17e37a25a7dc5aa0aa071e6683c98/specification/context/api-propagators.md#propagators-distribution)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, can you add a comment about that?
cpp/src/arrow/flight/middleware.h
Outdated
@@ -61,6 +60,10 @@ enum class FlightMethod : char { | |||
DoExchange = 9, | |||
}; | |||
|
|||
/// \brief Get a human-readable name for a Flight method. | |||
ARROW_FLIGHT_EXPORT | |||
std::string FlightMethodToString(FlightMethod method); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We already use the ToString
convention in other places, so perhaps:
std::string FlightMethodToString(FlightMethod method); | |
std::string ToString(FlightMethod method); |
TraceKey() = default; | ||
TraceKey(std::string key, std::string value) | ||
: key(std::move(key)), value(std::move(value)) {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
C++ should synthesize these constructors automatically?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apparently this won't work until C++20 (https://stackoverflow.com/a/61205386/262727)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, but you could try items_->push_back({std::string(key), std::string(value)})
} else { | ||
span_->SetStatus(otel::trace::StatusCode::kOk, ""); | ||
span_->SetAttribute(OTEL_GET_TRACE_ATTR(AttrRpcGrpcStatusCode), int32_t(0)); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Newbie question, but will this automatically end the span?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'll end when the Span goes out of scope, but I'll just manually end it here to be explicit.
TraceKey() = default; | ||
TraceKey(std::string key, std::string value) | ||
: key(std::move(key)), value(std::move(value)) {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, but you could try items_->push_back({std::string(key), std::string(value)})
cpp/src/arrow/flight/types.h
Outdated
Result() = default; | ||
explicit Result(std::shared_ptr<Buffer> body) : body(std::move(body)) {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Neither this should be necessary?
|
||
void SendingHeaders(AddCallHeaders* outgoing_headers) override { | ||
for (const auto& pair : carrier_.context_) { | ||
outgoing_headers->AddHeader(pair.first, pair.second); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, can you add a comment about that?
cpp/src/arrow/flight/flight_test.cc
Outdated
reinterpret_cast<TracingServerMiddleware*>(call_context.GetMiddleware("tracing")); | ||
if (!middleware) return Status::Invalid("Could not find middleware"); | ||
#ifdef ARROW_WITH_OPENTELEMETRY | ||
EXPECT_GT(middleware->GetTraceContext().size(), 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you comment a bit on what this does/tests?
opentelemetry::context::propagation::GlobalTextMapPropagator::SetGlobalPropagator( | ||
opentelemetry::nostd::shared_ptr< | ||
opentelemetry::context::propagation::TextMapPropagator>( | ||
new opentelemetry::trace::propagation::HttpTraceContext())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this the kind of setup that third party server code would have to write as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the application would do this as part of initializing OpenTelemetry, though generally the SDKs provide conveniences to configure this.
python/pyarrow/tests/test_flight.py
Outdated
def do_action(self, context, action): | ||
trace_context = context.get_middleware("tracing").trace_context | ||
# Don't turn this method into a generator since then it'll be | ||
# lazily evaluated, and the trace context will be lost |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If trace_context
is a regular Python dict, how would it be lost?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Evaluating .trace_context
is side-effectful and depends on implicit state maintained by OpenTelemetry, so if this is a generator it'll be evaluated after OpenTelemetry has already cleaned up the state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've updated the comment to be clearer about what goes on.
f4f464c
to
16fd5ba
Compare
Rebased, will merge assuming no further comments as this has been sitting around for a long time |
8c211f3
to
6a8660a
Compare
6a8660a
to
93ac104
Compare
Benchmark runs are scheduled for baseline = f941118 and contender = be30611. be30611 is a master commit associated with this PR. Results will be available as each benchmark for each run completes. |
… OpenTelemetry propagation (apache#11920) Adds a client middleware that sends span/trace ID to the server, and a server middleware that gets the span/trace ID and starts a child span. The middleware are available in builds without OpenTelemetry, they simply do nothing. Authored-by: David Li <[email protected]> Signed-off-by: David Li <[email protected]>
… OpenTelemetry propagation (apache#11920) Adds a client middleware that sends span/trace ID to the server, and a server middleware that gets the span/trace ID and starts a child span. The middleware are available in builds without OpenTelemetry, they simply do nothing. Authored-by: David Li <[email protected]> Signed-off-by: David Li <[email protected]>
Adds a client middleware that sends span/trace ID to the server, and a server middleware that gets the span/trace ID and starts a child span.
The middleware are available in builds without OpenTelemetry, they simply do nothing.