-
Notifications
You must be signed in to change notification settings - Fork 576
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
otelgrpc: Add metrics support to NewServerHandler and NewClientHandler #4356
otelgrpc: Add metrics support to NewServerHandler and NewClientHandler #4356
Conversation
Codecov Report
@@ Coverage Diff @@
## main #4356 +/- ##
=======================================
- Coverage 80.9% 80.8% -0.2%
=======================================
Files 150 150
Lines 10313 10366 +53
=======================================
+ Hits 8345 8376 +31
- Misses 1827 1844 +17
- Partials 141 146 +5
|
2218a89
to
43226cb
Compare
grpc.StatsHandler
support metric
b149f52
to
57d33a1
Compare
instrumentation/google.golang.org/grpc/otelgrpc/test/grpc_stats_handler_test.go
Outdated
Show resolved
Hide resolved
span.End() | ||
|
||
metricAttrs = append(metricAttrs, rpcStatusAttr) | ||
r.rpcDuration.Record(context.TODO(), int64(rs.EndTime.Sub(rs.BeginTime)), metric.WithAttributes(metricAttrs...)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we use the ctx
passed to handleRPC
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have tried to use ctx
, but it won't work, I'm still digging the reason.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found the cause.
The reason is that, when we add instrument in HandleRPC
function, when the rs stats.RPCStats
is type of *stats.End
and *stats.OutTrailer
, the context we get is already cancelled by in grpc-go sdk.
func (t *http2Server) finishStream(s *Stream, rst bool, rstCode http2.ErrCode, hdr *headerFrame, eosReceived bool) {
// In case stream sending and receiving are invoked in separate
// goroutines (e.g., bi-directional streaming), cancel needs to be
// called to interrupt the potential blocking on other goroutines.
s.cancel()
oldState := s.swapState(streamDone)
And the metric sdk will check if ctx.Err() is nil when record the measurement
https://github.com/open-telemetry/opentelemetry-go/blob/c047088605b454f172765621d53107f80b1e6417/sdk/metric/instrument.go#L205
func (i *int64Inst) aggregate(ctx context.Context, val int64, s attribute.Set) { // nolint:revive // okay to shadow pkg with method.
if err := ctx.Err(); err != nil {
return
}
for _, in := range i.measures {
in(ctx, val, s)
}
}
So the aggregate function will fail, the measurement will also be dropped.
So we should 1. use new context object. 2. add the measurement before *stats.End
and *stats.OutTrailer
But some metric can only be measured when it comes to *stats.End
and *stats.OutTrailer
, for example duration, requests per rpc and responses per rpc. (Since in streaming model, we can only know the final count of requests and response when it comes to *stats.End
). So I vote for the solution 1.
WDYT? @dashpole @pellared @Sovietaced
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And the metric sdk will check if ctx.Err() is nil when record the measurement
https://github.com/open-telemetry/opentelemetry-go/blob/c047088605b454f172765621d53107f80b1e6417/sdk/metric/instrument.go#L205
I am not convinced that this what SDK should be doing. @MrAlias WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I created open-telemetry/opentelemetry-go#4671.
For now it can be a context.TODO()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would be a better solution: open-telemetry/opentelemetry-go#4671 (comment)
instrumentation/google.golang.org/grpc/otelgrpc/stats_handler.go
Outdated
Show resolved
Hide resolved
instrumentation/google.golang.org/grpc/otelgrpc/stats_handler.go
Outdated
Show resolved
Hide resolved
instrumentation/google.golang.org/grpc/otelgrpc/stats_handler.go
Outdated
Show resolved
Hide resolved
instrumentation/google.golang.org/grpc/otelgrpc/stats_handler.go
Outdated
Show resolved
Hide resolved
instrumentation/google.golang.org/grpc/otelgrpc/test/grpc_stats_handler_test.go
Outdated
Show resolved
Hide resolved
@fatsheep9146, will you still be able to work on this PR? |
grpc.StatsHandler
support metric
Sorry, last week I'm busy with my wedding. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we are very close 😉
instrumentation/google.golang.org/grpc/otelgrpc/stats_handler.go
Outdated
Show resolved
Hide resolved
instrumentation/google.golang.org/grpc/otelgrpc/stats_handler.go
Outdated
Show resolved
Hide resolved
instrumentation/google.golang.org/grpc/otelgrpc/test/grpc_stats_handler_test.go
Outdated
Show resolved
Hide resolved
Signed-off-by: Ziqi Zhao <[email protected]>
Signed-off-by: Ziqi Zhao <[email protected]>
Signed-off-by: Ziqi Zhao <[email protected]>
Signed-off-by: Ziqi Zhao <[email protected]>
Signed-off-by: Ziqi Zhao <[email protected]>
ad22f06
to
e455fd2
Compare
Signed-off-by: Ziqi Zhao <[email protected]>
Signed-off-by: Ziqi Zhao <[email protected]>
Signed-off-by: Ziqi Zhao <[email protected]>
instrumentation/google.golang.org/grpc/otelgrpc/stats_handler.go
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you again for your contribution 🎉
fix #4316