Instrumentation hooks and Prometheus metrics #299
@@ -262,6 +276,14 @@ func (s *Server) processUnaryRPC(t transport.ServerTransport, stream *transport.
}
}()
}
monitor := s.opts.serverMonitor.NewServerMonitor(monitoring.Unary, stream.Method())
defer func() {
Curious: Have you benchmarked for throughput the overhead of the additional defer call on this procedure and elsewhere?
No, I haven't, but the gRPC codebase seems to be full of defers. Should I add an `if monitor.(type) == monitoring.NoOpMonitor` check?
Added the client side stream monitoring, but I am not entirely sure that I got all the edge cases right. @iamqizhao can you take a look? Even if this is not gonna make it upstream, we still intend to use it internally :)
This is awesome! Looking for it to get merged so I can implement a sink to
"google.golang.org/grpc/codes"
)

type RpcType string
Does this need to be exported?
Yup, see later comment.
On the whole this looks good and is something worth having IMHO.
@@ -111,6 +112,7 @@ func Invoke(ctx context.Context, method string, args, reply interface{}, cc *Cli
return toRPCErr(err)
}
}
monitor := cc.dopts.clientMonitor.NewClientMonitor(monitoring.Unary, method)
if cc.dopts.clientMonitor != nil {
...
}
Discussed below.
The general design looks good to me.
Glad to hear that right after coming back from vacation :) Two things to discuss:
(force-pushed from 5c83df7 to 3598371)
@iamqizhao I have refactored the PR to:
Could you PTAL? :)
@iamqizhao, I understand that you guys have tons of other work. It would at least be useful to know whether this PR is being considered for acceptance. We're thinking about relying on it for our SLO monitoring (using different error.Codes to differentiate between user faults and our system faults), so knowing whether this PR has a chance of being upstreamed would be incredibly useful.
Yep, this PR can be accepted, but I need to check all the points where you inserted code and have not had time to do that. Sorry about the delay. BTW, I know it is painful, but can you sync your code to the latest?
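The SLO idea in this comment (separating user faults from system faults by status code) can be sketched as follows. This is an illustration under stated assumptions: the local `Code` type is a stand-in whose numeric values follow gRPC's status codes, and which codes count as "user" versus "system" faults is a choice made here for the example, not something the thread specifies. `classify` and `availability` are hypothetical helper names.

```go
package main

import "fmt"

// Code is a local stand-in for a gRPC status code (values follow the gRPC numbering).
type Code uint32

const (
	OK               Code = 0
	InvalidArgument  Code = 3
	NotFound         Code = 5
	PermissionDenied Code = 7
	Internal         Code = 13
	Unavailable      Code = 14
)

// FaultClass says who is to blame for a finished RPC.
type FaultClass string

const (
	NoFault     FaultClass = "none"
	UserFault   FaultClass = "user"
	SystemFault FaultClass = "system"
)

// classify maps a status code to a fault class.
// Assumption: caller mistakes are user faults, everything else non-OK is a system fault.
func classify(c Code) FaultClass {
	switch c {
	case OK:
		return NoFault
	case InvalidArgument, NotFound, PermissionDenied:
		return UserFault
	default:
		return SystemFault
	}
}

// availability returns the fraction of calls that did not end in a system fault.
func availability(counts map[Code]int) float64 {
	total, bad := 0, 0
	for c, n := range counts {
		total += n
		if classify(c) == SystemFault {
			bad += n
		}
	}
	if total == 0 {
		return 1
	}
	return float64(total-bad) / float64(total)
}

func main() {
	counts := map[Code]int{OK: 2, NotFound: 1, Unavailable: 1}
	fmt.Printf("availability: %.2f\n", availability(counts))
}
```

With per-code counters exported by the monitoring hooks, the same ratio could be computed in the metrics backend instead of in process.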
(force-pushed from 3598371 to 2bef982)
@iamqizhao Great to hear that. I have rebased onto the latest master. I'm wondering how to add unit tests that will keep the monitoring working correctly across major refactors such as the ones I'm currently rebasing on. Maybe retrofit some integration tests with monitoring counters?
On Mon, Oct 5, 2015 at 12:46 AM, Michal Witkowski [email protected]
|
That's ok :) LB and naming are something we're incredibly interested in (and willing to put some manpower behind a DNS SRV implementation). Can you provide some pointers regarding how to implement the things in the end2end test?
(force-pushed from 2bef982 to 58f6b41)
@iamqizhao, any updates here? We've been happily using this monitoring for the last 2 months in prod and are willing to help out getting it upstreamed :)
@iamqizhao, any updates on the metrics API? We've been happily using this in prod for a while now :)
+1 for updates here. Actively using grpc at preyproject.com
@iamqizhao has this been deprecated by the server side interceptor currently being reviewed internally + proposals for client side interceptors? The Boulder team at Let's Encrypt is currently working on moving away from our current RPC implementation in favor of gRPC but the lack of exposed metrics hooks is slowing us down somewhat (one of the reasons for moving to gRPC was to get rid of a bunch of non-CA code we had to maintain, including various hacks to collect client and server side metrics which we'd rather not re-implement). We'd prefer to use something native to
The ETA is by the end of this week or early next week. Sorry about the On Wed, Apr 13, 2016 at 3:33 PM, Roland Bracewell Shoemaker <
|
I moved this implementation into a server-side interceptor under: |
Implements simple callback-based instrumentation hooks for gRPC. The choice of instrumentation is made through `server.options` and `clientconn.DialOption`, leaving the user in full control. The default implementation is a No-Op, incurring no overhead.

The Prometheus implementation counts:
- [...] `statusCode`, including latency measurements
- [...] `streaming` RPCs

The server-side screenshot of the `metrics` page of Prometheus showing both `streaming` and `unary` RPCs is in the related bug: #240

Input needed: