Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add tracing support to gRPC impls by sniffing HTTP/2 frames on underlying TCP socket #823

Merged
merged 4 commits into from
Mar 14, 2024

Conversation

jhump
Copy link
Member

@jhump jhump commented Mar 11, 2024

This adds a new tracing capability -- to wrap a net.Conn that represents an HTTP/2 connection and examine all of the HTTP/2 traffic to assemble a trace. This is used to instrument a gRPC client and server to support tracing, since they include their own HTTP/2 implementation instead of using net/http.

The only testing I've done is to set breakpoints in an IDE to make sure the trace events were flowing. I'll probably add some very basic tests of the tracing (both the new HTTP/2 stuff and the existing net/http stuff) and push them to this PR, just so we have more comfort that this stuff doesn't stays functional and doesn't regress since it is admittedly pretty complicated and not otherwise covered by tests.

This will resolve #817.

numerous fixes, mostly for more deterministic event
ordering
@jhump
Copy link
Member Author

jhump commented Mar 12, 2024

@smaye81, I just added some tests for the tracing stuff, and they caught a few things. Most of what they caught was non-deterministic ordering of events (which could even lead to events being dropped).

So this changes some of the locking in several places to make larger chunks of logic atomic -- for example the handleFrame method of the tracing net.Conn would acquire the lock, query a map, unlock, and then possibly have to acquire the lock again for a second step. Now everything is done atomically with a single lock acquisition. This was also true in the trace builder's add method -- it could do everything under lock and then, when it optionally called build from a defer function, it would re-acquire the lock. It's been restructured to only need a single lock acquisition now.

The new test is a bit of a beast, but it verifies all of the ways to instrument a client and server, both with middleware or with wrapping an HTTP/2 net.Conn socket.

@smaye81
Copy link
Member

smaye81 commented Mar 14, 2024

lgmt, ship it!

@jhump jhump merged commit ae34441 into main Mar 14, 2024
3 checks passed
@jhump jhump deleted the jh/ht2-trace branch March 14, 2024 04:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

No traces generated from grpc client/server impls
2 participants