Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Net:HTTP instrumentation adapter #187

Merged
merged 1 commit into from
Feb 28, 2020
Merged

Add Net:HTTP instrumentation adapter #187

merged 1 commit into from
Feb 28, 2020

Conversation

grantbdev
Copy link
Contributor

Overview

Add Net::HTTP instrumentation adapter similar to the other HTTP client adapters.

Related

Copy link
Member

@mwear mwear left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code looks in good shape, but I did some testing and QA on it this afternoon and found two strange things.

  • no spans on error
  • two net http spans when used with faraday

No spans on an error

I don't get a span if the request errors. Try changing the example to a non-existent domain, rescue the resulting exception and note that a span is never generated. I would expect to get a span in this situation.

begin
  Net::HTTP.get(URI('http://a32342423439.com'))
rescue => e
  puts e
end

Two Net::HTTP spans when used with Faraday

I would expect to get a span for Faraday that parents a Net::HTTP span when using both libraries, but instead I get a Faraday span that parents a Net::HTTP span, that parents yet another Net::HTTP span. To repro this you can modify the example as follows:

source 'https://rubygems.org'

gem 'faraday'
gem 'opentelemetry-adapters-net-http', path: '../../../adapters/net_http'
gem 'opentelemetry-adapters-faraday', path: '../../../adapters/faraday'
gem 'opentelemetry-api', path: '../../../api'
gem 'opentelemetry-sdk', path: '../../../sdk'

example.rb

require 'rubygems'
require 'bundler/setup'
require 'net/http'

Bundler.require

OpenTelemetry::SDK.configure do |c|
  c.use_all
end

Faraday.get('http://example.com')

When I run that I get the following output:

I, [2020-02-21T15:23:58.676087 #53484]  INFO -- : Adapter: OpenTelemetry::Adapters::Net::HTTP was successfully installed
I, [2020-02-21T15:23:58.677686 #53484]  INFO -- : Adapter: OpenTelemetry::Adapters::Faraday was successfully installed
#<struct OpenTelemetry::SDK::Trace::SpanData
 name="HTTP GET",
 kind=:client,
 status=
  #<OpenTelemetry::Trace::Status:0x00007f8c0b0133a0
   @canonical_code=0,
   @description="">,
 parent_span_id="741c4296bdf64df8",
 child_count=0,
 total_recorded_attributes=7,
 total_recorded_events=0,
 total_recorded_links=0,
 start_timestamp=2020-02-21 15:23:58 -0800,
 end_timestamp=2020-02-21 15:23:58 -0800,
 attributes=
  {"component"=>"http",
   "http.method"=>"GET",
   "http.scheme"=>"http",
   "http.target"=>"/",
   "peer.hostname"=>"example.com",
   "peer.port"=>80,
   "http.status_code"=>200},
 links=nil,
 events=nil,
 library_resource=
  #<OpenTelemetry::SDK::Resources::Resource:0x00007f8c07a01cf8
   @labels={"name"=>"OpenTelemetry::Adapters::Net::HTTP", "version"=>"0.0.0"}>,
 span_id="e0ffc8e7b94ff176",
 trace_id="b504bc89d3bbf07d984fa963da09f859",
 trace_flags=#<OpenTelemetry::Trace::TraceFlags:0x00007f8c098c34c0 @flags=1>>
#<struct OpenTelemetry::SDK::Trace::SpanData
 name="HTTP GET",
 kind=:client,
 status=
  #<OpenTelemetry::Trace::Status:0x00007f8c09113b58
   @canonical_code=0,
   @description="">,
 parent_span_id="63f0dcacea988bc0",
 child_count=0,
 total_recorded_attributes=7,
 total_recorded_events=0,
 total_recorded_links=0,
 start_timestamp=2020-02-21 15:23:58 -0800,
 end_timestamp=2020-02-21 15:23:58 -0800,
 attributes=
  {"component"=>"http",
   "http.method"=>"GET",
   "http.scheme"=>"http",
   "http.target"=>"/",
   "peer.hostname"=>"example.com",
   "peer.port"=>80,
   "http.status_code"=>200},
 links=nil,
 events=nil,
 library_resource=
  #<OpenTelemetry::SDK::Resources::Resource:0x00007f8c07a01cf8
   @labels={"name"=>"OpenTelemetry::Adapters::Net::HTTP", "version"=>"0.0.0"}>,
 span_id="741c4296bdf64df8",
 trace_id="b504bc89d3bbf07d984fa963da09f859",
 trace_flags=#<OpenTelemetry::Trace::TraceFlags:0x00007f8c098c34c0 @flags=1>>
#<struct OpenTelemetry::SDK::Trace::SpanData
 name="HTTP GET",
 kind=:client,
 status=
  #<OpenTelemetry::Trace::Status:0x00007f8c09123e18
   @canonical_code=0,
   @description="">,
 parent_span_id="0000000000000000",
 child_count=0,
 total_recorded_attributes=5,
 total_recorded_events=0,
 total_recorded_links=0,
 start_timestamp=2020-02-21 15:23:58 -0800,
 end_timestamp=2020-02-21 15:23:58 -0800,
 attributes=
  {"component"=>"http",
   "http.method"=>"GET",
   "http.url"=>"http://example.com",
   "http.status_code"=>200,
   "http.status_text"=>"OK"},
 links=nil,
 events=nil,
 library_resource=
  #<OpenTelemetry::SDK::Resources::Resource:0x00007f8c090339e0
   @labels={"name"=>"OpenTelemetry::Adapters::Faraday", "version"=>"0.0.0"}>,
 span_id="63f0dcacea988bc0",
 trace_id="b504bc89d3bbf07d984fa963da09f859",
 trace_flags=#<OpenTelemetry::Trace::TraceFlags:0x00007f8c098c34c0 @flags=1>>

I was wondering if you could look into these issues and let me know what you find?

@grantbdev
Copy link
Contributor Author

@mwear The reason you're not getting a span with that particular error is because the error occurs before the part of Net::HTTP that is instrumented.
The SocketError is encountered in the start but the instrumentation is hooked into the inner request logic:
https://github.com/ruby/ruby/blob/6ecafe0c9865c4187bbf4037fb4e4ac4115e3847/lib/net/http.rb#L481-L485

The same behavior should exist in dd-trace-rb since the same methods are being instrumented. If we need to catch this error, we would have to override a higher method. The test suite shows an example of an error that is caught and added to the span because it is in the scope of the instrumentation.

@grantbdev
Copy link
Contributor Author

grantbdev commented Feb 24, 2020

@mwear It appears that the reason why there are two Net::HTTP spans is that the request method has a recursive case that is encountered by the way Faraday is using it:
https://github.com/ruby/ruby/blob/6ecafe0c9865c4187bbf4037fb4e4ac4115e3847/lib/net/http.rb#L1468-L1474

We could skip instrumentation on the outer layer if you'd like by guarding creating a span when started? is false.


I added the span name hash feedback. Let me know if you want me to add the guard clause proposed above or make a more significant change to catch the SocketError as described in my previous comment.

@mwear
Copy link
Member

mwear commented Feb 24, 2020

Thanks for the research @grantbdev, it's much appreciated!

I think we can continue to patch the request method, with the understanding that some exceptions will happen before it's reached. We can consider adding more instrumentation around establishing a connection later, if we want to. For now, sticking to what dd-trace does seems like the pragmatic thing to do.

While we're here, I think it makes sense to address the multiple spans resulting from the recursive call to request when a connection is not been started. It seems easy enough to gate span creation on started?. Let me know if that's a reasonable ask.

@grantbdev
Copy link
Contributor Author

@mwear Span creation gated. I tried adding a test case for this scenario as well, but I think the patching that WebMock does to Net::HTTP prevents the double-span problem from being reachable in the first place since it doesn't replicate the recursion: https://github.com/bblimke/webmock/blob/bf418ec25424ed63899d9196546e05c5ba82ee70/lib/webmock/http_lib_adapters/net_http.rb#L100-L112

Copy link
Member

@mwear mwear left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking really good @grantbdev! I found two really minor things on the last pass through. Once those are fixed I'll give it an approval.

Rakefile Outdated Show resolved Hide resolved
@grantbdev
Copy link
Contributor Author

@mwear Feedback addressed

Copy link
Member

@mwear mwear left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work on this PR and the associated research @grantbdev. Thanks!

@mwear
Copy link
Member

mwear commented Feb 27, 2020

Last call for comments. It'd be nice to merge this tomorrow after the SIG meeting.

module Patches
# Module to prepend to Net::HTTP for instrumentation
module Instrumentation
HTTP_METHODS_TO_SPAN_NAMES = Hash.new { |h, k| h[k] = "HTTP #{k}" }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little concerned about the lazy initialization here. I'd prefer to eagerly init this and freeze the hash. I'm sort of on the fence, so just flagging the concern.

span.set_attribute('http.status_code', status_code)
span.status = OpenTelemetry::Trace::Status.http_to_status(
status_code
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have any semantic conventions around identifying errors based on response codes? Or are we only flagging spans as errors if an exception is raised?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are not semantic conventions for errors at all yet. Our Span#record_error method uses a set of currently proposed conventions, with the understanding that those could change. It's currently only used for unhandled exceptions. This does bring up a good point though and that is, do we want to synthesize errors based on response codes. If yes, we should probably do this across the board in our HTTP instrumentation adapters. I'll create an issue to discuss and track this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

@fbogsany fbogsany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of minor comments, but overall this LGTM.

Enhancements:

  • how to instrument connect
  • how to blacklist or whitelist propagation (this may already be a thing in the propagation layer)

@mwear
Copy link
Member

mwear commented Feb 28, 2020

@grantbdev: This failed on JRuby due to a flaky test (#170). I'd like to see this green before merging. Can you rerun the workflow? If that's not a possible a git commit --amend git push -f will trigger a new build.

@mwear
Copy link
Member

mwear commented Feb 28, 2020

Welp... it has merge conflicts now too.

Add `Net::HTTP` instrumentation adapter
based on other HTTP client adapters and the related `ddtrace` code.
@grantbdev
Copy link
Contributor Author

@mwear PR rebased and build is green.

@mwear mwear merged commit b9df39e into open-telemetry:master Feb 28, 2020
@grantbdev grantbdev deleted the gb/feature--net-http--instrumentation--67 branch February 28, 2020 20:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants