Skip to content

Commit

Permalink
only count time spent in queue as part of the http_server.queue span
Browse files Browse the repository at this point in the history
If you have a request that spends 10ms in queue, and 5s being executed, you currently end up with these 2 spans in APM:

* `http_server.queue` (duration: 5.01s)
  * `rack.request` (duration: 5s)

However, that isn't really semantically correct. The queuing itself did not take 5.01s, the entire lifecycle of the request took 5.01s, with queueing making up 10ms of that time.

This is an attempt at resolving that inconsistency by introducing an outer `http.proxy.request` span, which contains `http.proxy.queue` and `rack.request` as sibling spans, rather than the current
parent/child relationship that exists between the 2 of them. It has been introduced as an opt-in feature, by setting `request_queueing: :exclude_request`.
  • Loading branch information
agrobbin committed Feb 8, 2023
1 parent b617cfe commit f0d8a19
Show file tree
Hide file tree
Showing 9 changed files with 346 additions and 83 deletions.
13 changes: 10 additions & 3 deletions docs/GettingStarted.md
Original file line number Diff line number Diff line change
Expand Up @@ -1562,7 +1562,7 @@ run app
| `quantize.query.obfuscate.with` | Defines the string to replace obfuscated matches with. May be a String. Option must be nested inside the `query.obfuscate` option. | `'<redacted>'` |
| `quantize.query.obfuscate.regex` | Defines the regex with which the query string will be redacted. May be a Regexp, or `:internal` to use the default internal Regexp, which redacts well-known sensitive data. Each match is redacted entirely by replacing it with `query.obfuscate.with`. Option must be nested inside the `query.obfuscate` option. | `:internal` |
| `quantize.fragment` | Defines behavior for URL fragments. May be `:show` to show URL fragments, or `nil` to remove fragments. Option must be nested inside the `quantize` option. | `nil` |
| `request_queuing` | Track HTTP request time spent in the queue of the frontend server. See [HTTP request queuing](#http-request-queuing) for setup details. Set to `true` to enable. | `false` |
| `request_queuing` | Track HTTP request time spent in the queue of the frontend server. See [HTTP request queuing](#http-request-queuing) for setup details. | `false` |
| `web_service_name` | Service name for frontend server request queuing spans. (e.g. `'nginx'`) | `'web-server'` |

Deprecation notice:
Expand Down Expand Up @@ -1639,7 +1639,7 @@ end
| Key | Description | Default |
| --- | ----------- | ------- |
| `distributed_tracing` | Enables [distributed tracing](#distributed-tracing) so that this service trace is connected with a trace of another service if tracing headers are received | `true` |
| `request_queuing` | Track HTTP request time spent in the queue of the frontend server. See [HTTP request queuing](#http-request-queuing) for setup details. Set to `true` to enable. | `false` |
| `request_queuing` | Track HTTP request time spent in the queue of the frontend server. See [HTTP request queuing](#http-request-queuing) for setup details. | `false` |
| `exception_controller` | Class or Module which identifies a custom exception controller class. Tracer provides improved error behavior when it can identify custom exception controllers. By default, without this option, it 'guesses' what a custom exception controller looks like. Providing this option aids this identification. | `nil` |
| `middleware` | Add the trace middleware to the Rails application. Set to `false` if you don't want the middleware to load. | `true` |
| `middleware_names` | Enables any short-circuited middleware requests to display the middleware name as a resource for the trace. | `false` |
Expand Down Expand Up @@ -2479,7 +2479,14 @@ server {
}
```

Then you must enable the request queuing feature, by setting `request_queuing: true`, in the integration handling the request. For Rack-based applications, see the [documentation](#rack) for details.
Then you must enable the request queuing feature. The following options are available for the `:request_queuing` configuration:

| Option | Description |
| ------------------ | ----------- |
| `:include_request` | A `http_server.queue` span will be the root span of a trace, including the total time spent processing the request *in addition* to the time spent waiting for the request to begin being processed. This is the behavior when the configuration is set to `true`. *This is deprecated behavior and will be removed in the next major release.* |
| `:exclude_request` | A `http.proxy.request` span will be the root span of a trace, with the `http.proxy.queue` child span duration representing only the time spent waiting for the request to begin being processed. |

For Rack-based applications, see the [documentation](#rack) for details.

### Processing Pipeline

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -11,7 +11,7 @@
if Datadog::DemoEnv.feature?('tracing')
c.tracing.analytics.enabled = true if Datadog::DemoEnv.feature?('analytics')

c.tracing.instrument :rails, request_queuing: true
c.tracing.instrument :rails, request_queuing: :exclude_request
c.tracing.instrument :redis, service_name: 'acme-redis'
c.tracing.instrument :resque
end
Expand Down
5 changes: 5 additions & 0 deletions lib/datadog/tracing/contrib/rack/ext.rb
Original file line number Diff line number Diff line change
Expand Up @@ -11,10 +11,15 @@ module Ext
ENV_ANALYTICS_ENABLED = 'DD_TRACE_RACK_ANALYTICS_ENABLED'.freeze
ENV_ANALYTICS_SAMPLE_RATE = 'DD_TRACE_RACK_ANALYTICS_SAMPLE_RATE'.freeze
RACK_ENV_REQUEST_SPAN = 'datadog.rack_request_span'.freeze
SPAN_HTTP_PROXY_REQUEST = 'http.proxy.request'.freeze
SPAN_HTTP_PROXY_QUEUE = 'http.proxy.queue'.freeze
SPAN_HTTP_SERVER_QUEUE = 'http_server.queue'.freeze
SPAN_REQUEST = 'rack.request'.freeze
TAG_COMPONENT = 'rack'.freeze
TAG_COMPONENT_HTTP_PROXY = 'http_proxy'.freeze
TAG_OPERATION_REQUEST = 'request'.freeze
TAG_OPERATION_HTTP_PROXY_REQUEST = 'request'.freeze
TAG_OPERATION_HTTP_PROXY_QUEUE = 'queue'.freeze
TAG_OPERATION_HTTP_SERVER_QUEUE = 'queue'.freeze
WEBSERVER_APP = 'webserver'.freeze
DEFAULT_PEER_WEBSERVER_SERVICE_NAME = 'web-server'.freeze
Expand Down
54 changes: 38 additions & 16 deletions lib/datadog/tracing/contrib/rack/middlewares.rb
Original file line number Diff line number Diff line change
Expand Up @@ -35,22 +35,30 @@ def compute_queue_time(env)
request_start = Contrib::Rack::QueueTime.get_request_start(env)
return if request_start.nil?

frontend_span = Tracing.trace(
Ext::SPAN_HTTP_SERVER_QUEUE,
span_type: Tracing::Metadata::Ext::HTTP::TYPE_PROXY,
start_time: request_start,
service: configuration[:web_service_name]
)

# Tag this span as belonging to Rack
frontend_span.set_tag(Tracing::Metadata::Ext::TAG_COMPONENT, Ext::TAG_COMPONENT)
frontend_span.set_tag(Tracing::Metadata::Ext::TAG_OPERATION, Ext::TAG_OPERATION_HTTP_SERVER_QUEUE)
frontend_span.set_tag(Tracing::Metadata::Ext::TAG_KIND, Tracing::Metadata::Ext::SpanKind::TAG_SERVER)

# Set peer service (so its not believed to belong to this app)
frontend_span.set_tag(Tracing::Metadata::Ext::TAG_PEER_SERVICE, configuration[:web_service_name])

frontend_span
case configuration[:request_queuing]
when true, :include_request # DEV: Switch `true` to `:exclude_request` in v2.0
queue_span = trace_http_server(Ext::SPAN_HTTP_SERVER_QUEUE, start_time: request_start)
# Tag this span as belonging to Rack
queue_span.set_tag(Tracing::Metadata::Ext::TAG_COMPONENT, Ext::TAG_COMPONENT)
queue_span.set_tag(Tracing::Metadata::Ext::TAG_OPERATION, Ext::TAG_OPERATION_HTTP_SERVER_QUEUE)
queue_span.set_tag(Tracing::Metadata::Ext::TAG_KIND, Tracing::Metadata::Ext::SpanKind::TAG_SERVER)
queue_span
when :exclude_request
request_span = trace_http_server(Ext::SPAN_HTTP_PROXY_REQUEST, start_time: request_start)
request_span.set_tag(Tracing::Metadata::Ext::TAG_COMPONENT, Ext::TAG_COMPONENT_HTTP_PROXY)
request_span.set_tag(Tracing::Metadata::Ext::TAG_OPERATION, Ext::TAG_OPERATION_HTTP_PROXY_REQUEST)
request_span.set_tag(Tracing::Metadata::Ext::TAG_KIND, Tracing::Metadata::Ext::SpanKind::TAG_PROXY)

queue_span = trace_http_server(Ext::SPAN_HTTP_PROXY_QUEUE, start_time: request_start)
queue_span.set_tag(Tracing::Metadata::Ext::TAG_COMPONENT, Ext::TAG_COMPONENT_HTTP_PROXY)
queue_span.set_tag(Tracing::Metadata::Ext::TAG_OPERATION, Ext::TAG_OPERATION_HTTP_PROXY_QUEUE)
queue_span.set_tag(Tracing::Metadata::Ext::TAG_KIND, Tracing::Metadata::Ext::SpanKind::TAG_PROXY)
# finish the `queue` span now to record only the time spent *in queue*,
# excluding the time spent processing the request itself
queue_span.finish

request_span
end
end

def call(env)
Expand Down Expand Up @@ -242,6 +250,20 @@ def configuration
Datadog.configuration.tracing[:rack]
end

def trace_http_server(span_name, start_time:)
span = Tracing.trace(
span_name,
span_type: Tracing::Metadata::Ext::HTTP::TYPE_PROXY,
start_time: start_time,
service: configuration[:web_service_name]
)

# Set peer service (so its not believed to belong to this app)
span.set_tag(Tracing::Metadata::Ext::TAG_PEER_SERVICE, configuration[:web_service_name])

span
end

def parse_url(env, original_env)
request_obj = ::Rack::Request.new(env)

Expand Down
8 changes: 8 additions & 0 deletions lib/datadog/tracing/contrib/rack/patcher.rb
Original file line number Diff line number Diff line change
Expand Up @@ -99,6 +99,14 @@ def patch
)
end
end

if [true, :include_request].include?(get_option(:request_queuing))
Datadog.logger.warn(
'Rack `:request_queuing` behavior will change in the next major release.' \
'To silence this warning, migrate to the new behavior by setting `request_queuing: :exclude_request`.' \
'See docs/GettingStarted.md#http-request-queuing for more information.'
)
end
end

def get_option(option)
Expand Down
1 change: 1 addition & 0 deletions lib/datadog/tracing/metadata/ext.rb
Original file line number Diff line number Diff line change
Expand Up @@ -166,6 +166,7 @@ module SQL

# @public_api
module SpanKind
TAG_PROXY = 'proxy'
TAG_SERVER = 'server'
TAG_CLIENT = 'client'
TAG_PRODUCER = 'producer'
Expand Down
106 changes: 84 additions & 22 deletions spec/datadog/tracing/contrib/rack/configuration_spec.rb
Original file line number Diff line number Diff line change
Expand Up @@ -68,9 +68,8 @@
let(:queue_value) { nil }
end

shared_examples_for 'a Rack request with queuing' do
let(:queue_span) { spans.first }
let(:rack_span) { spans.last }
shared_examples_for 'a Rack request with queuing including the request' do
let(:configuration_options) { super().merge(request_queuing: :include_request) }

it 'produces a queued Rack trace' do
is_expected.to be_ok
Expand All @@ -79,17 +78,19 @@

expect(spans).to have(2).items

server_queue_span = spans[0]
rack_span = spans[1]

web_service_name = Datadog.configuration.tracing[:rack][:web_service_name]
expect(queue_span.name).to eq('http_server.queue')
expect(queue_span.span_type).to eq('proxy')
expect(queue_span.service).to eq(web_service_name)
expect(queue_span.start_time.to_i).to eq(queue_time)
expect(queue_span.get_tag(Datadog::Core::Runtime::Ext::TAG_LANG)).to be_nil
expect(queue_span.get_tag(Datadog::Tracing::Metadata::Ext::TAG_PEER_SERVICE)).to eq(web_service_name)
expect(queue_span.get_tag(Datadog::Tracing::Metadata::Ext::TAG_COMPONENT))
.to eq('rack')
expect(queue_span.get_tag(Datadog::Tracing::Metadata::Ext::TAG_OPERATION))
.to eq('queue')

expect(server_queue_span.name).to eq('http_server.queue')
expect(server_queue_span.span_type).to eq('proxy')
expect(server_queue_span.service).to eq(web_service_name)
expect(server_queue_span.start_time.to_i).to eq(queue_time)
expect(server_queue_span.get_tag(Datadog::Core::Runtime::Ext::TAG_LANG)).to be_nil
expect(server_queue_span.get_tag(Datadog::Tracing::Metadata::Ext::TAG_PEER_SERVICE)).to eq(web_service_name)
expect(server_queue_span.get_tag(Datadog::Tracing::Metadata::Ext::TAG_COMPONENT)).to eq('rack')
expect(server_queue_span.get_tag(Datadog::Tracing::Metadata::Ext::TAG_OPERATION)).to eq('queue')

expect(rack_span.name).to eq('rack.request')
expect(rack_span.span_type).to eq('web')
Expand All @@ -99,12 +100,58 @@
expect(rack_span.get_tag('http.status_code')).to eq('200')
expect(rack_span.get_tag('http.url')).to eq('/')
expect(rack_span.status).to eq(0)
expect(rack_span.get_tag(Datadog::Tracing::Metadata::Ext::TAG_COMPONENT))
.to eq('rack')
expect(rack_span.get_tag(Datadog::Tracing::Metadata::Ext::TAG_OPERATION))
.to eq('request')
expect(rack_span.get_tag(Datadog::Tracing::Metadata::Ext::TAG_COMPONENT)).to eq('rack')
expect(rack_span.get_tag(Datadog::Tracing::Metadata::Ext::TAG_OPERATION)).to eq('request')
expect(rack_span.parent_id).to eq(server_queue_span.span_id)
end
end

shared_examples_for 'a Rack request with queuing excluding the request' do
let(:configuration_options) { super().merge(request_queuing: :exclude_request) }

it 'produces a queued Rack trace' do
is_expected.to be_ok

expect(trace.resource).to eq('GET 200')

expect(spans).to have(3).items

server_request_span = spans[1]
server_queue_span = spans[0]
rack_span = spans[2]

web_service_name = Datadog.configuration.tracing[:rack][:web_service_name]

expect(server_request_span.name).to eq('http.proxy.request')
expect(server_request_span.span_type).to eq('proxy')
expect(server_request_span.service).to eq(web_service_name)
expect(server_request_span.start_time.to_i).to eq(queue_time)
expect(server_request_span.get_tag(Datadog::Core::Runtime::Ext::TAG_LANG)).to be_nil
expect(server_request_span.get_tag(Datadog::Tracing::Metadata::Ext::TAG_PEER_SERVICE)).to eq(web_service_name)
expect(server_request_span.get_tag(Datadog::Tracing::Metadata::Ext::TAG_COMPONENT)).to eq('http_proxy')
expect(server_request_span.get_tag(Datadog::Tracing::Metadata::Ext::TAG_OPERATION)).to eq('request')

expect(server_queue_span.name).to eq('http.proxy.queue')
expect(server_queue_span.span_type).to eq('proxy')
expect(server_queue_span.service).to eq(web_service_name)
expect(server_queue_span.start_time.to_i).to eq(queue_time)
expect(server_queue_span.get_tag(Datadog::Core::Runtime::Ext::TAG_LANG)).to be_nil
expect(server_queue_span.get_tag(Datadog::Tracing::Metadata::Ext::TAG_PEER_SERVICE)).to eq(web_service_name)
expect(server_queue_span.get_tag(Datadog::Tracing::Metadata::Ext::TAG_COMPONENT)).to eq('http_proxy')
expect(server_queue_span.get_tag(Datadog::Tracing::Metadata::Ext::TAG_OPERATION)).to eq('queue')
expect(server_queue_span.parent_id).to eq(server_request_span.span_id)

expect(queue_span.span_id).to eq(rack_span.parent_id)
expect(rack_span.name).to eq('rack.request')
expect(rack_span.span_type).to eq('web')
expect(rack_span.service).to eq(Datadog.configuration.service)
expect(rack_span.resource).to eq('GET 200')
expect(rack_span.get_tag('http.method')).to eq('GET')
expect(rack_span.get_tag('http.status_code')).to eq('200')
expect(rack_span.get_tag('http.url')).to eq('/')
expect(rack_span.status).to eq(0)
expect(rack_span.get_tag(Datadog::Tracing::Metadata::Ext::TAG_COMPONENT)).to eq('rack')
expect(rack_span.get_tag(Datadog::Tracing::Metadata::Ext::TAG_OPERATION)).to eq('request')
expect(rack_span.parent_id).to eq(server_request_span.span_id)
end
end

Expand Down Expand Up @@ -135,7 +182,7 @@
end

context 'when enabled' do
let(:configuration_options) { super().merge(request_queuing: true) }
let(:configuration_options) { super().merge(request_queuing: :exclude_request) }

context 'and a request is received' do
include_context 'an incoming HTTP request'
Expand All @@ -145,15 +192,29 @@
let(:queue_header) { 'X-Request-Start' }
end

it_behaves_like 'a Rack request with queuing'
it_behaves_like 'a Rack request with queuing including the request'
it_behaves_like 'a Rack request with queuing excluding the request'

context 'given a custom web service name' do
let(:configuration_options) { super().merge(web_service_name: web_service_name) }
let(:web_service_name) { 'nginx' }

it_behaves_like 'a Rack request with queuing' do
it_behaves_like 'a Rack request with queuing including the request' do
it 'sets the custom service name' do
is_expected.to be_ok

queue_span = spans.find { |s| s.name == Datadog::Tracing::Contrib::Rack::Ext::SPAN_HTTP_SERVER_QUEUE }

expect(queue_span.service).to eq(web_service_name)
end
end

it_behaves_like 'a Rack request with queuing excluding the request' do
it 'sets the custom service name' do
is_expected.to be_ok

queue_span = spans.find { |s| s.name == Datadog::Tracing::Contrib::Rack::Ext::SPAN_HTTP_PROXY_QUEUE }

expect(queue_span.service).to eq(web_service_name)
end
end
Expand All @@ -165,7 +226,8 @@
let(:queue_header) { 'X-Queue-Start' }
end

it_behaves_like 'a Rack request with queuing'
it_behaves_like 'a Rack request with queuing including the request'
it_behaves_like 'a Rack request with queuing excluding the request'
end

# Ensure a queuing Span is NOT created if there is a clock skew
Expand Down
53 changes: 44 additions & 9 deletions spec/datadog/tracing/contrib/rack/distributed_spec.rb
Original file line number Diff line number Diff line change
Expand Up @@ -89,29 +89,64 @@
include_context 'distributed tracing headers'
it_behaves_like 'a Rack request with distributed tracing'

context 'and request_queuing is enabled' do
let(:rack_options) { super().merge(request_queuing: true, web_service_name: web_service_name) }
context 'and request_queuing is enabled including the request time' do
let(:rack_options) { super().merge(request_queuing: :include_request, web_service_name: web_service_name) }
let(:web_service_name) { 'frontend_web_server' }

let(:server_span) { spans.first }
let(:rack_span) { spans.last }

before do
header 'X-Request-Start', "t=#{Time.now.to_f}"
end

it 'contains a request_queuing span that belongs to the distributed trace' do
is_expected.to be_ok

expect(trace.sampling_priority).to eq(sampling_priority)

expect(spans).to have(2).items

expect(server_span.name).to eq('http_server.queue')
expect(server_span.trace_id).to eq(trace_id)
expect(server_span.parent_id).to eq(parent_id)
server_queue_span = spans[0]
rack_span = spans[1]

expect(server_queue_span.name).to eq('http_server.queue')
expect(server_queue_span.trace_id).to eq(trace_id)
expect(server_queue_span.parent_id).to eq(parent_id)

expect(rack_span.name).to eq('rack.request')
expect(rack_span.trace_id).to eq(trace_id)
expect(rack_span.parent_id).to eq(server_queue_span.span_id)
end
end

context 'and request_queuing is enabled excluding the request time' do
let(:rack_options) { super().merge(request_queuing: :exclude_request, web_service_name: web_service_name) }
let(:web_service_name) { 'frontend_web_server' }

before do
header 'X-Request-Start', "t=#{Time.now.to_f}"
end

it 'contains request and request_queuing spans that belongs to the distributed trace' do
is_expected.to be_ok

expect(trace.sampling_priority).to eq(sampling_priority)

expect(spans).to have(3).items

server_request_span = spans[1]
server_queue_span = spans[0]
rack_span = spans[2]

expect(server_request_span.name).to eq('http.proxy.request')
expect(server_request_span.trace_id).to eq(trace_id)
expect(server_request_span.parent_id).to eq(parent_id)

expect(server_queue_span.name).to eq('http.proxy.queue')
expect(server_queue_span.trace_id).to eq(trace_id)
expect(server_queue_span.parent_id).to eq(server_request_span.span_id)

expect(rack_span.name).to eq('rack.request')
expect(rack_span.trace_id).to eq(trace_id)
expect(rack_span.parent_id).to eq(server_span.span_id)
expect(rack_span.parent_id).to eq(server_request_span.span_id)
end
end
end
Expand Down
Loading

0 comments on commit f0d8a19

Please sign in to comment.