Semantic convention for HTTP start / end times #591

toumorokoshi · 2020-05-08T05:04:30Z

It doesn't seem that there's a defined semantic convention for when an HTTP instrumentation span should start and end. There's some ambiguity worth clarifying, namely whether these measurements should also include the duration of additional middleware layers.

As a start, I would argue we should declare that instrumentations should consider the full request/response cycle of a web application (including middleware).

Reference for discussion origination: open-telemetry/opentelemetry-python#659

This is a more specific discussion of #330.

bogdandrutu · 2020-05-08T14:05:25Z

I am not sure whether these details belong under "semantic conventions" or in another section, but I do agree that they are important to document.

toumorokoshi · 2020-05-08T18:58:14Z

Any suggestions on a new location? Otherwise I can just fire off a PR.

toumorokoshi · 2020-05-09T04:04:31Z

Actually, re-thinking through it, "semantic_conventions" may still be the right location, since it describes what the meanings of specific values should be. I believe it could expand to things like timings, since that ascribes meaning to the number.

Oberon00 · 2020-05-09T08:28:13Z

I feel like much of this could be generalized and covered in the docs for span's start/end.

yurishkuro · 2020-05-10T00:21:07Z

@Oberon00 +1. Why would we want to call this out just for HTTP spans? It is a general principle: if a span represents an operation (be it an RPC or an internal function), its timestamps should be as close to the operation's start and end as possible.
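As a rough illustration of that general principle, here is a minimal sketch that takes the start timestamp immediately before the operation and the end timestamp immediately after it. The `Span` class, `timed_span`, and `handle_request` are hypothetical stand-ins invented for this example and are not the OpenTelemetry API. A monotonic clock is assumed here only to keep the measured duration well-behaved.

```python
import time
from contextlib import contextmanager
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class Span:
    """Hypothetical stand-in for a tracing span; not the OpenTelemetry API."""
    name: str
    start_ns: Optional[int] = None
    end_ns: Optional[int] = None

    @property
    def duration_ns(self) -> int:
        return self.end_ns - self.start_ns


@contextmanager
def timed_span(name: str) -> Iterator[Span]:
    span = Span(name)
    # Take the start timestamp immediately before the operation runs...
    span.start_ns = time.monotonic_ns()
    try:
        yield span
    finally:
        # ...and the end timestamp immediately after it finishes, even on error.
        span.end_ns = time.monotonic_ns()


def handle_request() -> str:
    """Hypothetical operation being traced."""
    time.sleep(0.01)  # simulated work
    return "ok"


with timed_span("handle_request") as span:
    result = handle_request()

print(span.name, span.duration_ns)
```

Nothing here is HTTP-specific: the same wrapper applies to an RPC or to an internal function.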

toumorokoshi · 2020-05-10T03:41:26Z

I've updated the PR to incorporate an example. Re-reading the API spec, I do believe that the existing spec is fairly clear on the measurement:

> The Span's start and end timestamps reflect the elapsed real time of the operation.

But I think the example further clarifies nuances like:

- whether data transfer time is considered in the span
- whether things like HTTP middleware and web framework overhead are considered part of the timing

yurishkuro · 2020-05-18T18:30:03Z

> I would argue we should declare that instrumentations should consider the full request/response cycle of a web application (including middleware).

I am actually neutral on this as a requirement; I think "it depends". This is especially so because tracing itself can be implemented as a middleware, and the author of that middleware may not have much control over whether it's mounted as the outermost middleware/interceptor or somewhere in the middle of the middleware stack.
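To make the mounting-order concern concrete, here is a small sketch in which the same tracing middleware reports different durations depending on where it sits in the stack. Everything in it (`FakeClock`, `tracing_middleware`, `auth_middleware`, `app`, and the 5 ms / 20 ms costs) is hypothetical and chosen only for illustration. A fake clock is used so the result is deterministic.

```python
from typing import Callable, Dict

Handler = Callable[[str], str]


class FakeClock:
    """Deterministic clock so the example does not depend on real timing."""

    def __init__(self) -> None:
        self.now_ms = 0

    def advance(self, ms: int) -> None:
        self.now_ms += ms


clock = FakeClock()
durations: Dict[str, int] = {}


def tracing_middleware(label: str, inner: Handler) -> Handler:
    """Record how long everything *inside* this middleware takes."""
    def wrapped(request: str) -> str:
        start = clock.now_ms
        try:
            return inner(request)
        finally:
            durations[label] = clock.now_ms - start
    return wrapped


def auth_middleware(inner: Handler) -> Handler:
    def wrapped(request: str) -> str:
        clock.advance(5)  # pretend authentication takes 5 ms
        return inner(request)
    return wrapped


def app(request: str) -> str:
    clock.advance(20)  # pretend the handler takes 20 ms
    return "200 OK"


# Tracing mounted outermost: the auth time is included in the span.
traced_outermost = tracing_middleware("outermost", auth_middleware(app))
# Tracing mounted inside auth: the auth time is invisible to the span.
traced_in_middle = auth_middleware(tracing_middleware("middle", app))

traced_outermost("GET /users")
traced_in_middle("GET /users")
print(durations)  # {'outermost': 25, 'middle': 20}
```

The request takes 25 ms in both cases; only the position of the tracing middleware changes what the span reports.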

pauldraper · 2020-05-21T00:09:20Z

How people think about layering OT/OTel has always been confusing to me. These are the events:

tcp.client.start
tcp.server.start
ssl.client.start
ssl.server.start
http.client.start
http.server.start
servlet.start
servlet.end
http.server.end
http.client.end
ssl.server.stop
ssl.client.stop
tcp.server.stop
tcp.client.stop

Personally, I think it makes sense to arrange them as:

http.client:
  http.server:
    servlet:

...and throw on as many layers to that as you want, or ignore them.

But don't try to squish or rearrange them in weird ways.

toumorokoshi · 2020-05-21T03:49:09Z

> > I would argue we should declare that instrumentations should consider the full request/response cycle of a web application (including middleware).
>
> I am actually neutral on this as a requirement; I think "it depends". This is especially so because tracing itself can be implemented as a middleware, and the author of that middleware may not have much control over whether it's mounted as the outermost middleware/interceptor or somewhere in the middle of the middleware stack.

I think it's good to provide some level of guidance or best practice. To your point, there are situations where what is desired is not entirely possible (such as control over the middleware layering order). At the same time, there is an expectation that if a span is named after the HTTP method and the route, it will encompass all time spent on that operation.

> How people think about layering OT/OTel has always been confusing to me. These are the events:

@pauldraper Sorry, I'm not clear on how to incorporate this into the issue. Can you give an example of how one is squishing or re-arranging the request/response stack you posted, and how we can modify the specification to ensure that doesn't happen?

pauldraper · 2020-05-21T09:39:44Z

@toumorokoshi Sorry, some stuff leaked in from #526 that didn't belong. On topic:

- There should be a client span from before it starts sending the request until after it receives the response.
- There should be a server span from just after it starts receiving the request until after it sends the response.
- There should be other spans for any internal non-HTTP operations (middleware, main handler, whatever).
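The three rules above can be sketched as nested spans. This is only an illustration under simplifying assumptions: the `Tracer` and `Span` classes are hypothetical, not the OpenTelemetry API, and client and server run in one process with a logical tick counter. In a real deployment they live in separate processes and are linked by context propagation.

```python
from contextlib import contextmanager
from dataclasses import dataclass
from typing import Iterator, List, Optional


@dataclass
class Span:
    name: str
    parent: Optional[str]
    start: int = 0
    end: int = 0


class Tracer:
    """Hypothetical in-memory tracer with a logical clock."""

    def __init__(self) -> None:
        self.tick = 0
        self.spans: List[Span] = []
        self._stack: List[Span] = []

    def _now(self) -> int:
        self.tick += 1
        return self.tick

    @contextmanager
    def span(self, name: str) -> Iterator[Span]:
        parent = self._stack[-1].name if self._stack else None
        current = Span(name, parent, start=self._now())
        self.spans.append(current)
        self._stack.append(current)
        try:
            yield current
        finally:
            self._stack.pop()
            current.end = self._now()


tracer = Tracer()


def handler() -> str:
    with tracer.span("handler"):  # internal, non-HTTP work
        return "hello"


def middleware() -> str:
    with tracer.span("middleware"):  # internal, non-HTTP work
        return handler()


def server_receive() -> str:
    # Server span: from starting to receive the request to after sending the response.
    with tracer.span("http.server"):
        return middleware()


def client_call() -> str:
    # Client span: from before sending the request to after receiving the response.
    with tracer.span("http.client"):
        return server_receive()


client_call()
for s in tracer.spans:
    print(s.name, "parent:", s.parent, "start:", s.start, "end:", s.end)
```

Each span starts after and ends before its parent, so the layers nest without being squashed together or reordered.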

toumorokoshi · 2020-05-23T17:40:01Z

Sounds good. I believe the example in api/trace.md covers this and addresses your concerns, albeit a little less explicitly:

#592

One challenge I'm having with the PR is the tension between the desire to make this a more high-level explanation and the immediate need to clarify things like what a top-level server span should or shouldn't encompass for HTTP.

pauldraper · 2020-06-01T21:25:27Z

My opinion is that the client and server HTTP spans should contain just HTTP information: headers, statuses, agents, etc.

Everything else, e.g. invoking a Java servlet handler if the URL matches a known pattern, or running middleware, should be a new child span.
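A minimal sketch of that split, assuming a hypothetical `Span` record rather than the OpenTelemetry API: the server span carries only HTTP attributes, and the servlet invocation becomes a child span with its own attributes. The `servlet.class` attribute name and the `UserServlet` value are made up for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Union

AttributeValue = Union[str, int]


@dataclass
class Span:
    """Hypothetical span record; not the OpenTelemetry API."""
    name: str
    parent: Optional["Span"] = field(default=None, repr=False)
    attributes: Dict[str, AttributeValue] = field(default_factory=dict)

    def child(self, name: str) -> "Span":
        return Span(name, parent=self)


# The HTTP server span holds HTTP-level information only.
server = Span("http.server")
server.attributes["http.method"] = "GET"
server.attributes["http.status_code"] = 200

# Anything beyond HTTP, such as dispatching to a servlet, is a child span.
servlet = server.child("servlet")
servlet.attributes["servlet.class"] = "UserServlet"  # hypothetical attribute

print(server)
print(servlet)
```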

toumorokoshi mentioned this issue May 9, 2020

Clarify semantic conventions around span start and end time #592

Merged

arminru added this to the v0.5 milestone May 12, 2020

andrewhsu mentioned this issue May 12, 2020

Semantics of active Span (Latency vs. active duration, CPU times, ...) #330

Open

arminru assigned toumorokoshi May 12, 2020

carlosalberto modified the milestones: v0.5, v0.6 Jun 9, 2020

arminru added the area:semantic-conventions Related to semantic conventions label Jun 9, 2020

bogdandrutu added the spec:trace Related to the specification/trace directory label Jun 12, 2020

trask mentioned this issue Jun 26, 2020

Propagate context open-telemetry/opentelemetry-java-instrumentation#572

Merged

bogdandrutu closed this as completed in #592 Jun 30, 2020

ebrake mentioned this issue Sep 25, 2020

Django: Prepend OpenTelemetry middleware instead of append open-telemetry/opentelemetry-python#1163

Merged


siminn-arnorgj mentioned this issue Sep 20, 2023

Exclude background task execution from root server span in ASGI middleware open-telemetry/opentelemetry-python-contrib#1952

Merged

