Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't report span.timestamp or duration when you didn't create the span #91

Open
codefromthecrypt opened this issue Nov 28, 2016 · 5 comments

Comments

@codefromthecrypt
Copy link
Member

Zipkin's RPC span is interesting because it is a shared model. For example, a client creates a span and then the next hop (a server) adds annotations to it. The authoritative owner of the span is the one that's supposed to set timestamp and duration. In the case where it is originated by a client, the client, not the server, should set timestamp and duration.

See openzipkin/openzipkin.github.io#49

Why bother?

If we special-case timestamp and duration reporting, a couple things happen:

  • zipkin duration queries will be accurate because the critical path duration will be the only duration queried for the span
  • Those externally controlling ids will be able to use the duration query
  • Those converting zipkin RPC spans into single-host spans, like Google StackDriver, will have a heuristic to tell if a span is dual host or not.

Do we need to do anything?

I'm not sure if the current implementation is doing this fine or not, but it is probably useful to add tests to make sure it it. Usually, there's very little to do to special-case this, and it always is where B3 headers are read.

Here are tests that should smoke it out

  • client creates a span and propagates it to a server

    • client propagates b3 headers for trace id, span id, and sampled=1
    • server neither reports span.timestamp nor duration
    • client reports span.timestamp and duration
  • server creates a new trace

    • no b3 headers are detected
    • server reports span.timestamp and duration
  • server creates a new trace with externally supplied ids

    • caller propagates b3 headers for trace id, span id, but no sampled header
    • server reports span.timestamp and duration

Does this make sense? Can anyone help translate this into tests for us?

See openzipkin/brave#277 (comment)

@codefromthecrypt
Copy link
Member Author

ps mostly unrelated, but in some other implementations, when we start a span, we log a tick (or otherwise start a timer). When this is set (ex it wouldn't be set if it were a continued server span), we extract a measurement in microseconds as duration. Using this is more reliable than system time, as it never risks moving backwards during a span (ex from NTP adjustments). Not sure best mechanism in ruby, but I've found http://ruby-doc.org/stdlib-2.3.3/libdoc/benchmark/rdoc/Benchmark.html and https://github.com/copiousfreetime/hitimes

@jcarres-mdsol
Copy link
Collaborator

jcarres-mdsol commented Nov 28, 2016

I think the third one would not pass and require several changes.
The first two it are easy, you got the ids , someone else is owning this, else you own this.
The third is weird, "the lack of this optional sampled header in fact means you own this", weird. Current code will not like this

@codefromthecrypt
Copy link
Member Author

codefromthecrypt commented Nov 29, 2016 via email

@codefromthecrypt
Copy link
Member Author

codefromthecrypt commented Nov 29, 2016 via email

@codefromthecrypt
Copy link
Member Author

codefromthecrypt commented Nov 29, 2016 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants