Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Can not filter by “Duration (μs)> =” in the UI #277

Closed
codefromthecrypt opened this issue Nov 23, 2016 · 13 comments
Closed

Can not filter by “Duration (μs)> =” in the UI #277

codefromthecrypt opened this issue Nov 23, 2016 · 13 comments

Comments

@codefromthecrypt
Copy link
Member

From @longinhit on October 24, 2016 9:50

Copied from original issue: openzipkin/zipkin#1349

@codefromthecrypt
Copy link
Member Author

the duration query depends a lot on how your apps are instrumented and
which storage is in use. For example, we no longer support duration query
on the "cassandra" storage type.

@codefromthecrypt
Copy link
Member Author

From @michaelsembwever on October 26, 2016 6:42

@longinhit it's supported in the cassandra3 storage, if you're able to upgrade.

@codefromthecrypt
Copy link
Member Author

From @dangets on November 21, 2016 17:17

Is this the same issue with the Elasticsearch backend? Results that shouldn't be are getting filtered out.

zipkin 1
zipkin 2

@codefromthecrypt
Copy link
Member Author

what instrumentation are you using? are the traces you are sending
reporting span.duration? You can check by using the api as
/api/v1/trace/trace-id-in-hex?raw

@codefromthecrypt
Copy link
Member Author

fyi here's a status report on libraries that properly address
timestamp and duration
openzipkin/openzipkin.github.io#49

@codefromthecrypt
Copy link
Member Author

From @dangets on November 22, 2016 16:29

We are using brave v3.15.1. We were previously using v3.9.1 until a few weeks ago and that is around when we started seeing the problem (upgraded brave and zipkin-server jar).

We do use a custom jmeter setup to run automated tests and we generate a traceId before sending the initial request so that we can log traceIds alongside request logs. Using the raw endpoint I am seeing sr and ss on the top-level span but it looks like the duration field is missing from that span.

Again, this was working previously - is this something that has changed during the deprecation of the SpanCollector classes? I can try switching to the Reporter classes to see if that fixes it.

@codefromthecrypt
Copy link
Member Author

From @dangets on November 22, 2016 19:12

I verified that when I hit an endpoint using curl and specifying a traceId in the headers the duration does not show up. Without the headers it does.

And we found this comment that says a timestamp/duration won't be kept if a traceId is sent - perhaps the cause?

// In the RPC span model, the client owns the timestamp and duration of the span. If we

Does a backfilling as mentioned in the comment ever happen?

@codefromthecrypt
Copy link
Member Author

I'll look into this now. It it is a root span, we should be setting
duration for sure. Stand by and thanks for gathering the details!

@codefromthecrypt
Copy link
Member Author

@dangets we are missing a feature, which is to accommodate externally provisioned root spans. The trick is to use a heuristic which doesn't accidentally double-report duration in the root span.

The only thing I can think of is to look for absence of the X-B3-Sampled header. When trace identifiers are externally generated, this header will be absent. When X-B3-Sampled: 1, we can assume the caller reported the span.

In other words, don't have jmeter set X-B3-Sampled: 1!

Change coming in a bit

@dangets
Copy link

dangets commented Nov 23, 2016

@adriancole Thanks! We can definitely do that.

codefromthecrypt pushed a commit that referenced this issue Nov 23, 2016
There are some who want to control trace identifiers, but leave the
sampling decision to the next hop. In this case, they propagate the
required headers (X-B3-TraceId, X-B3-ParentId) and not X-B3-Sampled.

When this happens, the receiver of the headers (the server), should
sample the trace ID before using it.

Before, we weren't implementing this (eventhough Finagle does). The
change here addresses it and also backfills tests accordingly.

See https://github.com/twitter/finagle/blob/develop/finagle-http/src/main/scala/com/twitter/finagle/http/Codec.scala#L341
See https://github.com/twitter/finagle/blob/develop/finagle-core/src/main/scala/com/twitter/finagle/tracing/Tracer.scala#L119
See openzipkin/b3-propagation#8

Fixes #277
@codefromthecrypt
Copy link
Member Author

#278

@codefromthecrypt
Copy link
Member Author

going out now as 3.15.4

lemme know if it doesn't work!

@dangets
Copy link

dangets commented Nov 28, 2016

Just tested it, looks good. Duration on root trace is showing up and filtering in the UI works as expected. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants