Zipkin UI is very slow when a trace has a lot of spans #1460

Closed
dragontree101 opened this issue Dec 28, 2016 · 15 comments

Comments

@dragontree101

Our traces are very big and contain a lot of spans, so the Zipkin UI is very slow when I choose a span name or open a big trace.

I use Zipkin 1.16 with ES (Elasticsearch) storage.

Is there any way to speed up the UI's performance?

@codefromthecrypt
Member

codefromthecrypt commented Dec 28, 2016 via email

@dragontree101
Author

dragontree101 commented Dec 29, 2016

Sorry, I did not describe the problem clearly.
One of our traces looks like this:
(image)

In Chrome's developer tools, I found that fetching the data from zipkin-server is fast:
(image)

But in the Chrome timeline, I found that scripting is slow; a lot of time is spent executing this script:
(image)

I guess executing the jQuery JS is slow. Could this be optimized?

@codefromthecrypt
Member

codefromthecrypt commented Dec 29, 2016

yeah I think this could be optimized. @cburroughs are you interested in cracking knuckles on some performance work?

@cburroughs
Contributor

Could you give some details about the particular trace that is slow as a point of reference? How many spans does it have? Or how many KiB is the json?
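For anyone who wants to answer these questions for their own trace, here is a minimal sketch. It assumes the trace has been saved as a JSON array of span objects with `id` and an optional `parentId` field, and that parent links contain no cycles; the function name `trace_stats` is made up for illustration and is not part of Zipkin.

```python
import json
from typing import Dict, List, Optional


def trace_stats(spans: List[dict]) -> Dict[str, float]:
    """Return span count, maximum depth, and serialized JSON size in KiB.

    Assumes each span is a dict with an "id" and an optional "parentId",
    and that the parent links form a tree (no cycles).
    """
    parent_of: Dict[str, Optional[str]] = {s["id"]: s.get("parentId") for s in spans}
    depth_cache: Dict[str, int] = {}

    def depth(span_id: str) -> int:
        # Climb towards the root iteratively so very deep traces
        # do not hit Python's recursion limit.
        chain = []
        current = span_id
        while current in parent_of and current not in depth_cache:
            chain.append(current)
            current = parent_of[current]
        # A root (or a span whose parent is missing) gets depth 1.
        base = depth_cache.get(current, 0)
        for offset, sid in enumerate(reversed(chain), start=1):
            depth_cache[sid] = base + offset
        return depth_cache[span_id]

    max_depth = max((depth(s["id"]) for s in spans), default=0)
    size_kib = len(json.dumps(spans).encode("utf-8")) / 1024
    return {"span_count": len(spans), "max_depth": max_depth, "size_kib": size_kib}
```

Loading the saved trace with `json.load` and passing the resulting list to `trace_stats` gives the three numbers asked for above.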

@codefromthecrypt
Member

@cburroughs based on the screenshot, it looks like roughly 10k spans at a maximum depth of 7. I don't know the size of the JSON, though. @dragontree101?

@ghost

ghost commented Jun 20, 2017

@adriancole we've also experienced issues whereby a trace with over 3,000 spans ended up crashing the Chrome instance we were using to view the Zipkin UI. The UI was also pretty non-performant prior to the crash.

@codefromthecrypt
Member

The "10k span problem" was discussed at the most recent tracing workshop. We are not alone: for example, even Google's UI doesn't render traces with thousands of spans.

Here are some options:

  • Aggregate the spans and include their timing, possibly on the collection side.
    • This basically means adding a tag that says spans were dropped.
  • Drop spans and replace them with annotations.
    • Have the UI code convert large numbers of spans into annotations before they are rendered.
    • Note that large numbers of annotations could also crash the UI, so this should be tested.
  • Add a control to collapse parts of a trace that are too large.
    • Leave the span data alone, but collapse spans that have more than a certain number of children.
  • Add a dependency graph per trace.
    • When there are a lot of spans, a trace-ID-scoped dependency graph could be useful.
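To make the "collapse" option a bit more concrete, here is a rough sketch of how the list of rows to render could be computed. It is written in Python only to show the logic and is not Zipkin UI code; the function name `rows_to_render`, the `max_children` threshold, and the row dictionaries are all hypothetical. The span data itself is left untouched; only the rows handed to the renderer change.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, List


def rows_to_render(
    spans: List[dict],
    max_children: int = 100,
    expanded: FrozenSet[str] = frozenset(),
) -> List[dict]:
    """Flatten a trace into display rows, collapsing overly wide parents.

    A span with more than `max_children` direct children is followed by a
    single summary row instead of its subtree, unless its id is in `expanded`.
    """
    ids = {s["id"] for s in spans}
    children: Dict[str, List[dict]] = defaultdict(list)
    roots: List[dict] = []
    for span in spans:
        parent = span.get("parentId")
        if parent in ids:
            children[parent].append(span)
        else:
            # No parent, or the parent is missing from the trace.
            roots.append(span)

    rows: List[dict] = []
    # Explicit stack (depth-first, pre-order) to avoid recursion limits.
    stack = [(root, 0) for root in reversed(roots)]
    while stack:
        span, depth = stack.pop()
        rows.append({"depth": depth, "id": span["id"], "name": span.get("name", "")})
        kids = children.get(span["id"], [])
        if len(kids) > max_children and span["id"] not in expanded:
            # One placeholder row stands in for the whole subtree.
            rows.append({
                "depth": depth + 1,
                "id": None,
                "name": f"{len(kids)} child spans collapsed",
                "collapsed_parent": span["id"],
            })
            continue
        for kid in reversed(kids):
            stack.append((kid, depth + 1))
    return rows
```

A click on a placeholder row would add its `collapsed_parent` to `expanded` and recompute the rows, so the cost of rendering a wide subtree is only paid when someone asks for it.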

@Logic-32
Contributor

Logic-32 commented Jan 9, 2018

Have you considered a slightly more optimized rendering technique? I haven't profiled things myself but the places I'd start poking around are:

  1. Mustache templates; are they slow to compile/render at that scale?
  2. Lazy render; only add to the DOM what is visible and potentially start with things collapsed after a certain threshold.

Our use case, btw, is that we have a fairly involved data gathering/generation system that sometimes has to do a deep dive through various web services to get all of its data. We try to identify repeated requests and consolidate them into bulk requests (in code, not Zipkin) to alleviate this issue, but we still have it on occasion.

One of our biggest improvements in this area was to break up traces so that the typical "large traces" now link to "child traces" so that you can see at the high level how long things took or navigate off to another trace to see detail for a particular section.

@yurishkuro
Contributor

Tip: we were able to scale Jaeger UI to 50k spans by using a viewport technique.
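For readers unfamiliar with the idea, a viewport (or "windowing") technique generally means rendering only the rows that are currently scrolled into view, plus a small buffer. The sketch below shows the core index arithmetic under the assumption of fixed-height rows. It is a generic illustration in Python, not Jaeger's actual implementation, and the name `visible_range` and the `overscan` parameter are made up here.

```python
import math
from typing import Tuple


def visible_range(
    scroll_top: float,
    viewport_height: float,
    row_height: float,
    total_rows: int,
    overscan: int = 5,
) -> Tuple[int, int]:
    """Return the half-open range [first, last) of row indexes to render.

    Assumes every row has the same height. `overscan` adds a few extra rows
    above and below the viewport so fast scrolling does not show blank gaps.
    """
    if total_rows <= 0 or row_height <= 0:
        return (0, 0)
    # First row that intersects the top edge of the viewport.
    first = int(scroll_top // row_height) - overscan
    # One past the last row that intersects the bottom edge.
    last = math.ceil((scroll_top + viewport_height) / row_height) + overscan
    first = min(max(first, 0), total_rows)
    last = min(max(last, first), total_rows)
    return (first, last)
```

With 50,000 rows of 20 px each in a 600 px viewport, `visible_range(0, 600, 20, 50000)` covers only 35 rows, so the number of DOM nodes stays roughly constant no matter how large the trace is.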

@codefromthecrypt
Member

codefromthecrypt commented Jan 10, 2018 via email

@igorwwwwwwwwwwwwwwwwwwww
Contributor

In order to reproduce the issue it would be helpful to get a gist with a JSON blob of the trace, that can be POSTed against zipkin.

@codefromthecrypt
Member

@igorwwwwwwwwwwwwwwwwwwww I don't have a prebaked utility to make large traces, but this Lua script could probably be modified to do so: #1226 (comment)
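As an alternative starting point, here is a hedged sketch of a synthetic trace generator. It is not the Lua script referenced above. It assumes the Zipkin v2 JSON span format (`traceId`, `id`, `parentId`, `name`, `timestamp` and `duration` in microseconds, `localEndpoint.serviceName`); the function name `make_trace`, its defaults, and the service and span names are invented for illustration. The defaults mirror the roughly 10k spans at depth 7 mentioned earlier in this thread.

```python
import json
from typing import List


def make_trace(
    span_count: int = 10_000,
    max_depth: int = 7,
    trace_id: str = "000000000000000a",
    start_micros: int = 1_500_000_000_000_000,
) -> List[dict]:
    """Build a synthetic single-root trace as a list of v2-style span dicts."""
    if span_count < 1 or max_depth < 2:
        raise ValueError("need span_count >= 1 and max_depth >= 2")

    spans: List[dict] = []
    # levels[d] holds the ids of spans created so far at depth d (root = 0).
    levels: List[List[str]] = [[] for _ in range(max_depth)]
    for i in range(span_count):
        span_id = format(i + 1, "016x")  # 16 lower-hex characters
        if i == 0:
            depth = 0
        else:
            # Cycle through depths 1..max_depth-1 so every level is populated.
            depth = 1 + (i - 1) % (max_depth - 1)
        span = {
            "traceId": trace_id,
            "id": span_id,
            "name": f"op-depth-{depth}",
            "timestamp": start_micros + i,
            "duration": 1000,
            "localEndpoint": {"serviceName": f"service-{depth}"},
        }
        if depth > 0:
            # Attach to the most recently created span one level up.
            span["parentId"] = levels[depth - 1][-1]
        levels[depth].append(span_id)
        spans.append(span)
    return spans


if __name__ == "__main__":
    # Write the trace to a file that can be shared or uploaded.
    with open("large-trace.json", "w") as handle:
        json.dump(make_trace(), handle)
```

The resulting file could then be sent to a test server, for example with an HTTP POST of the JSON array to the v2 span endpoint (`/api/v2/spans`), assuming the server version in use accepts that format.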

@Logic-32
Contributor

Logic-32 commented Jan 16, 2018

@igorwwwwwwwwwwwwwwwwwwww, I have a trace with 9k spans in it that can probably be used for your purposes (with the aid of #1884). Is there a "private" way I can send it to you? There's nothing super-confidential in it, but enough that I don't want it to be public. Or did @adriancole's Lua script fit your need?

Edit: sent the trace :)

@codefromthecrypt
Member

Not sure this will change anything, but the next Zipkin release will update the jQuery lib, and we should retry: #1954

@jorgheymans
Contributor

This issue relates to the classic Zipkin UI. Lens has seen many performance improvements since then. Having tested with 10k-span traces myself, I see these loading in just 7-8 seconds on my laptop. UX-wise this can still be improved, sure, but it's definitely a major improvement over the classic UI.

If you're considering posting similar issues against Lens, please provide us with an anonymized version of your slow loading trace so we can benchmark and profile it.


7 participants