Zipkin 2.21

@codefromthecrypt codefromthecrypt released this 16 Apr 10:28
· 584 commits to master since this release

Zipkin 2.21 adds an archive trace feature to the UI and dynamic Elasticsearch credentials support.

Archive Trace

Sites like Yelp have multiple tiers of trace storage. One tier keeps 100% of traces at a short TTL (an hour or less). Later tiers use normal sampling, either via a rate limit or a low percentage. Some sites even have a tier that never expires. In such setups there is tension. For example, the 100% tier is great for customer support, as they can always see traces. However, those traces can expire in the middle of an incident, and who wants a dead link?! Even normal trace storage is typically bounded by a 3-7 day TTL. This becomes troublesome in issue trackers like Jira, as the link is, again, not permanent.

This problem is not as simple as it sounds. For example, it might seem correct to add tiering server-side. However, not all storage backends even support a TTL concept! Plus, multi-tiered storage is significantly harder to reason about at the code abstraction than at the HTTP abstraction. @drolando from Yelp detailed these pros and cons, as well as false starts, here.

The ultimate solution by Daniele, adopted by the Zipkin team in general, was to leverage HTTP rather than bind to specific implementation code. In other words, everything is handled in the browser. This allows the archive target to be not only Zipkin, but anything that speaks its format, including clones and vendors.

You can now set the following properties to add an "Archive Trace" button:

ZIPKIN_UI_ARCHIVE_POST_URL=https://longterm/api/v2/spans
ZIPKIN_UI_ARCHIVE_URL=https://longterm/zipkin/trace/{traceId}

These are intentionally different, as some vendors have a different read-back expression than Zipkin, even when their POST endpoint is the same.
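To illustrate the browser-side flow, here is a simplified sketch (not the actual zipkin-lens source): fetch the trace as Zipkin v2 JSON from the local instance, re-post it to the archive endpoint, then build the permanent link by expanding the `{traceId}` placeholder in the read-back template.

```javascript
// Expands the {traceId} placeholder in ZIPKIN_UI_ARCHIVE_URL
// to produce the permanent link shown to the user.
function archiveLink(urlTemplate, traceId) {
  return urlTemplate.replace('{traceId}', traceId);
}

// Sketch of the "Archive Trace" click handler: everything happens
// in the browser, so the target only needs to accept Zipkin's format.
async function archiveTrace(traceId, postUrl, urlTemplate) {
  // Read the trace from the local Zipkin API.
  const spans = await (await fetch(`/api/v2/trace/${traceId}`)).json();
  // Re-post the same JSON to the long-term storage endpoint.
  await fetch(postUrl, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(spans),
  });
  return archiveLink(urlTemplate, traceId);
}
```

Because the read-back link is a plain template, any backend with a different trace URL scheme than Zipkin's still works.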

When you click archive, you'll see a message like the one below confirming where the trace went, in case you want to copy/paste the link into your favorite issue tracker.

(screenshot: archive confirmation message)

Many thanks to @drolando for making this feature a reality, as well as to @anuraaga @jcchavezs @jeqo @shakuzen and @tacigar, who gave feedback that led to the polished end state.

Dynamic Elasticsearch Credentials

Our Elasticsearch storage component has supported properties-based credential changes since inception. While few knew about it, this applied to file-based configuration, not env variables. Even with this flexibility, those running a non-buffered transport (unlike Kafka) would drop data when there's a password change event. This is routinely the case, as HTTP and gRPC are quite common span transports. @basvanbeek noticed that even though the server restarts quickly (seconds or less), this drop is a bad experience. It is becoming commonplace for file-based updates to be pushed out through services like Vault, and the restart gap could be tightened.

@hanahmily stepped in to make reading passwords more dynamic, something pretty incredible for a first-time committer! At the end of two weeks of revisions, Gao landed a very simple option that covers the requirements, closing the auth-fail gap to under a second.

  • ES_CREDENTIALS_FILE: An absolute path to the credentials file.
  • ES_CREDENTIALS_REFRESH_INTERVAL: Refresh interval in seconds.
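As a sketch of how these fit together (the credentials file key names here are assumptions; check the zipkin-storage-elasticsearch docs for the exact format), the server re-reads the file on the configured interval, so a Vault-style sidecar can rotate the password in place:

```
# Environment for the Zipkin server
ES_CREDENTIALS_FILE=/secrets/es-credentials
ES_CREDENTIALS_REFRESH_INTERVAL=5

# /secrets/es-credentials, a properties file re-read every 5 seconds.
# Key names are illustrative; see the Elasticsearch storage README.
username=zipkin
password=changeme
```

When a rotation tool rewrites `/secrets/es-credentials`, the new password is picked up within the refresh interval, with no server restart.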

Thanks very much to @anuraaga and @jorgheymans for all the feedback and teamwork to get this together.

Minor changes

  • @jorgheymans added instructions for running Lens behind a reverse proxy
  • @tacigar @anuraaga @drolando fixed a few UI glitches
  • ES_ENSURE_TEMPLATES=false is now available for sites that manage Elasticsearch schema offline
  • ZIPKIN_UI_DEPENDENCY_ENABLED=false is now available for sites that will never run zipkin-dependencies
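All of the options in this release are plain env variables, so a minimal sketch of wiring them up (assuming the openzipkin/zipkin Docker image; adjust names and URLs for your site) looks like:

```shell
# Hypothetical example: archive buttons plus offline-managed ES schema.
docker run -d -p 9411:9411 \
  -e ZIPKIN_UI_ARCHIVE_POST_URL=https://longterm/api/v2/spans \
  -e ZIPKIN_UI_ARCHIVE_URL='https://longterm/zipkin/trace/{traceId}' \
  -e ES_ENSURE_TEMPLATES=false \
  -e ZIPKIN_UI_DEPENDENCY_ENABLED=false \
  openzipkin/zipkin
```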