Open Tracing Diagram

Use traces to describe a microservices architecture.

The problem

Microservices architectures are complex, and it is never easy to write and maintain documentation for them. Add the asynchronous messages heavily used in reactive applications, and it becomes almost unrealistic to describe how such a system works. It is a challenge even in a monolithic system. So how can we help a developer improve or fix a system when there is nothing to guide them except the system itself?

The idea

With initiatives such as OpenTracing or OpenCensus, traces are easy to acquire in the microservices world (provided each component is compatible with those frameworks). So what about using traces in a bottom-up approach to infer the architecture? However, unless we know exactly what we are looking for in a large amount of traces, it is quite hard to extract the right information: too many traces kill the traces. We need something on top to make sense of them.

The beginning of a solution

This project is a PoC that moves toward that goal. It builds a sequence diagram from Jaeger traces and displays it in a Grafana diagram panel, leveraging mermaid to render the diagram. It's not much, but it can be useful. Disclaimer: the initial idea comes from https://danlebrero.com/2017/04/06/documenting-your-architecture-wireshark-plantuml-and-a-repl/
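
To make the trace-to-diagram idea concrete, here is a minimal sketch in Python (the project itself does this in Clojure). It assumes the trace is shaped like the JSON returned by the Jaeger query service (`processes`, `spans`, `references` with `CHILD_OF`); the function name `trace_to_mermaid` and the sample data are hypothetical and are not taken from this repository.

```python
def trace_to_mermaid(trace):
    """Render one Jaeger-style trace dict as Mermaid sequence diagram text."""
    # Map each process id to its service name (assumed Jaeger JSON layout).
    services = {pid: p["serviceName"] for pid, p in trace["processes"].items()}
    spans = {s["spanID"]: s for s in trace["spans"]}

    lines = ["sequenceDiagram"]
    # Walk spans in start order so messages appear chronologically.
    for span in sorted(trace["spans"], key=lambda s: s["startTime"]):
        callee = services[span["processID"]]
        parents = [r["spanID"] for r in span.get("references", [])
                   if r.get("refType") == "CHILD_OF"]
        parent = spans.get(parents[0]) if parents else None
        if parent is None:
            continue  # root span: no caller to draw an arrow from
        caller = services[parent["processID"]]
        lines.append(f"    {caller}->>{callee}: {span['operationName']}")
    return "\n".join(lines)


# Hypothetical two-span trace, for illustration only.
example_trace = {
    "processes": {"p1": {"serviceName": "frontend"},
                  "p2": {"serviceName": "storage"}},
    "spans": [
        {"spanID": "a", "processID": "p1", "operationName": "request",
         "startTime": 1, "references": []},
        {"spanID": "b", "processID": "p2", "operationName": "save",
         "startTime": 2,
         "references": [{"refType": "CHILD_OF", "spanID": "a"}]},
    ],
}
print(trace_to_mermaid(example_trace))
```

Each non-root span becomes one arrow from the service of its parent span to its own service. Service names containing spaces or special characters would probably need Mermaid participant aliases, which this sketch does not handle.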

The Implementation

The transformation is done in Clojure: a single file, handler.clj, of a few hundred lines, including the Swagger UI and the Grafana data source API. For Grafana to access trace metadata such as service names, an API compatible with the Grafana simple json datasource is needed.
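
As a rough illustration of what "compatible with the Grafana simple json datasource" involves, the sketch below dispatches the three endpoints that this datasource calls (`/`, `/search`, `/query`). It is written in Python rather than Clojure and is not the code of handler.clj: the `handle` function, the `SERVICES` list and the placeholder table content are assumptions made for the example.

```python
SERVICES = ["frontend", "storage"]  # hypothetical service names


def handle(path, body=None):
    """Return (status, payload) for a SimpleJson-style request."""
    if path == "/":
        return 200, "ok"  # health check used when testing the datasource
    if path == "/search":
        return 200, SERVICES  # values Grafana can offer, e.g. service names
    if path == "/query":
        targets = [t["target"] for t in (body or {}).get("targets", [])]
        # One table per requested target; the cell holds a placeholder
        # where a real handler would put the generated diagram text.
        return 200, [{"type": "table",
                      "columns": [{"text": "diagram", "type": "string"}],
                      "rows": [[f"placeholder diagram for {t}"]]}
                     for t in targets]
    return 404, "not found"


print(handle("/search"))
```

Keeping the dispatch as a pure function makes it easy to plug into any HTTP server and to check without starting one.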

The traces are generated by akka.opentracing, an extension of akka.net (an actor framework for .NET derived from Akka on the JVM) that makes it easy to trace all messages exchanged by actors. The example generates fake requests every 5 seconds that trigger jobs, sub-processes, storage operations... There is no need to know the details: the idea is to discover and understand them from the traces.

How to run it

git clone https://github.com/alexvaut/OpenTracingDiagram.git
cd OpenTracingDiagram
docker-compose up

Wait a bit for all the containers to be up and then:

  • Browse to http://localhost:16686 to see traces in Jaeger.
  • Browse to http://localhost:3000 to see the service transformation API in a Swagger UI (allow a couple of seconds for the Clojure Ring server to start).
  • Browse to http://localhost:3001/d/mddcLWmWk/sequence-diagram-from-traces to see the Grafana dashboard where sequence diagrams are displayed. Click on refresh (top-right of the Grafana screen) to display a new sequence diagram: Jaeger returns random traces when filtering them. By default only one trace is displayed; you can increase this limit on the dashboard through a Grafana variable.
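
The one-trace limit mentioned above can be pictured as a query parameter. The sketch below builds a URL for the HTTP endpoint that the Jaeger UI itself uses (`/api/traces` with `service` and `limit`). Whether this project calls that exact endpoint is an assumption, and `jaeger_traces_url` and the service name `frontend` are hypothetical.

```python
from urllib.parse import urlencode


def jaeger_traces_url(service, limit=1, base="http://localhost:16686"):
    """Build a trace search URL for one service, capped at `limit` traces."""
    query = urlencode({"service": service, "limit": limit})
    return f"{base}/api/traces?{query}"


print(jaeger_traces_url("frontend", limit=5))
```

Raising `limit` here corresponds to raising the Grafana variable on the dashboard: more traces come back, so more diagrams can be rendered per refresh.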

Screenshots

A sequence diagram rendered in Grafana (image: Capture1)

The information recorded for each span (image: CaptureDetails)

One trace to one diagram

One trace made of several spans in Jaeger (image: CaptureJ)

The same trace rendered as a sequence diagram (image: Capture2)

Demo

From docker-compose to a sequence diagram in Grafana (image: Demo)

Future

Many ideas can emerge from this work. Some of them:

  • Provide more inputs (like the Jaeger UI) and link the two UIs.
  • Improve diagrams (mermaidjs is quite limited).
  • Cluster traces to extract the most common sequence diagrams.
  • Build dependency diagrams between components (kind of available in Jaeger already).
  • Use metrics on components (from Prometheus, for instance; this is where the merge of OpenCensus and OpenTracing should help) to:
    • Focus the architecture description on heavily used components, long processing...
    • Color/format messages and components.
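
The clustering idea in the list above could start very simply. The sketch below is not real clustering: it only counts traces whose message sequences are exactly identical, as a first approximation of "most common sequence diagrams". The representation of a trace as a list of `(caller, callee, operation)` tuples and the sample data are assumptions.

```python
from collections import Counter


def most_common_sequences(traces, top=3):
    """Count identical message sequences and return the most frequent ones."""
    # A tuple of messages is hashable, so it can serve as the trace signature.
    return Counter(tuple(t) for t in traces).most_common(top)


# Hypothetical traces, each reduced to its ordered list of messages.
traces = [
    [("frontend", "jobs", "start"), ("jobs", "storage", "save")],
    [("frontend", "jobs", "start"), ("jobs", "storage", "save")],
    [("frontend", "storage", "read")],
]
for sequence, count in most_common_sequences(traces):
    print(count, sequence)
```

A more tolerant grouping (ignoring repeated messages or ordering differences between concurrent actors, for instance) would need a fuzzier signature than exact equality.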