[APM] Time shifting in DT #24072

sorenlouv · 2018-10-16T11:20:17Z

Clock skew occurs when servers in a distributed system do not agree on the current time. This is a problem for distributed tracing, since timestamps are critical for visualizing the relationships between transactions and spans across services.

Minimizing clock skew
Time for RUM agent events is inherently skewed, since the timestamp is recorded after the fact by the apm-server.

Clock skew for other agents can be mitigated, but not eliminated, by something like NTP. According to one source, a default NTP setup polls the NTP server at an interval of between 64 and 1024 seconds. NTP is not perfect and can have problems of its own, but it is probably good enough in most cases.

Questions:

Should agents and apm-server detect and warn if NTP is not enabled?
Should agents and apm-server make a sanity time check against [some magic server]?
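For the sanity-check question, a minimal sketch of how an offset against a reference clock could be estimated from a single round trip, using the standard NTP-style four-timestamp formula. The `TimeProbe` shape, the function names, and the 1000 ms threshold are assumptions made for illustration; nothing here is an existing agent or apm-server API, and the estimate is only as good as the assumption that network delay is symmetric.

```typescript
// Hypothetical shape: the four timestamps of one request/response round trip, in ms.
interface TimeProbe {
  clientSend: number;    // t0, read from the local clock
  serverReceive: number; // t1, read from the reference clock
  serverSend: number;    // t2, read from the reference clock
  clientReceive: number; // t3, read from the local clock
}

// Estimated offset of the reference clock relative to the local clock.
// Assumes the network delay is the same in both directions.
function estimateOffsetMs(p: TimeProbe): number {
  return ((p.serverReceive - p.clientSend) + (p.serverSend - p.clientReceive)) / 2;
}

// The threshold is a made-up default; this issue does not specify one.
function isSkewSuspicious(p: TimeProbe, thresholdMs: number = 1000): boolean {
  return Math.abs(estimateOffsetMs(p)) > thresholdMs;
}
```

A check like this could only produce a warning; it would not by itself make the recorded timestamps trustworthy.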

Proposed solution
In the event of clock skew, there is not much the agents or apm-server can do. Instead I propose that the UI should ensure that the positioning of events at the very least doesn't conflict with the specified parent-child relationship. Meaning: an event should never start before the parent that initiated it did.

In the following example we have two services: opbeans-node and opbeans-node-api with three events:

opbeans-node: initiates the trace (transaction)
opbeans-node: makes an outgoing request to opbeans-node-api (span)
opbeans-node-api: receives the request (transaction)

In the following example, the clock of opbeans-node-api is behind the clock in opbeans-node, which causes the transaction in opbeans-node-api to appear to start before the request from opbeans-node has been made:

If we assume zero latency between the two services, we can adjust the misaligned transaction to start at the same time as its parent span:

Above is a very simple example, and a big trace might require all downstream children to be re-adjusted when their parent is adjusted. There are probably a lot of gotchas I haven't thought about, but I think the UI needs to solve these issues regardless.
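The simple adjustment, applied recursively so that children follow an adjusted parent, could be sketched roughly as below. The `TraceEvent` shape and `alignToParents` are hypothetical names for illustration, not the APM UI's actual types; the sketch assumes zero latency and an acyclic parent chain.

```typescript
// Hypothetical event shape; times are in milliseconds.
interface TraceEvent {
  id: string;
  parentId?: string;
  start: number;
  duration: number;
}

// Returns an adjusted start time per event id, such that no event
// starts before the (adjusted) start of its parent.
function alignToParents(events: TraceEvent[]): Record<string, number> {
  // Prototype-less objects, so that any id string is a safe key.
  const byId: Record<string, TraceEvent> = Object.create(null);
  const adjusted: Record<string, number> = Object.create(null);
  for (const e of events) byId[e.id] = e;

  function resolve(e: TraceEvent): number {
    const cached: number | undefined = adjusted[e.id];
    if (cached !== undefined) return cached;
    const parent: TraceEvent | undefined =
      e.parentId !== undefined ? byId[e.parentId] : undefined;
    // Zero-latency assumption: a child may start exactly when its parent starts.
    const start = parent ? Math.max(e.start, resolve(parent)) : e.start;
    adjusted[e.id] = start;
    return start;
  }

  for (const e of events) resolve(e);
  return adjusted;
}

const trace: TraceEvent[] = [
  { id: "a", start: 0, duration: 100 },                // opbeans-node transaction
  { id: "b", parentId: "a", start: 20, duration: 60 }, // outgoing request span
  { id: "c", parentId: "b", start: 5, duration: 40 },  // opbeans-node-api transaction (skewed)
  { id: "d", parentId: "c", start: 10, duration: 10 }, // span inside c
  { id: "e", parentId: "c", start: 30, duration: 10 }, // span inside c
];
console.log(alignToParents(trace));
```

In this made-up trace, `c` and `d` are both moved to 20 while `e` stays at 30, so `d` and `e`, recorded 20 ms apart, end up 10 ms apart. Clamping each event individually is one of the gotchas in a larger trace.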

elasticmachine · 2018-10-16T11:20:18Z

Pinging @elastic/apm-ui

sorenlouv · 2018-10-25T12:48:05Z

Original (correct) timeline:

A span is shifted to the right to simulate clock skew:

To correct clock skew, affected child spans are shifted to the right:

Issues:

Relative distance between child spans is not preserved
The simple time shifting doesn't take the end time of the parent into account (currently children are shifted past the end of the transaction)
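One possible way to address both issues, sketched under stated assumptions: compute a single skew value per transaction, let its spans inherit that value unchanged (so distances within a service are preserved), and when the transaction fits inside its parent, center it there instead of pinning it to the parent's start. The `WaterfallItem` shape and `computeSkews` are hypothetical names, and the sketch assumes that a span is timed by the same clock as its parent, while a transaction with a parent crosses a service boundary.

```typescript
// Hypothetical item shape; times are in milliseconds.
interface WaterfallItem {
  id: string;
  parentId?: string;
  kind: "transaction" | "span";
  start: number;
  duration: number;
}

// Returns the number of milliseconds each item should be shifted to the right.
function computeSkews(items: WaterfallItem[]): Record<string, number> {
  // Prototype-less objects, so that any id string is a safe key.
  const byId: Record<string, WaterfallItem> = Object.create(null);
  const skews: Record<string, number> = Object.create(null);
  for (const item of items) byId[item.id] = item;

  function skewOf(item: WaterfallItem): number {
    const cached: number | undefined = skews[item.id];
    if (cached !== undefined) return cached;
    const parent: WaterfallItem | undefined =
      item.parentId !== undefined ? byId[item.parentId] : undefined;
    let skew = 0;
    if (parent) {
      const parentSkew = skewOf(parent);
      if (item.kind === "span") {
        // Assumed to share the parent's clock: move together with it.
        skew = parentSkew;
      } else {
        const offset = parent.start + parentSkew - item.start;
        if (offset > 0) {
          // The child starts before its parent: shift it and split the
          // parent's spare time evenly before and after the child.
          const slack = Math.max(parent.duration - item.duration, 0);
          skew = offset + slack / 2;
        }
      }
    }
    skews[item.id] = skew;
    return skew;
  }

  for (const item of items) skewOf(item);
  return skews;
}

const waterfall: WaterfallItem[] = [
  { id: "a", kind: "transaction", start: 0, duration: 100 },
  { id: "b", kind: "span", parentId: "a", start: 20, duration: 60 },
  { id: "c", kind: "transaction", parentId: "b", start: 5, duration: 40 },
  { id: "d", kind: "span", parentId: "c", start: 10, duration: 10 },
  { id: "e", kind: "span", parentId: "c", start: 30, duration: 10 },
];
console.log(computeSkews(waterfall));
```

In this made-up trace, `c` is shifted by 25 ms to run from 30 to 70, centered within `b` (20 to 80), and `d` and `e` are shifted by the same 25 ms, so they stay 20 ms apart. A child that is longer than its parent would still overflow the parent's end; the sketch does not try to solve that case.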

sorenlouv added the Team:APM and [zube]: Inbox labels Oct 16, 2018

sorenlouv added [zube]: Impl Ready and removed [zube]: Inbox labels Oct 16, 2018

roncohen added the v6.5.0 label Oct 23, 2018

alvarolobato assigned sorenlouv Oct 25, 2018

alvarolobato added [zube]: In Progress and removed [zube]: Impl Ready labels Oct 25, 2018

alvarolobato added [zube]: Impl Ready and removed [zube]: In Progress labels Oct 26, 2018

alvarolobato mentioned this issue Oct 30, 2018

[APM] Timeline items are out of order #24591

Closed

sorenlouv closed this as completed Nov 1, 2018

sorenlouv added [zube]: Done and removed [zube]: In Review labels Nov 1, 2018

sorenlouv mentioned this issue Nov 1, 2018

[APM] Clock skew fix sorenlouv/kibana#3

Merged

alvarolobato removed the [zube]: Done label Nov 7, 2018

sorenlouv mentioned this issue Apr 30, 2019

[APM] Relax time-shifting so it doesn't affect async transactions #35800

Closed
