cpp-client loses spans on short-lived processes; Close() not flushing buffers? #52
Comments
Related #53
The RemoteReporter buffers spans, but didn't flush them on close. So any spans Finish()ed between the last flush interval and the Close() of the Tracer would be lost. Fixes jaegertracing#52 Signed-off-by: Craig Ringer <[email protected]>
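To make the failure mode in that commit message concrete, here is a minimal, self-contained sketch. It is not the jaeger-client-cpp code: `Span`, `Sender`, and `BufferingReporter` are hypothetical stand-ins, and the `flushOnClose` switch exists only to contrast the two behaviours (dropping the buffer on close versus draining it first).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a finished span; not the jaeger-client-cpp type.
struct Span {
    std::string operationName;
};

// Hypothetical sink that records what would have reached the collector.
struct Sender {
    std::vector<Span> sent;

    void send(std::vector<Span>& batch) {
        for (auto& span : batch) {
            sent.push_back(std::move(span));
        }
        batch.clear();
    }
};

// Toy reporter: spans are buffered and only sent when flush() runs.
// In a real client a background thread would call flush() on a timer.
class BufferingReporter {
  public:
    BufferingReporter(Sender& sender, bool flushOnClose)
        : _sender(sender), _flushOnClose(flushOnClose) {}

    void report(Span span) { _buffer.push_back(std::move(span)); }

    void flush() { _sender.send(_buffer); }

    // Returns how many buffered spans were dropped by this close().
    std::size_t close() {
        if (_flushOnClose) {
            flush();
            assert(_buffer.empty());  // a flush leaves nothing buffered
        }
        const std::size_t dropped = _buffer.size();
        _buffer.clear();
        return dropped;
    }

  private:
    Sender& _sender;
    bool _flushOnClose;
    std::vector<Span> _buffer;
};
```

With `flushOnClose` set to `false`, a span reported after the last timer flush is counted as dropped and never reaches the `Sender`, which mirrors the behaviour reported in this issue. With it set to `true`, `close()` returns 0 and every reported span ends up in `Sender::sent`.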
It probably makes sense to log dropped buffers on close instead, with a hint that an explicit flush before close is advised (using #53 for flush). But that'd really want the opentracing API to adopt the Flush method.
The I think the right solution is for jaeger to move the current code in |
The RemoteReporter loses spans during short-lived executions (jaegertracing#52). This is a test case not a fix. Signed-off-by: Craig Ringer <[email protected]>
The RemoteReporter loses spans during short-lived executions (jaegertracing#52). This is a test case not a fix. This variant is on top of my integration branch for my feature branches. Signed-off-by: Craig Ringer <[email protected]>
@rnburn Makes sense. I'll amend my push that adds an optional @isaachier I've written a small test program to reproduce this bug. It reliably loses spans. ringerc@2a36dd5 (on master) or ringerc@ff654d9 (on my fix/feature integration branch, but not using flush).
in |
Additional test runs show that without If I uncomment Anyway, current behaviour clearly broken. |
The RemoteReporter loses spans during short-lived executions (jaegertracing#52). This is a test case not a fix. This variant is on top of my integration branch for my feature branches. This variant is enhanced with arguments to control flush mode and sleep after span finish. Signed-off-by: Craig Ringer <[email protected]>
I pushed an improved test case (ringerc@e5a6234). It clearly shows that we lose spans unless:
The flush must actually take effect. Sometimes it seems to get missed;
in which case the span is also lost. So there's something wrong with my flush method. Locking issues? Lowering the configured @isaachier I could use advice at this point, my C++ and threading is ... limited. I also have no idea how to integrate this into the regression tests. Crossdock somehow? |
Extend the OpenTracing API with an explicit jaegertracing::Tracer::flush() method to force spans to be flushed eagerly without closing the tracer. It returns only when the spans are flushed. Fixes jaegertracing#53 Call flush() from Close(), but not from the Tracer dtor. So we follow the spec and ensure we flush buffers on explicit Close only. Fixes jaegertracing#52 WIP. For as-yet-undiagnosed reasons `flush()` sometimes waits a full bufferFlushInterval before returning. Signed-off-by: Craig Ringer <[email protected]>
Extend the OpenTracing API with an explicit jaegertracing::Tracer::flush() method to force spans to be flushed eagerly without closing the tracer. It returns only when the spans are flushed. To support this a new condition variable is introduced in the reporter to allow the main thread to wait on notification from the reporter flush thread. Fixes jaegertracing#53 Call flush() from Close(), but not from the Tracer dtor. So we follow the spec and ensure we flush buffers on explicit Close only. Fixes jaegertracing#52 Signed-off-by: Craig Ringer <[email protected]>
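The handshake described in that commit message (the caller waits on a condition variable until the reporter's flush thread signals completion) can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual patch: `FlushingReporter`, its members, and the request/completion counters are all hypothetical, spans are plain strings, and "sending" just moves them into an in-memory vector while the lock is held.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical reporter with a background flush thread.
class FlushingReporter {
  public:
    explicit FlushingReporter(std::chrono::milliseconds interval)
        : _interval(interval), _worker([this] { run(); }) {}

    ~FlushingReporter() { close(); }

    void report(std::string span) {
        std::lock_guard<std::mutex> lock(_mutex);
        _buffer.push_back(std::move(span));
    }

    // Blocks until every span buffered before this call has been sent.
    void flush() {
        std::unique_lock<std::mutex> lock(_mutex);
        const std::uint64_t target = ++_requested;
        _wake.notify_one();
        _done.wait(lock, [&] { return _completed >= target; });
    }

    // Drains the buffer one last time, then stops the flush thread.
    void close() {
        {
            std::lock_guard<std::mutex> lock(_mutex);
            if (_closing) return;
            _closing = true;
        }
        _wake.notify_one();
        if (_worker.joinable()) _worker.join();
    }

    std::size_t sentCount() {
        std::lock_guard<std::mutex> lock(_mutex);
        return _sent.size();
    }

  private:
    void run() {
        std::unique_lock<std::mutex> lock(_mutex);
        while (true) {
            // Wake on the timer, on a pending flush request, or on close.
            _wake.wait_for(lock, _interval, [this] {
                return _closing || _completed < _requested;
            });
            const std::uint64_t target = _requested;
            // A real reporter would release the lock around network I/O.
            for (auto& span : _buffer) _sent.push_back(std::move(span));
            _buffer.clear();
            assert(_buffer.empty());
            _completed = target;
            _done.notify_all();
            if (_closing) return;
        }
    }

    std::chrono::milliseconds _interval;
    std::mutex _mutex;
    std::condition_variable _wake;  // caller -> flush thread
    std::condition_variable _done;  // flush thread -> caller
    std::vector<std::string> _buffer;
    std::vector<std::string> _sent;
    std::uint64_t _requested = 0;
    std::uint64_t _completed = 0;
    bool _closing = false;
    std::thread _worker;  // declared last so it starts after the other members
};
```

Both waits use a predicate over state guarded by the mutex, so a notification sent before the other side starts waiting is not lost, and `flush()` in this sketch does not have to wait out the timer interval. Whether that matches the root cause of the delay mentioned in the earlier WIP commit is not established here.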
In general there might be a problem with blocking IO and threading. I'm about to call it a night, but will try to look at this tomorrow. Worst case, will be able to look over the weekend.
I think I have a simpler approach for this specific issue. About the flush method being exposed overall, that can be a new PR. See #59 for first fix.
If a process:
the span should be reported to the collector. But it isn't unless a short sleep is inserted before the `tracer->Close()` call.
If the `sleep()` is added after the tracer close, the spans are not sent. It looks like Close() must be failing to flush buffers, instead dropping any pending buffers on the floor.
Configuration, loaded with the YAML config support: