`Span::context()` should return value, not reference #74

isaachier · 2018-04-13T03:59:19Z

Jaeger tries to implement a multithreaded Span by keeping the SpanContext immutable (for the most part, details about sampling flag not relevant here). Whenever the baggage changes, the span swaps the current SpanContext instance with a new SpanContext object with an identical baggage map other than the appended key-value pair. The problem arises with the following code.

std::unique_ptr<opentracing::Span> span = tracer->StartSpan("operation");

// Thread #1
SpanContext ctx(span->context());

// Thread #2
span->SetBaggageItem("key", "value");

Here is the issue with this paradigm.

Thread #1: acquire lock on span
Thread #1: get reference to span's context
Thread #1: release lock on span
Thread #2: acquire lock on span
Thread #1: return reference to caller
Thread #2: swap span's context, overwriting initial context

The code above shows a potential interleaving of instructions in such a way that the caller of Span::context() is left with undefined behavior. There is no way for the caller to prevent the existing memory from being overwritten. I proposed changing the return type to a value so we can safely copy the existing span context into a new memory location in the caller's domain, eliminating all race conditions.

The text was updated successfully, but these errors were encountered:

rnburn · 2018-04-13T04:03:55Z

Why doesn't jaeger modify the existing span context when you set a baggage item?

isaachier · 2018-04-13T04:05:49Z

OK I see the LightStep C++ client uses a lock in the span context to avoid this (thanks @rnburn for reference here: https://github.com/lightstep/lightstep-tracer-cpp/blob/master/src/lightstep_span_context.h#L59). I wonder what @yurishkuro has to say about changing the C++ client to lock the baggage and maintaining a lock in the SpanContext instead of locking the Span alone.

isaachier · 2018-04-13T04:07:32Z

To be fair, the reason the C++ client is the way it is has to do with the Go client. The Go client does not have a concept of returning via reference, only pointer, and no one thought that was a good idea... so I'm wondering if we should follow the opentracing-go spec here.

yurishkuro · 2018-04-13T04:31:53Z

If the context is immutable it doesn't really matter if it's returned by value or by reference. In Go client the lock is held while reading the context (ref) from the span. I'm not seeing in your threading example what the problem is. If you have two threads doing read and write, there's no way to ensure which version the reader gets - it's a race in the usage, not in the implementation of the context () method. The potential memory leak is another story, but specific to C++.

isaachier · 2018-04-13T04:36:55Z

The problem runs a little deeper than that. The issue is potentially reading in the middle of the write to the same memory location. Replacing old context with new one is not an atomic operation. Caller may be handed a partially written location with some mixed state of old and new values.

isaachier · 2018-04-13T13:49:14Z

Even returning by value is impossible here given the slicing problem. Can we return a shared pointer @rnburn? It is probably a decent compromise for both of us even though I hate changing the standard.

rnburn · 2018-04-13T14:42:12Z

I'm still not clear why this is even a problem with the interface. What's the issue with implementing locking within the span context?

And if you want to support copying why not use this non-breaking approach #56?

isaachier · 2018-04-13T14:55:01Z

I'm waiting on @yurishkuro's opinion regarding changing the Jaeger implementation as that clearly resolves this too. About the shared pointer, that is more compatible with the current LightStep client than returning a copy would be. So I leave it up to you to decide if we can return by value instead.

yurishkuro · 2018-04-13T15:16:36Z

I don't think I understand the issue, still.

The issue is potentially reading in the middle of the write to the same memory location. Replacing old context with new one is not an atomic operation.

So we're not writing to the context since it's immutable. And replacing span.context field with a new instance is done inside the lock for the whole span. How is it not atomic?

isaachier · 2018-04-13T15:22:57Z

Wait a second, when I say immutable I just meant logically not practically. Welcome to the annoying world of C++ object assignment. I'll post the equivalent in Go code in a moment.

isaachier · 2018-04-13T15:37:43Z

type SpanContext struct {
    flags uint8
    baggage map[string]string
}

func (c *SpanContext) assign(newValue SpanContext) {
    c.flags = newValue.flags
    c.baggage = newValue.baggage
}

type Span struct {
    sync.Mutex
    // ...
    context SpanContext
}

func (s *Span) Context() *SpanContext {
    s.Lock()
    defer s.Unlock()
    return &s.context
}

func (s *Span) SetBaggageItem(k, v string) {
    s.Lock()
    defer s.Unlock()
    newBaggage := s.context.Baggage()
    newBaggage[k] = v
    context := s.context.WithBaggage(newBaggage)
    s.context.assign(context)
}

If the caller uses the pointer, he or she has no idea if the reassignment took place, did not take place, or is in the middle of taking place.

isaachier · 2018-04-13T15:45:47Z

I mean if we limit the code to the extent that you literally cannot do anything with it other that iterate over keys and values, as is the case now, we should be okay.

isaachier · 2018-04-13T15:52:59Z

Again, this will affect any code interested in doing IsSampled in Jaeger or whatever vendor-specific API call since the assignment is not atomic. Meaning, flags could be from a new value and baggage could be from an old one in the example above.

yurishkuro · 2018-04-13T16:01:36Z

func (c *SpanContext) assign(newValue SpanContext) {
    c.flags = newValue.flags
    c.baggage = newValue.baggage
}

this is not what the Go client does. Why can't the context be immutable? This ^ one clearly is not.

isaachier · 2018-04-13T16:08:21Z

Right but C++ assignment = really calls operator= which is exactly what assign does above. The reason for this is mostly the low level nature. C++ assignment works like this works in every other language:

int x = 1;
int y = 2;
x = y;  // Copy value of y to x in place.

isaachier · 2018-04-13T16:09:09Z

This can be solved with another lock in SpanContext but that sort of ruins the immutable aspect.

yurishkuro · 2018-04-13T16:19:13Z

once again, you returned a reference from context() and then you override its internal values in assign - it's not immutable context. Immutable context means you create a new instance in SetBaggage, or you return a copy from context().

rnburn · 2018-04-13T16:35:57Z

Having context() as a reference to the state of the span's context is really the most natural and efficient approach for C and C++. (The OpenTracing spec is suppose to be tweaked to match the target language, no?)

If you're going to change this for C++, then I expect you'd need to make a corresponding change for C. Code like this that you have in the README:

   span = tracer->start_span(tracer, "test-inject");
    return_code =
        tracer->inject_text_map(tracer,
                                (opentracing_text_map_writer*) &writer,
                                span->span_context(span));

would have to do something like this, right?

   span = tracer->start_span(tracer, "test-inject");
   opentracing_span_context* context = span->span_context(span); // Get a copy
    return_code =
        tracer->inject_text_map(tracer,
                                (opentracing_text_map_writer*) &writer,
                                context);
   free(context);

An approach like that that makes copies all the time when you want to reference the context is not going to be as performant. If you want an immutable copy, then let's make that explicit with something like a clone function for the span context (What's done in #56).

isaachier · 2018-04-13T16:43:59Z

Definitely! The copy semantics in C are even worse so I appreciate the reminder to change that as well. Unfortunately, true immutability means creating a new object in a new memory address every time you update the baggage. This is fundamentally difficult in C++ which limits dynamic memory. I imagine Rust will have an even harder time with this particular issue.

The copy helps but not unless the context method returns a copy. In that case, how would you handle that in LightStep? That goes back to the idea of returning a smart pointer (TBD std::unique_ptr or std::shared_ptr), if only because of the slicing problem. Inheritance hierarchies are always difficult in C++ when you want to return the superclass to conform with the API.

rnburn · 2018-04-13T16:48:02Z

Yes, but why do we need to interpret a SpanContext as immutable like that? If it's absolutely necessary to have that concept, then why don't we distinguish it with a type like ImmutableSpanContext?

isaachier · 2018-04-13T16:51:17Z

I think @yurishkuro has strong preference for immutability here. Reminds me of Clojure's persistent data structures. I find it elegant but difficult without causing a breaking change in the OT C++ spec.

isaachier · 2018-04-13T16:58:11Z

Even in Go I believe span.context = newValue can't be atomic unless span.context is itself a pointer.

yurishkuro · 2018-04-13T17:11:45Z

Even in Go I believe span.context = newValue can't be atomic unless span.context is itself a pointer.

It can be, under lock on the Span.

Anyway, the main reason for making the context immutable in Go was to avoid having to kep another lock in it, in addition to the lock on the span. I am not very happy with the read/write nature of baggage on the span because we already ran into logical race conditions with it. I.e. I would've preferred if the baggage could only be defined on span creation, making the span context trully immutable.

isaachier · 2018-04-13T17:36:27Z

I meant if someone has a pointer to the SpanContext and is not locking the Span when reading it, Go cannot guarantee the Span won't clobber the read by updating the context struct.

yurishkuro · 2018-04-13T18:09:55Z

Well out Go client simply doesn't do that - it returns the context by value, but even if we change it to return a pointer the write operation always replaces the context in the span, so the old pointer is unchanged.

isaachier · 2018-04-13T18:20:19Z

OK so I think it's probably better for me to lock the baggage to keep compatibility for any existing instrumentations. Unfortunately, SpanContext immutability would definitely break the code for any instrumented packages.

ringerc · 2018-08-31T05:40:32Z

My original thought when I brought this up was to make SpanContext copy-constructable via a factory interface exposed in the OpenTracing context, but this looks like it makes sense.

isaachier mentioned this issue Apr 13, 2018

Change Span::context() to return shared pointer #75

Closed

isaachier self-assigned this Apr 13, 2018

isaachier added the in progress label Apr 13, 2018

isaachier closed this as completed Apr 13, 2018

isaachier removed the in progress label Apr 13, 2018

This was referenced Apr 16, 2018

Add lock to SpanContext to protect baggage jaegertracing/jaeger-client-cpp#83

Merged

Clarify thread-safety of Span opentracing/specification#115

Open

ringerc mentioned this issue Aug 29, 2018

Offline builds (reproducibility, security) jaegertracing/jaeger-client-cpp#38

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

`Span::context()` should return value, not reference #74

`Span::context()` should return value, not reference #74

isaachier commented Apr 13, 2018 •

edited

Loading

rnburn commented Apr 13, 2018 •

edited

Loading

isaachier commented Apr 13, 2018 •

edited

Loading

isaachier commented Apr 13, 2018

yurishkuro commented Apr 13, 2018

isaachier commented Apr 13, 2018

isaachier commented Apr 13, 2018

rnburn commented Apr 13, 2018

isaachier commented Apr 13, 2018

yurishkuro commented Apr 13, 2018

isaachier commented Apr 13, 2018

isaachier commented Apr 13, 2018 •

edited

Loading

isaachier commented Apr 13, 2018

isaachier commented Apr 13, 2018

yurishkuro commented Apr 13, 2018

isaachier commented Apr 13, 2018

isaachier commented Apr 13, 2018

yurishkuro commented Apr 13, 2018

rnburn commented Apr 13, 2018

isaachier commented Apr 13, 2018

rnburn commented Apr 13, 2018

isaachier commented Apr 13, 2018

isaachier commented Apr 13, 2018

yurishkuro commented Apr 13, 2018

isaachier commented Apr 13, 2018

yurishkuro commented Apr 13, 2018

isaachier commented Apr 13, 2018

ringerc commented Aug 31, 2018

Span::context() should return value, not reference #74

Span::context() should return value, not reference #74

Comments

isaachier commented Apr 13, 2018 • edited Loading

rnburn commented Apr 13, 2018 • edited Loading

isaachier commented Apr 13, 2018 • edited Loading

isaachier commented Apr 13, 2018

yurishkuro commented Apr 13, 2018

isaachier commented Apr 13, 2018

isaachier commented Apr 13, 2018

rnburn commented Apr 13, 2018

isaachier commented Apr 13, 2018

yurishkuro commented Apr 13, 2018

isaachier commented Apr 13, 2018

isaachier commented Apr 13, 2018 • edited Loading

isaachier commented Apr 13, 2018

isaachier commented Apr 13, 2018

yurishkuro commented Apr 13, 2018

isaachier commented Apr 13, 2018

isaachier commented Apr 13, 2018

yurishkuro commented Apr 13, 2018

rnburn commented Apr 13, 2018

isaachier commented Apr 13, 2018

rnburn commented Apr 13, 2018

isaachier commented Apr 13, 2018

isaachier commented Apr 13, 2018

yurishkuro commented Apr 13, 2018

isaachier commented Apr 13, 2018

yurishkuro commented Apr 13, 2018

isaachier commented Apr 13, 2018

ringerc commented Aug 31, 2018

`Span::context()` should return value, not reference #74

`Span::context()` should return value, not reference #74

isaachier commented Apr 13, 2018 •

edited

Loading

rnburn commented Apr 13, 2018 •

edited

Loading

isaachier commented Apr 13, 2018 •

edited

Loading

isaachier commented Apr 13, 2018 •

edited

Loading