Add Context to Record #201

Closed
wants to merge 5 commits into from
@yianni yianni commented Sep 12, 2022

In this pull request, we add a Context to the Record type. The context makes it possible to enrich a record with data. Our use case is observing requests and events as they propagate through Kafka and the rest of our distributed environment, also known as the "distributed tracing" pattern[1].

The following snippet demonstrates how an upstream process propagates its context to franz-go. In this example, we would like to start tracing HTTP requests when using an HTTP server. The HTTP request context includes tracing data. This context is passed to the Kafka producer. We made an update to produce that checks for context and enriches the Kafka record. From there, a hook can reference the context and continue tracing the request.

func fooHandler(w http.ResponseWriter, req *http.Request, cl *kgo.Client) {
  ctx := req.Context() // Context here has tracing data
  span := trace.SpanFromContext(ctx)
  span.SetName("fooHandler")
  defer span.End()

  record := &kgo.Record{Topic: "my-topic", Value: []byte("foo")}
  var wg sync.WaitGroup
  wg.Add(1)

  cl.Produce(ctx, record, func(_ *kgo.Record, err error) {
    defer wg.Done()
    if err != nil {
      fmt.Printf("record had a produce error: %v\n", err)
      span := trace.SpanFromContext(ctx)
      span.SetStatus(codes.Error, err.Error())
      span.RecordError(err)
      ...
    }
  })

  wg.Wait()

  ...
}

Other libraries use the pattern of adding a context to entities such as HTTP[2]. Having the context on the record allows for instrumenting code via existing hooks. We demonstrate a POC of instrumentation in PR#200[3].

Links:
[1] https://microservices.io/patterns/observability/distributed-tracing.html
[2] https://cs.opensource.google/go/go/+/refs/tags/go1.19.1:src/net/http/request.go;l=355-363
[3] #200

CC: @tobiasbrodersen @brunsgaard

@twmb, what are your thoughts on this proposal?

@yianni yianni changed the title Produce record context Add Context to Record Sep 12, 2022
@@ -139,6 +146,25 @@ type Record struct {
Offset int64
}

// WithContext enriches the Record with a Context.
func (r *Record) WithContext(ctx context.Context) *Record {
Owner

What's the use case for WithContext? I think this would only be necessary for producing, and both Produce and ProduceSync already accept a context. It may be less confusing to set r.ctx = ctx inside of Produce, rather than having two places for setting the context and people not knowing which one actually controls cancellation, etc.

Contributor Author

The use case is to enrich consumer records. In a consumer hook, we would extract trace data from the Kafka headers into a context and enrich the record with that context. This makes propagating trace data very convenient and saves users the hassle of doing it manually. They would use the consumer record's context to do further tracing in their consumer logic.

The consumer hook implementation would look like this:

func (k *Kotel) OnFetchRecordBuffered(r *kgo.Record) {
	textMapPropagator := propagation.NewCompositeTextMapPropagator(propagation.TraceContext{})
	ctx := textMapPropagator.Extract(context.Background(), NewRecordCarrier(r))
	r = r.WithContext(ctx)
	...
}

Understandably, having two areas for setting the context is confusing. We are open to any suggestions you see necessary.
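
To make the consumer-side flow concrete, here is a minimal sketch using only the standard library. `Record` and `Header` are stand-ins for the kgo types, `traceKey` and `handle` are hypothetical, and a plain string under the W3C `traceparent` header name stands in for what an OpenTelemetry propagator would extract.

```go
package main

import (
	"context"
	"fmt"
)

// Header and Record are stand-ins for the kgo types, reduced for this sketch.
type Header struct {
	Key   string
	Value []byte
}

type Record struct {
	Headers []Header
	Context context.Context
}

// traceKey is a hypothetical context key used only in this sketch.
type traceKey struct{}

// traceHeader is the W3C Trace Context header name.
const traceHeader = "traceparent"

// onFetchRecordBuffered mimics a consumer hook: it copies trace data from the
// record headers into a fresh context and attaches that context to the record.
func onFetchRecordBuffered(r *Record) {
	ctx := context.Background()
	for _, h := range r.Headers {
		if h.Key == traceHeader {
			ctx = context.WithValue(ctx, traceKey{}, string(h.Value))
			break
		}
	}
	r.Context = ctx
}

// handle is the user's consumer logic; it continues the trace if one exists.
func handle(r *Record) string {
	id, ok := r.Context.Value(traceKey{}).(string)
	if !ok {
		return "no trace"
	}
	return "continuing trace " + id
}

func main() {
	r := &Record{Headers: []Header{{Key: traceHeader, Value: []byte("abc")}}}
	onFetchRecordBuffered(r)
	fmt.Println(handle(r))
}
```

The point of the sketch is that the hook runs before user code sees the record, so the record itself has to carry the context.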

Owner

I see, I was hoping that if I initialized the context internally, there would not need to be any public setter.

Contributor

Maybe you could provide a mechanism for setting the Context, such as kgo.WithContextSetter(func(r *kgo.Record) context.Context), or a hook with a context return value?

Owner

I'm leaning toward adding a public Context (or Ctx) field, rather than a getter / setter. If only read access were needed, a method would be better, but write access complicates things.

Contributor

@brunsgaard brunsgaard Sep 19, 2022

@twmb Okay, that works. I still think it is important that the context is set on the record in the produce call. Otherwise it will be impossible to access the produce context from the produce-related hooks, and thus impossible to create and end spans.

Thanks for thinking about this. We understand that adding a field to the Record struct for the OpenTelemetry use case is controversial, but I don't really think there is any other way if franz-go is to support OpenTelemetry seamlessly.


yianni commented Sep 23, 2022

Hello @twmb, we made some changes based on your feedback. Thank you for your consideration and review of this pull request.

@yianni yianni marked this pull request as ready for review September 23, 2022 15:03
@brunsgaard
Contributor

Great that you updated this. @twmb, I hope this aligns with your expectations.

@@ -364,6 +364,8 @@ func (cl *Client) produce(
) {
if ctx == nil {
ctx = context.Background()
} else {
r.Context = ctx
Owner

@twmb twmb Sep 26, 2022

I think we want to set the context unconditionally to whatever ctx is; how about dropping the else? Edit: pause on this, I need to ask a clarifying question. I'm not sure a record's context should be set here at all.

@twmb twmb mentioned this pull request Sep 27, 2022
8 tasks
twmb pushed a commit that referenced this pull request Oct 6, 2022
See #201 and #200.

Some packages want to stuff information into a context with WithValue,
and then that information can be used to trace spans (see the open
telemetry packages). Without a Context field or something equivalent, it
is not possible to trace spans once a record is handed off to franz-go.
We want to use these spans in franz-go hooks.

We do not need to do anything on the consumer side, but unfortunately,
we need the Context field to be **writable** so that consumers can
initialize the context (and stuff information into it) with WithValue.
So, we just add a new field to the record.

twmb commented Oct 6, 2022

I've squashed the four commits here into one, moved Context to the end of the Record struct, modified the doc a tiny bit, and pushed this in 1a13af8.

Thanks! Sorry for the delay on this one. Still trying to chase down if some things are bugs in this library that need to be addressed right now or not (I don't think so, thankfully)

@twmb twmb closed this Oct 6, 2022

ssorren commented Oct 14, 2022

TBH, this is a bit confusing. If we want to enrich the Record with data, why wouldn't we use something more generic like UserData any as opposed to a Context? Context implies some sort of flow control. Calls to Produce already accept a Context, so it's a bit confusing. What happens if Record.Context is canceled? Will the record still produce?

Not to mention, Contexts are stacks, and I cannot set Record.Context in a thread-safe way, since adding a value would look like this:

record.Context = context.WithValue(record.Context, k, v)

How would I lock the Record to ensure no one else is accessing record.Context? I'd have to wrap Record in another struct which has a lock. If I'm doing that, I don't need a Context on the record at all, I can just put this Context in the wrapper.

Seems to me, this could have been solved by the requester with a simple type like:

type RecordCtx struct {
  record *kgo.Record
  ctx context.Context
}

// safely set context values
func (rc RecordCtx) WithValue(k, v interface{}) RecordCtx {
  return RecordCtx{
    record: rc.record,
    ctx: context.WithValue(rc.ctx, k, v),
  }
}

It's already in now. Oh well.


twmb commented Oct 17, 2022

> why wouldn't we use something more generic like UserData any

Agreed, I had thought about this, but requiring type assertions to use what's in the arbitrary container would be a bit ugly. I know the same thing happens when people use WithValue and Value, but that's somewhat expected per the context API; the value API can provide roughly the same thing that an arbitrary interface could. An arbitrary interface is more flexible, but I somewhat hope that people aren't really stuffing arbitrary data into a record anyway, for the most part. This is just a workaround for what is now the ecosystem standard of using contexts, with many APIs using them when a map[key]value would do just fine.

> I can not set Record.Context in a thread safe

I think the intent here is that you, as the owner creating the record, set the context once, in the same way you set a key / value once.

> solved by the requester

That was my original hope too, but the problem comes from the consumer half of the API if you want to set the record's context as early as possible (in a hook). If you want to start tracking information ASAP in OnFetchRecordBuffered, the only way to do so is to provide a new field that can store information. After a lot of back and forth on different designs, an exposed Context field seems the most practical way to address, hopefully, all of the problems.

4 participants