Skip to content
This repository has been archived by the owner on Oct 3, 2023. It is now read-only.

Add details about Sampler interface and sampling decision. #186

Open
wants to merge 3 commits into
base: master
Choose a base branch
from
Open
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
26 changes: 25 additions & 1 deletion trace/Sampling.md
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,31 @@ exporters.

The Sampling bit is always set only at the start of a Span, using a `Sampler`

### What kind of samplers does OpenCensus support?
### Sampler interface
A `Sampler` is an interface that returns a `SamplingDecision` based on given sampling parameters.

#### Sampling parameters
Based on the language the sampling parameters can be embedded into an object called
`SamplingParameters` or explicitly passed as arguments. The set of sampling parameters SHOULD
include at least:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the mechanism for sampler to set a sampling flag for downstream components? Or sampling score be added in tracestate? Or who's responsibility is it? Should propagator deal with this? Worth mentioning here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel it is a race condition here, what is the order of these events:

  • Modify the Tracestate
  • Call sampler.

If we call sampler first, we cannot use the current Tracestate (only parent Tracestate) in the decision. If we create the Tracestate first we cannot modify it after the Sampler is called (or I guess we can create it again).

What is your thought on the order of these two events?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a good document explaining ordering of calls? @lmolkova, you had a picture I remember, can you send it to me pls?

Sampler may take tracestate object as an argument and modify when wants to.

One may also need to read and modify tracestate before span is created. See #131. This makes matter even more complicated

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lmolkova any update on this?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have any pictures about it.

My take on this:

tracing system should have a chance to modify tracestate before sampler makes a decision to support scenarios where sampling is controlled with tracestate (mentioned by Sergey).

I don't see an obvious scenario when tracestate is modified after sampling decision is made or span is started. Do we really want to support this?
Tracing system is the only thing that should touch tracestate and it should have just one opportunity to do so: after span is created, so I don't see how we can have a race condition here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I think what I called "race condition" is not the best term. Here are the sequence of events that I can see in time order:

  1. Tracestate is modified before the sampler gets called as you suggested.
  2. Sampler gets called.
  3. Tracestate is modified with sampling results - I heard from @SergeyKanzhelev that here some systems may want to modify the trace state with results from the sampling

Do I miss something?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also often accidentally say "race condition" when mean "chicken egg"

Though I wonder if tracestate needs to be exposed in-process or not. For example, if it is only expected to be written on remote links, is it necessary for us to write our own here? For example, it seems necessary to write upstream, but possibly not to write our own trace state until the edge where it is exported to headers..

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@SergeyKanzhelev could you please elaborate on the scenarios where sampler needs to update tracestate after sampling decision? (once you are back from vacation)

what's the mechanism for sampler to set a sampling flag for downstream components? Or sampling score be added in tracestate?

Is sampling score a result of sampling decision or parameter sampler takes or both? This is a chichen-egg problem for sure.

@adriancole one of scenarios for tracestate requires in-process exposure and modification. Some systems want to keep (and update) legacy ids at least for a while. assuming we can only access tracestate on the boundaries, all spans in the middle (if any) will not have legacy ids and it'll be hard to discover parents.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lmolkova @SergeyKanzhelev any conclusion on this? I would like to finish this document.

* The parent `SpanContext`.
* The current `TraceId`.
* The current `SpanId`.
* The current `Tracestate`.
* The current `Span`'s name.
* The current `Span`'s kind.

#### SamplingDecision
`SamplingDecision` is the object returned by the `Sampler` and SHOULD include:
* A boolean value that represents the sampling decision.
* A list of `Attribute`'s that are added to the newly created `Span`.
bogdandrutu marked this conversation as resolved.
Show resolved Hide resolved

The list of returned `Attribute`'s MAY include:
* An `Attribute` with a key `sampler.type` to record the type of `Sampler` used.
* An `Attribute` with a key `sampler.param` to record the `Sampler` parameters (e.g. probability).
It can be ignored for `AlwaysSample` or `NeverSample`.

### What type of samplers does OpenCensus support?
* `AlwaysSample` - sampler that makes a "yes" decision every time.
* `NeverSample` - sampler that makes a "no" decision every time.
* `Probability` - sampler that tries to uniformly sample traces with a given probability. When
Expand Down