ref(sampling): Refactor dynamic sampling #2514

TBS1996 · 2023-09-14T04:30:26Z

Got rid of the old style of SamplingResult; we now save the entire sampling match along with a flag indicating whether it's kept.
SamplingResult now has a Pending variant, which indicates that dynamic sampling has yet to be run. The alternative, Option, was too vague for my taste.
Introduced SamplingEvaluator, a state machine for matching sampling rules.
Removed many responsibilities from the sampling crate to make it more generic and reusable.
More general refactoring, including simplified APIs for many functions.
Moved dynamic sampling tests from lib.rs to wherever the functionality they test resides.
Deleted many tests that relied on the old APIs, or tested responsibilities that the sampling crate no longer has.
Created many new tests.
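
As a rough illustration of the first two points, here is a minimal sketch of what such a result type could look like. The variant names and should_keep follow the snippet quoted later in this thread; the fields of SamplingMatch, its constructor, and is_pending are assumptions made for illustration, not the crate's actual definitions.

```rust
/// Hypothetical, simplified stand-in for the real match type.
#[derive(Debug, Clone, PartialEq)]
pub struct SamplingMatch {
    /// Sample rate of the matched rule (assumed field).
    sample_rate: f64,
    /// Flag recording whether the event is kept under this match.
    should_keep: bool,
}

impl SamplingMatch {
    /// Hypothetical constructor used only by this sketch.
    pub fn new(sample_rate: f64, should_keep: bool) -> Self {
        Self { sample_rate, should_keep }
    }

    pub fn sample_rate(&self) -> f64 {
        self.sample_rate
    }

    pub fn should_keep(&self) -> bool {
        self.should_keep
    }
}

/// Outcome of dynamic sampling for one event.
#[derive(Debug, Clone, PartialEq, Default)]
pub enum SamplingResult {
    /// A rule matched; the whole match is stored, not just keep/drop.
    Match(SamplingMatch),
    /// Sampling ran, but no rule matched.
    NoMatch,
    /// Sampling has not run yet. An `Option::None` could not
    /// distinguish this state from "ran without a match".
    #[default]
    Pending,
}

impl SamplingResult {
    /// Returns `true` if the event should be kept.
    pub fn should_keep(&self) -> bool {
        match self {
            SamplingResult::Match(m) => m.should_keep(),
            // Without a matching rule (or before sampling ran), keep the event.
            SamplingResult::NoMatch | SamplingResult::Pending => true,
        }
    }

    /// Hypothetical helper: `true` while sampling has not run.
    pub fn is_pending(&self) -> bool {
        matches!(self, SamplingResult::Pending)
    }
}
```

Compared with Option<SamplingMatch>, the explicit Pending variant forces callers to say which of the two "empty" states they mean.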

#skip-changelog

relay-sampling/src/config.rs

relay-sampling/src/dsc.rs

relay-sampling/src/evaluation.rs

relay-sampling/src/lib.rs

TBS1996 · 2023-09-23T15:37:15Z

relay-server/src/actors/processor.rs

-
-        let service = create_test_processor(config);
-
+    #[test]


changing the signature of compute_sampling_decision made this test a lot simpler

relay-server/src/utils/dynamic_sampling.rs

relay-server/src/lib.rs

TBS1996 · 2023-09-24T09:16:09Z

relay-server/src/actors/processor.rs

+        if (sampling_config.is_none() || event.is_none())
+            && (root_sampling_config.is_none() || dsc.is_none())
+        {
+            return SamplingResult::NoMatch;
+        }


Early return.

I don't like that the type system doesn't represent how sampling_config and event are related, or how root_sampling_config and dsc are connected.

Perhaps some new struct which encapsulates both the configuration and rules, and the instance to match on. Would be a separate PR of course.
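
A minimal sketch of that idea, assuming a hypothetical SamplingInput type that can only be built when both the configuration and the instance to match on are present. All names here (SamplingInput, has_nothing_to_match, and the placeholder SamplingConfig, Event, and Dsc types) are invented for illustration and are not part of Relay.

```rust
/// Hypothetical pairing of a sampling configuration with the instance
/// its rules are matched against.
pub struct SamplingInput<'a, C, T> {
    pub config: &'a C,
    pub instance: &'a T,
}

impl<'a, C, T> SamplingInput<'a, C, T> {
    /// Returns `Some` only if both halves are available, so the
    /// relationship between them is encoded in the type.
    pub fn new(config: Option<&'a C>, instance: Option<&'a T>) -> Option<Self> {
        let (config, instance) = config.zip(instance)?;
        Some(Self { config, instance })
    }
}

// Placeholder types standing in for the real ones.
pub struct SamplingConfig;
pub struct Event;
pub struct Dsc;

/// Equivalent of the early-return condition above, expressed on pairs:
/// there is nothing to match if neither pair could be constructed.
pub fn has_nothing_to_match(
    event_input: &Option<SamplingInput<'_, SamplingConfig, Event>>,
    root_input: &Option<SamplingInput<'_, SamplingConfig, Dsc>>,
) -> bool {
    event_input.is_none() && root_input.is_none()
}
```

With such a type, the four separate is_none() checks collapse into two, and a config can no longer be passed around without its instance.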

TBS1996 · 2023-09-24T09:17:48Z

relay-server/src/actors/processor.rs

-            state.envelope().dsc(),
-            state.event.value(),
-        );
+    fn compute_sampling_decision(


removed the &self because it was only used to check if processing is enabled, which also made it way easier to unit test.

Not sure if it should still be under EnvelopeProcessorService though, how about we move it to utils like the utils::get_sampling_result?

I'd suggest reverting this change for three reasons:

1. The parameter list of this function is long. That was also the reason why ProcessEnvelopeState was introduced in the first place, to pass processing information together.

2. We will soon need more fields from the state, such as access to Redis as well as the received timestamp (it is currently a bug that this function uses Utc::now()).

3. Associated functions like this one that neither work with the type that they are declared on, nor return it, are unidiomatic.

Alright, I'll revert it, although personally I prefer it like this. My thinking was that we would clearly delineate the process_state function, which runs entirely on side effects, and have it call pure functions, so that we avoid passing state any further down from process_state.

I'd say the long parameter list might tell us something about this function that simply passing in state would mask. Passing in state here feels a lot like using global variables.

In the function (as expressed by the first early return) you can see how event and sampling_config need each other, and likewise dsc and root_sampling_config. Perhaps we could explore a newtype that encapsulates the "Getter" and a sampling config in a later PR?

3: I agree it's not idiomatic, so I think it would be a good fit for utils in the same crate.

OK, so reverting it complicates many tests I wrote that relied on its functional approach: you have to start the service and create a mock ProcessEnvelopeState and ProjectState just to check whether a field on ProcessEnvelopeState has changed.

Could you verify again whether you think I should revert this? My suggestion would be to keep it functional, move it to utils, and let the long param list stand as a sign for us to think about refactoring in the future (rather than hiding the params in state).

Makes sense. Given this is an internal function of the processor, we can easily revisit this at a later point. There's limited exposure of this API. Let's move forward.
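
To make the trade-off in this thread concrete, here is a minimal sketch of the functional style under discussion: every input, including the received timestamp, is an explicit argument, so a test can call the function directly without starting a service or reading the clock. Rule, Decision, compute_decision, and the random argument are hypothetical names chosen for this sketch; they are not Relay's API.

```rust
use std::time::SystemTime;

/// Hypothetical keep/drop outcome.
#[derive(Debug, PartialEq)]
pub enum Decision {
    Keep,
    Drop,
}

/// Hypothetical rule with an optional expiry time.
pub struct Rule {
    pub sample_rate: f64,
    pub valid_until: Option<SystemTime>,
}

/// Pure function: the result depends only on the arguments.
///
/// `received_at` is passed in rather than read from the system clock,
/// and `random` (assumed to be in `0.0..1.0`) is supplied by the caller.
pub fn compute_decision(
    rule: Option<&Rule>,
    received_at: SystemTime,
    random: f64,
) -> Decision {
    let Some(rule) = rule else {
        // No rule at all: keep the event.
        return Decision::Keep;
    };

    // An expired rule is treated as if it did not exist.
    if let Some(valid_until) = rule.valid_until {
        if valid_until <= received_at {
            return Decision::Keep;
        }
    }

    if random < rule.sample_rate {
        Decision::Keep
    } else {
        Decision::Drop
    }
}
```

The cost is visible in the signature: each new dependency becomes another parameter, which is the concern raised in the first review point.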

relay-server/src/utils/dynamic_sampling.rs

jjbayer

LGTM, though a refactor like this makes it hard to catch logical errors. I'll leave final review to @jan-auer.

relay-sampling/src/condition.rs

jjbayer · 2023-09-25T09:04:12Z

relay-sampling/src/evaluation.rs

+    /// Returns true if no rule have matched.
+    pub fn is_no_match(&self) -> bool {
+        !self.is_match()
+    }


IMO there's no benefit in writing if foo.is_no_match() over if !foo.is_match().

Suggested change

-    /// Returns true if no rule have matched.
-    pub fn is_no_match(&self) -> bool {
-        !self.is_match()
-    }

Hm, I took inspiration from is_some()/is_none() in std. I'd say it makes it a little more readable, but I see your point too.

relay-sampling/src/config.rs

jan-auer · 2023-09-26T08:54:20Z

relay-sampling/src/condition.rs

+
+    use crate::dsc::TraceUserContext;
+    use crate::tests::{and, eq, glob, not, or};
+    use crate::DynamicSamplingContext;


NB: it's great you're moving these tests now. I hadn't moved them previously since they depend on the DSC, and these rules should be independent of implementations of the Getter trait.

In a follow-up, we'll move this module into another crate and then also update the tests.

I was debating the idea of making a dummy Getter struct, just to make it more clear that this crate shouldn't care about implementations, but since we want to keep DynamicSamplingContext in this crate then it's simpler just to use that. All the tests that used Event I've replaced with dsc.
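
For reference, a dummy implementation along the lines considered here might look like the sketch below. The Getter trait shown is a simplified, hypothetical stand-in (string values only) and not the real trait's signature; MockGetter and EqCondition are likewise invented for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for a `Getter`-style trait:
/// looks up a field by path and returns its value if present.
pub trait Getter {
    fn get_value(&self, path: &str) -> Option<&str>;
}

/// Dummy instance backed by a map, so condition tests do not depend
/// on any concrete event or DSC type.
pub struct MockGetter {
    fields: HashMap<String, String>,
}

impl MockGetter {
    pub fn new(pairs: &[(&str, &str)]) -> Self {
        let fields = pairs
            .iter()
            .map(|(k, v)| (k.to_string(), v.to_string()))
            .collect();
        Self { fields }
    }
}

impl Getter for MockGetter {
    fn get_value(&self, path: &str) -> Option<&str> {
        self.fields.get(path).map(String::as_str)
    }
}

/// Hypothetical equality condition that works with any `Getter`.
pub struct EqCondition {
    pub name: String,
    pub value: String,
}

impl EqCondition {
    pub fn matches<G: Getter + ?Sized>(&self, instance: &G) -> bool {
        instance.get_value(&self.name) == Some(self.value.as_str())
    }
}
```

Because EqCondition only sees the trait, the same test would pass unchanged for any other implementation of it.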

relay-sampling/src/condition.rs

relay-sampling/src/evaluation.rs

jan-auer · 2023-09-26T09:23:44Z

relay-server/src/actors/processor.rs

relay-server/src/lib.rs

relay-server/src/utils/dynamic_sampling.rs

jan-auer · 2023-09-26T09:35:32Z

relay-server/src/utils/dynamic_sampling.rs

+    let sampling_result: SamplingResult = evaluator.match_rules(dsc.trace_id, dsc, rules).into();
+    Some(sampling_result.should_keep())


Instead of converting to SamplingResult, consider matching the RuleMatchingState directly.

Suggested change

-    let sampling_result: SamplingResult = evaluator.match_rules(dsc.trace_id, dsc, rules).into();
-    Some(sampling_result.should_keep())
+    Some(match evaluator.match_rules(dsc.trace_id, dsc, rules) {
+        RuleMatchingState::SamplingMatch(m) => m.should_keep(),
+        _ => false,
+    })

Hmm, I'm not sure it's obvious that a lack of match should result in true here. Case in point: your example returns false whereas it should be true. Maybe it's better to use the implementation in SamplingResult for that?

if we return the bool manually we're essentially re-implementing this behaviour:

    impl From<Evaluation> for SamplingResult {
        fn from(value: Evaluation) -> Self {
            match value {
                Evaluation::Matched(sampling_match) => Self::Match(sampling_match),
                Evaluation::Continue(_) => Self::NoMatch,
            }
        }
    }

    /// Returns `true` if the event should be kept.
    pub fn should_keep(&self) -> bool {
        match self {
            SamplingResult::Match(sampling_match) => sampling_match.should_keep(),
            // If no rules matched on an event, we want to keep it.
            SamplingResult::NoMatch => true,
            SamplingResult::Pending => true,
        }
    }

in the original implementation, it turns into Keep if there's no match in fn determine_from_sampling_match, which becomes 'true' here:

    let sampled = match sampling_result {
        SamplingResult::Keep => true,
        SamplingResult::Drop(_) => false,

You're absolutely right, the false was an outright typo in my suggestion :) I was slightly concerned with the additional conversion into a sampling result, however my own typo is a good example of why this sort of deduplication should exist.

Let's keep it. A suggestion on how to format it, though:

    let evaluation = evaluator.match_rules(dsc.trace_id, dsc, rules);
    Some(SamplingResult::from(evaluation).should_keep())

nice, thanks! that looks a lot nicer than what I wrote
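
The outcome of this thread can be sketched as a self-contained example. The From impl and should_keep mirror the code quoted above; SamplingMatch, the Evaluator placeholder, and the keep_decision helper are simplified assumptions made so the snippet compiles on its own.

```rust
/// Hypothetical, simplified match type.
pub struct SamplingMatch {
    should_keep: bool,
}

impl SamplingMatch {
    /// Hypothetical constructor used only by this sketch.
    pub fn new(should_keep: bool) -> Self {
        Self { should_keep }
    }

    pub fn should_keep(&self) -> bool {
        self.should_keep
    }
}

/// Placeholder for the evaluator state carried by `Continue`.
pub struct Evaluator;

/// Result of running the evaluator over a set of rules.
pub enum Evaluation {
    Matched(SamplingMatch),
    Continue(Evaluator),
}

pub enum SamplingResult {
    Match(SamplingMatch),
    NoMatch,
    Pending,
}

impl From<Evaluation> for SamplingResult {
    fn from(value: Evaluation) -> Self {
        match value {
            Evaluation::Matched(sampling_match) => Self::Match(sampling_match),
            Evaluation::Continue(_) => Self::NoMatch,
        }
    }
}

impl SamplingResult {
    /// Single place that decides what "no match" means: keep.
    pub fn should_keep(&self) -> bool {
        match self {
            SamplingResult::Match(m) => m.should_keep(),
            SamplingResult::NoMatch | SamplingResult::Pending => true,
        }
    }
}

/// Hypothetical helper mirroring the agreed formatting: convert first,
/// then ask the result, instead of re-implementing the fallback inline.
pub fn keep_decision(evaluation: Evaluation) -> bool {
    SamplingResult::from(evaluation).should_keep()
}
```

Routing every caller through should_keep means the "no match keeps the event" rule lives in one place, which is exactly what the `_ => false` typo would have bypassed.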

iker-barriocanal

Have we considered splitting this refactor into smaller ones so that it's easier to review and identify regressions?

jan-auer · 2023-09-26T12:05:11Z

@iker-barriocanal next time we should try to make some of these changes in smaller increments. Now that review has almost concluded, we can keep the PR.

TBS1996 · 2023-09-26T12:06:52Z

> Have we considered splitting this refactor into smaller ones so that it's easier to review and identify regressions?

I regret not having a separate PR for moving the tests, but other than that, the reason it's so huge is that the new implementation changed the API used by the tests, and there were a lot of tests here.

at this point it's easier to just go forward with it.

…tor/ref_ds

TBS1996 added 6 commits September 13, 2023 22:52

wip

c5fc6b1

wip

0f862f4

wip

7421f38

wip

3c3dc58

wip

d2b0691

wip

ba725fd

TBS1996 mentioned this pull request Sep 14, 2023

fix(cabi): Remove dynamic sampling abi #2515

Merged

TBS1996 added a commit that referenced this pull request Sep 14, 2023

fix(cabi): Remove dynamic sampling abi (#2515)

6921d70

1. We are refactoring dynamic sampling in #2514, making these tests outdated. 2. It references a dynamic sampling function I'd like to make private.

TBS1996 added 9 commits September 14, 2023 14:32

wip

e3be3ac

wi

d820657

wip

1103266

fix rest of the tests

3d5bc18

wip

2e73e6f

turn samplingmatch into enum

d394dd8

fix

e33f654

Merge branch 'master' into tor/ref_ds

836dbc4

wip

0728311

TBS1996 commented Sep 14, 2023

relay-sampling/src/config.rs

TBS1996 added 7 commits September 15, 2023 01:43

wip

5ec9cbf

fix

56e934f

wip

5af51d7

fix

b60a07d

hopefully fix tests

7d7c8c8

disable error msg

11b3dc8

wip

0d4effa

TBS1996 commented Sep 15, 2023

relay-sampling/src/config.rs

TBS1996 commented Sep 15, 2023

relay-sampling/src/dsc.rs

TBS1996 added 2 commits September 15, 2023 10:46

wip

da4bf88

ref samplingmode

1b0f3ff

TBS1996 commented Sep 15, 2023

relay-sampling/src/evaluation.rs

TBS1996 commented Sep 23, 2023

relay-sampling/src/lib.rs

wip

4b244c8

TBS1996 commented Sep 23, 2023

TBS1996 added 2 commits September 24, 2023 08:57

wip

9d40dd5

wip

1d53983