Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(dynamic-sampling): Add possibility to run dynamic sampling from sentry-relay #2091

Merged
merged 16 commits into from
May 8, 2023

Conversation

iambriccardo
Copy link
Member

@iambriccardo iambriccardo commented May 3, 2023

This PR adds a new function to sentry-relay called run_dynamic_sampling which will run dynamic sampling on a two SamplingConfigs.

@iambriccardo iambriccardo changed the title feat(dynamic-sampling): Add possibility to run dynamic sampling from feat(dynamic-sampling): Add possibility to run dynamic sampling from sentry-relay May 3, 2023
@iambriccardo iambriccardo requested a review from jjbayer May 4, 2023 06:14
@iambriccardo iambriccardo marked this pull request as ready for review May 4, 2023 06:14
@iambriccardo iambriccardo requested a review from a team May 4, 2023 06:14
Comment on lines 361 to 363
let sampling_config = serde_json::from_str::<SamplingConfig>((*sampling_config).as_str())?;
let root_sampling_config =
serde_json::from_str::<SamplingConfig>((*root_sampling_config).as_str())?;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have tried to use the ? operator for curiosity and I don't understand why the compiler allows me to return a Result when the type is RelayStr.

CHANGELOG.md Outdated
@@ -13,6 +13,7 @@
- Scrub sensitive keys (`passwd`, `token`, ...) in Replay recording data. ([#2034](https://github.com/getsentry/relay/pull/2034))
- Add document_uri to csp filter. ([#2059](https://github.com/getsentry/relay/pull/2059))
- Store `geo.subdivision` of the end user location. ([#2058](https://github.com/getsentry/relay/pull/2058))
- Add new FFI function for running dynamic sampling. ([#2091](https://github.com/getsentry/relay/pull/2091))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please move this to py/CHANGELOG.md

@@ -262,3 +263,25 @@ def validate_project_config(config, strict: bool):
error = decode_str(raw_error, free=True)
if error:
raise ValueError(error)


def run_dynamic_sampling(sampling_config, root_sampling_config, dsc, event):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we take an entire envelope as interface, instead? In Relay's pipeline, DS also reads information from other envelope headers.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to keep it stupid simple for now, in case we want, we can extend it to have an envelope as well.

Comment on lines 277 to 287
return json.loads(
decode_str(
rustcall(
lib.run_dynamic_sampling,
encode_str(sampling_config),
encode_str(root_sampling_config),
encode_str(dsc),
encode_str(event),
)
)
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add free=True to decode_str, since the owned string returned by FFI needs to be released. You can make this more readable like so:

Suggested change
return json.loads(
decode_str(
rustcall(
lib.run_dynamic_sampling,
encode_str(sampling_config),
encode_str(root_sampling_config),
encode_str(dsc),
encode_str(event),
)
)
)
result_json = rustcall(
lib.run_dynamic_sampling,
encode_str(sampling_config),
encode_str(root_sampling_config),
encode_str(dsc),
encode_str(event),
)
return json.loads(decode_str(result_json, free=True))

@@ -283,3 +283,268 @@ def test_validate_project_config():
with pytest.raises(ValueError) as e:
sentry_relay.validate_project_config(json.dumps(config), strict=True)
assert str(e.value) == 'json atom at path ".foobar" is missing from rhs'


def test_run_dynamic_sampling_with_valid_params_and_match():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests are quite elaborate and will have to change every time we change something about sampling internals, particularly because they check the contents of the returned rules. Should we instead make more general checks, such as assert "merged_sampling_configs" in result and leave the rest to the DS unit tests in Rust?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a very good idea yes!

use std::cmp::Ordering;
use std::ffi::CStr;
use std::os::raw::c_char;
use std::slice;

use crate::core::{RelayBuf, RelayStr};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please order imports according to the style guide. This concerns the chrono and crate imports.

serde_json::from_str::<SamplingConfig>((*root_sampling_config).as_str())?;
// We can optionally accept a dsc and event.
let dsc = serde_json::from_str::<DynamicSamplingContext>((*dsc).as_str());
let event = serde_json::from_str::<EphemeralEvent>((*event).as_str());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of parsing some synthetic EphemeralEvent, you should just load an actual Annotated<Event> directly:

Suggested change
let event = serde_json::from_str::<EphemeralEvent>((*event).as_str());
let event = Annotated::<Event>::from_json((*event).as_str())?;
let event = event.value().ok_or(<some error>)?;

Comment on lines 356 to 359
sampling_config: *const RelayStr,
root_sampling_config: *const RelayStr,
dsc: *const RelayStr,
event: *const RelayStr,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can pass all of these as references and then don't have to worry about dereferencing the pointer below. This is something we've not done consistently in the codebase, but we should move towards:

Suggested change
sampling_config: *const RelayStr,
root_sampling_config: *const RelayStr,
dsc: *const RelayStr,
event: *const RelayStr,
sampling_config: &RelayStr,
root_sampling_config: &RelayStr,
dsc: &RelayStr,
event: &RelayStr,

@iambriccardo iambriccardo requested a review from jan-auer May 5, 2023 08:31
Comment on lines 6 to 7
use anyhow::Context;
use chrono::Utc;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please move these two imports to the next import group (above once_cell).

@iambriccardo iambriccardo merged commit 6e5bab1 into master May 8, 2023
@iambriccardo iambriccardo deleted the riccardo/feat/add-sampling-to-relay-server branch May 8, 2023 08:42
jan-auer added a commit that referenced this pull request May 9, 2023
* master: (178 commits)
  ref(crons): Allow `_` in monitor slugs (#2100)
  fix(span-metrics): Scrub subdomains in metric tags (#2103)
  feat(normalization): Scrub span description URLs (#2095)
  feat(dynamic-sampling): Add possibility to run dynamic sampling from `sentry-relay` (#2091)
  ref(dynamic-config): Remove transaction metrics allowlist from Sentry (#2092)
  feat(protocol): Support old effective-directory format (#2048)
  ref(spans): Add more tags to span metrics (#2090)
  fix(server): Support HTTP half-close connections (#2089)
  ref(spans): Use transactions namespace with `span` prefix (#2087)
  ref(spans): Update span metrics feature flag name (#2086)
  feat(metrics): Extract more metrics from spans (#2080)
  build: Bump tempfile to remove flagged dependency (#2085)
  feat(filter): Add document_uri csp filter (#2059)
  build: Use jemalloc on Linux (#2084)
  ci(gha): add debugging for "Couldn'\''t write tracker file" issue (#2081)
  feat(crons): Pass-through minimal `sdk` field (#2073)
  fix(quota): Parse large limits (#2079)
  instr(server): More details on unexpected envelope drop (#2077)
  fix(ci): Skip flaky integration test (#2076)
  feat(dynamic-sampling): Add support for `trace.replay_id` in matching rules (#2070)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants