
feat(replays): add recordings payload ItemType #1236

Merged: 16 commits into master, Jun 13, 2022

Conversation

@JoshFerge (Member) commented Apr 22, 2022

  • Adds an ItemType for replay recordings
  • Logic to set the ItemType based on the name of the file in multitype logic
  • Same flow as attachments, but sent to a different topic
  • A dedicated ItemType was chosen instead of another attachment type because we will likely diverge from the attachment logic, and this gives us more flexibility long term.

See the spec here for more background information:
https://www.notion.so/sentry/Session-Replay-V1-alpha-Ingest-Backend-ae068d1e1d514221b6c3ea2233f360f4


By submitting this pull request, I confirm that Sentry can use, modify, copy, and redistribute this contribution, under Sentry's choice of terms.

@@ -218,7 +220,12 @@ fn consume_item(
let file_name = content_disposition.as_ref().and_then(|d| d.get_filename());

if let Some(file_name) = file_name {
    let mut item = Item::new(ItemType::Attachment);
    let item_type = match file_name {
        REPLAY_RECORDING_FILENAME => ItemType::ReplayRecording,
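The diff above selects the item type from the multipart file name. A minimal self-contained sketch of that idea is below; the enum, the fallback arm, and the constant's value (`"rrweb"`) are stand-ins for illustration, not Relay's actual definitions.

```rust
// Hypothetical sketch: pick an ItemType based on the uploaded file name.
// The constant's value and the enum are assumptions, not Relay's real code.
const REPLAY_RECORDING_FILENAME: &str = "rrweb";

#[derive(Debug, PartialEq)]
enum ItemType {
    Attachment,
    ReplayRecording,
}

fn item_type_for(file_name: &str) -> ItemType {
    match file_name {
        // Replay recordings are routed to their own item type...
        REPLAY_RECORDING_FILENAME => ItemType::ReplayRecording,
        // ...everything else stays a plain attachment.
        _ => ItemType::Attachment,
    }
}

fn main() {
    assert_eq!(item_type_for("rrweb"), ItemType::ReplayRecording);
    assert_eq!(item_type_for("screenshot.png"), ItemType::Attachment);
}
```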
@JoshFerge (Member, Author):

There should only ever be one replay attachment per envelope; I'm not sure whether we should enforce that here or in the other places that currently accept N.

We could also make replay recording uploads a separate endpoint altogether.

Reviewer (Member):

For this you could return true from is_duplicate. What's the reason behind this constraint? It could be beneficial for clients with offline caching to send several replay payloads in a single envelope.
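The `is_duplicate` suggestion can be sketched as a simple check over the items already in the envelope. This is an illustration only, with simplified stand-in types rather than Relay's `Envelope`/`Item` API.

```rust
// Sketch of the is_duplicate idea: a second ReplayRecording item in the
// same envelope counts as a duplicate. Types are simplified stand-ins.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ItemType {
    Attachment,
    ReplayRecording,
}

fn is_duplicate(existing: &[ItemType], new_item: ItemType) -> bool {
    // Only one replay recording is allowed per envelope; attachments
    // and other item types may still repeat freely.
    new_item == ItemType::ReplayRecording && existing.contains(&ItemType::ReplayRecording)
}

fn main() {
    let items = [ItemType::ReplayRecording];
    assert!(is_duplicate(&items, ItemType::ReplayRecording));
    assert!(!is_duplicate(&items, ItemType::Attachment));
    assert!(!is_duplicate(&[], ItemType::ReplayRecording));
}
```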

assert message.error() is None

v = msgpack.unpackb(message.value(), raw=False, use_list=False)
assert v["type"] == "attachment_chunk", v["type"]
@JoshFerge (Member, Author):

We would probably want this type to be replay_chunk for cleanliness' sake, and attachment_type to be something else, but for now I don't think these matter.

@JoshFerge JoshFerge force-pushed the jferg/replay-recordings branch from 40aed88 to b45dea2 Compare April 22, 2022 01:42
@JoshFerge JoshFerge requested a review from a team April 22, 2022 01:43
@JoshFerge JoshFerge force-pushed the jferg/replay-recordings branch from b45dea2 to f5c7942 Compare April 22, 2022 01:44
@JoshFerge JoshFerge force-pushed the jferg/replay-recordings branch from f5c7942 to b910e80 Compare April 22, 2022 16:42
@JoshFerge JoshFerge marked this pull request as ready for review April 27, 2022 16:26
@JoshFerge JoshFerge requested a review from a team April 27, 2022 16:26
@JoshFerge JoshFerge changed the title feat(replays): add ItemType for recordings feat(replays): add recordings payload ItemType Apr 28, 2022
@jan-auer (Member) left a comment:

Thank you, initial review attached. We had two general thoughts when reading through this change:

  • Right now, this PR does not add any rate limiting capabilities and also does not track data categories. This is perfectly fine for the initial implementation. Once you want to check for rate limits or track outcomes, you'll also have to add a new DataCategory.
  • Since this is experimental, it would be good to add a feature flag that allows us to disable and drop replays. For an example of how to do this, see process_profiling:

fn process_profiles(&self, state: &mut ProcessEnvelopeState) {
    let profiling_enabled = state.project_state.has_feature(Feature::Profiling);
    // ...
}
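The same gating pattern applied to replays could look roughly like the following. This is a hedged sketch: `Feature`, `ProjectState`, `has_feature`, and the `SessionReplay` variant are simplified stand-ins modeled on the excerpt above, not Relay's actual API.

```rust
// Sketch of feature-flag gating for replays, with stand-in types.
// In the real code this would run inside envelope processing and
// remove replay items when the flag is off.
#[derive(PartialEq)]
enum Feature {
    Profiling,
    SessionReplay, // hypothetical flag name for illustration
}

struct ProjectState {
    features: Vec<Feature>,
}

impl ProjectState {
    fn has_feature(&self, feature: Feature) -> bool {
        self.features.contains(&feature)
    }
}

fn should_keep_replays(state: &ProjectState) -> bool {
    state.has_feature(Feature::SessionReplay)
}

fn main() {
    // A project with only profiling enabled drops replay items.
    let state = ProjectState { features: vec![Feature::Profiling] };
    assert!(!should_keep_replays(&state));
}
```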

Outdated review threads (all resolved): relay-server/src/envelope.rs, relay-server/src/actors/store.rs, relay-server/src/utils/multipart.rs
@@ -830,6 +835,7 @@ impl Default for TopicAssignments {
sessions: "ingest-sessions".to_owned().into(),
metrics: "ingest-metrics".to_owned().into(),
profiles: "profiles".to_owned().into(),
replay_payloads: "ingest-replay-payloads".to_owned().into(),
Reviewer (Member):

Before deploying, please ensure the topic is provisioned in the broker.

@@ -27,7 +27,7 @@ pub fn check_envelope_size_limits(config: &Config, envelope: &Envelope) -> bool
| ItemType::Security
| ItemType::RawSecurity
| ItemType::FormData => event_size += item.len(),
ItemType::Attachment | ItemType::UnrealReport => {
ItemType::Attachment | ItemType::UnrealReport | ItemType::ReplayPayload => {
Reviewer (Member):

Since replays will become their own thing (including for rate limiting), consider having separate limits for them. This relates to one of the comments in the store actor, since you may want to have smaller limits per replay item (such as 1MB). The limit for attachments is at 100MB right now.
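A per-item-type size limit, as suggested here, could be sketched like this. The 100 MB attachment cap and the 1 MB replay cap come from the comment above; the function shape itself is an illustration, not Relay's configuration API.

```rust
// Sketch: separate size caps per item type. The numbers are taken
// from the review comment (100 MB attachments, ~1 MB replay items);
// the enum and function are stand-ins for illustration.
enum ItemType {
    Attachment,
    ReplayRecording,
}

fn max_size_bytes(ty: &ItemType) -> usize {
    match ty {
        ItemType::Attachment => 100 * 1024 * 1024, // current attachment cap
        ItemType::ReplayRecording => 1024 * 1024,  // smaller cap for replays
    }
}

fn main() {
    assert!(max_size_bytes(&ItemType::ReplayRecording) < max_size_bytes(&ItemType::Attachment));
}
```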

Outdated review thread (resolved): tests/integration/test_replay_payloads.py

// This skips chunks for empty replay recordings. The consumer does not require chunks for
// empty replay recordings. `chunks` will be `0` in this case.
while offset < size {
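The chunking loop quoted above can be illustrated with a self-contained sketch: an empty payload skips the loop entirely, producing zero chunks, which matches the comment about the consumer tolerating `chunks == 0`. The chunk size here is arbitrary; the real code derives it from configuration.

```rust
// Sketch of the attachment-style chunking loop: split a payload into
// fixed-size chunks; an empty payload yields no chunks at all.
fn chunk_payload(payload: &[u8], chunk_size: usize) -> Vec<&[u8]> {
    let mut chunks = Vec::new();
    let mut offset = 0;
    let size = payload.len();
    // For an empty payload, offset == size == 0 and the loop never runs.
    while offset < size {
        let end = (offset + chunk_size).min(size);
        chunks.push(&payload[offset..end]);
        offset = end;
    }
    chunks
}

fn main() {
    assert_eq!(chunk_payload(&[], 4).len(), 0);
    assert_eq!(chunk_payload(&[1, 2, 3, 4, 5], 2).len(), 3);
}
```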
@JoshFerge (Member, Author):

these bits are copied from the attachment chunks produce function.

Reviewer (Member):

Thanks for calling this out. For now this is totally fine; once we have a third use case, we can still generalize the chunking implementation.

@JoshFerge JoshFerge force-pushed the jferg/replay-recordings branch from e35ea6c to 73845e0 Compare June 3, 2022 04:39

Ok(ChunkedAttachment {
id,
name: REPLAY_RECORDINGS_ATTACHMENT_NAME.to_owned(),
@JoshFerge (Member, Author):

I don't foresee a use for the name field for us, but it didn't seem worth defining our own type just to remove it.

Reviewer (Member):

Keeping some constant name here is a good solution to distinguish the message contents in other places 👍

If all you need here are id, chunks, and size, then you can also add those fields directly to your ReplayRecordingKafkaMessage, though.

@JoshFerge (Member, Author):

just went ahead and defined a separate type with just these 3 fields 👍🏼
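The dedicated three-field message type mentioned here might look something like the sketch below. Field names follow the thread (id, chunks, size); the struct name and field types are guesses for illustration only.

```rust
// Hypothetical shape of the dedicated Kafka message metadata type:
// just the three fields the thread says are needed, instead of
// reusing the full ChunkedAttachment.
struct ReplayRecordingMeta {
    id: String,   // recording identifier (type is an assumption)
    chunks: usize, // number of chunks produced; 0 for empty payloads
    size: usize,  // total payload size in bytes
}

fn main() {
    let meta = ReplayRecordingMeta {
        id: "example-id".to_owned(), // placeholder value
        chunks: 3,
        size: 4096,
    };
    assert_eq!(meta.chunks, 3);
    assert_eq!(meta.size, 4096);
}
```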

@JoshFerge (Member, Author):

Alright, ready for re-review. I added comments where I had questions or thoughts. The only other open question: is using event_id as the field in the envelope headers okay, or should we add replay_id as a top-level header field?

@JoshFerge (Member, Author):

Hmm, not sure why the integration test is failing; it passes on my machine.

@JoshFerge JoshFerge requested a review from jan-auer June 3, 2022 04:56
@JoshFerge JoshFerge force-pushed the jferg/replay-recordings branch from 73845e0 to 3e3882c Compare June 3, 2022 05:03
@jan-auer (Member) left a comment:

Thank you for the thorough work and applying all feedback. This looks good and is ready to merge from my end. See minor stylistic comments below.

Outdated review threads (resolved): CHANGELOG.md, relay-server/src/actors/envelopes.rs




Outdated review thread (resolved): relay-server/src/actors/store.rs
* master:
  ref(metrics): Stop logging relative bucket size (#1302)
  fix(metrics): Rename misnamed aggregator option (#1298)
  fix(server): Avoid a panic in the Sentry middleware (#1301)
  build: Update dependencies with known vulnerabilities (#1294)
  fix(metrics): Stop logging statsd metric per project key (#1295)
  feat(metrics): Limits on bucketing cost in aggregator [INGEST-1132] (#1287)
  fix(metrics): Track memory footprint more accurately (#1288)
  build(deps): Bump dependencies (#1293)
  feat(aws): Add relay-aws-extension crate which implements AWS extension as an actor (#1277)
  fix(meta): Update codeowners for the release actions (#1286)
  feat(metrics): Track memory footprint of metrics buckets (#1284)
@JoshFerge JoshFerge merged commit 3c49bca into master Jun 13, 2022
@JoshFerge JoshFerge deleted the jferg/replay-recordings branch June 13, 2022 16:24