ref(event): Add correct limit and validation to distribution name [INGEST-1615] #1556
Even though the `max_chars` limit on `dist` was set to 200, we never validated that length when normalizing the event. This change sets the correct limit (64 chars), matching the Sentry side (the field in the DB is varchar(64)), and adds validation with proper error generation if the value is longer than the limit.
This PR is actually a bugfix; please add an entry to the changelog.
relay-general/src/store/normalize.rs
Outdated
*distribution = Annotated::from_error(
    Error::new(ErrorKind::ValueTooLong),
    Some(Value::String(val.to_owned())),
)
Instead of creating a new `Annotated` object, does it make sense to just add the error to the meta? Something like the following:
-*distribution = Annotated::from_error(
-    Error::new(ErrorKind::ValueTooLong),
-    Some(Value::String(val.to_owned())),
-)
+distribution.meta_mut().add_error(...);
In this case, you'd still have the initial string there which is too long, so I'm not sure if this suggestion makes sense.
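To make the trade-off in this suggestion concrete, here is a minimal sketch using simplified stand-in types. `Annotated`, `Meta`, `replace_with_error`, and `add_error_only` below are hypothetical and only loosely modelled on the names in the diff; they are not Relay's real definitions.

```rust
/// Simplified, hypothetical stand-in for the metadata attached to a field.
#[derive(Debug, Default, PartialEq)]
struct Meta {
    errors: Vec<String>,
    original_value: Option<String>,
}

/// Simplified, hypothetical stand-in for an annotated string field.
#[derive(Debug, PartialEq)]
struct Annotated {
    value: Option<String>,
    meta: Meta,
}

/// Option 1: build a fresh field. The value leaves the payload and is
/// preserved only in the metadata, next to the error.
fn replace_with_error(field: &mut Annotated, error: &str) {
    let original = field.value.take();
    *field = Annotated {
        value: None,
        meta: Meta {
            errors: vec![error.to_string()],
            original_value: original,
        },
    };
}

/// Option 2: only record the error. The too-long value stays in place.
fn add_error_only(field: &mut Annotated, error: &str) {
    field.meta.errors.push(error.to_string());
}
```

With the second option the error is recorded, but the oversized string would still be sent downstream, which is the concern raised above.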
I've rewritten this using the `apply` method.
relay-general/src/store/normalize.rs
Outdated
        Some(Value::String(val.to_owned())),
    )
} else if trimmed != val {
    *val = trimmed.to_string()
I'm curious, why do you assign the new value this way using references, instead of something like `distribution.set_value(...)`? Same question for `*dist = None` above.
relay-general/src/store/normalize.rs
Outdated
@@ -90,20 +90,25 @@ pub fn is_valid_platform(platform: &str) -> bool {
     VALID_PLATFORMS.contains(&platform)
 }

-pub fn normalize_dist(dist: &mut Option<String>) {
-    let mut erase = false;
+pub fn normalize_dist(distribution: &mut Annotated<String>) {
I like this change.
relay-general/src/store/normalize.rs
Outdated
 normalize_dist(&mut dist);
-assert_eq!(dist.unwrap(), ""); // Not sure if this is what we want
+assert_eq!(dist.value(), Some(&"".to_string())); // Not sure if this is what we want
Normalization should be idempotent. If we make another normalization call, `dist` is then `None`. This must not happen: we're running multiple layers of Relays, and debugging can get complicated.
I'd also take the chance and make a decision with regard to that comment. If the resulting string after trimming is empty, `dist` should be `None`.
agreed and fixed this
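For reference, the idempotency concern can be checked with a small std-only sketch. All names here are hypothetical: `normalize_two_pass` is an illustrative variant that tests for emptiness before trimming (so whitespace-only input needs two calls to reach `None`), and `normalize_one_pass` trims first and maps an empty result to `None` in the same call. Neither is the code in this PR.

```rust
/// Hypothetical variant: checks for emptiness *before* trimming, so "  "
/// becomes "" on the first call and `None` only on the second.
fn normalize_two_pass(dist: &mut Option<String>) {
    if dist.as_deref() == Some("") {
        *dist = None;
    } else if let Some(val) = dist {
        *val = val.trim().to_string();
    }
}

/// Trims first and maps an empty result to `None` in the same call.
fn normalize_one_pass(dist: &mut Option<String>) {
    *dist = dist
        .take()
        .map(|val| val.trim().to_string())
        .filter(|val| !val.is_empty());
}

/// Returns true if applying `f` twice gives the same result as applying it once.
fn is_idempotent_on(f: fn(&mut Option<String>), input: Option<&str>) -> bool {
    let mut once = input.map(str::to_string);
    f(&mut once);
    let mut twice = once.clone();
    f(&mut twice);
    once == twice
}
```

A check like `is_idempotent_on(normalize_one_pass, Some("  "))` could be turned into a unit test so that a regression shows up before it reaches a multi-layer Relay setup.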
let trimmed = dist.trim();
if trimmed.is_empty() {
    return Err(ProcessingAction::DeleteValueHard);
} else if bytecount::num_chars(trimmed.as_bytes()) > MaxChars::Distribution.limit() {
    meta.add_error(Error::new(ErrorKind::ValueTooLong));
    return Err(ProcessingAction::DeleteValueSoft);
This is the same behaviour as for the `release` field.
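The decision order in the snippet above (trim, then empty check, then length check) can be sketched without any Relay types. This is an assumption-laden illustration: `DistOutcome` and `check_dist` are hypothetical names, the limit of 64 comes from the PR description, and `chars().count()` stands in for `bytecount::num_chars`.

```rust
/// Maximum number of characters for `dist`, per the PR description
/// (the DB column is varchar(64)).
const MAX_DIST_CHARS: usize = 64;

/// Hypothetical stand-in for the processing result; not Relay's real API.
#[derive(Debug, PartialEq)]
enum DistOutcome {
    /// Keep the trimmed value.
    Keep(String),
    /// Empty after trimming: drop the value without recording an error.
    DeleteHard,
    /// Too long: drop the value and record an error for the user.
    DeleteSoft { error: &'static str },
}

/// Trim first, then validate emptiness and length, in that order.
fn check_dist(raw: &str) -> DistOutcome {
    let trimmed = raw.trim();
    if trimmed.is_empty() {
        DistOutcome::DeleteHard
    } else if trimmed.chars().count() > MAX_DIST_CHARS {
        // Count characters, not bytes, so multi-byte input is not penalized.
        DistOutcome::DeleteSoft { error: "value_too_long" }
    } else {
        DistOutcome::Keep(trimmed.to_string())
    }
}
```

Counting characters rather than bytes matters here because the limit mirrors a varchar column; a 64-character value made of multi-byte characters should still pass.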
It seems that in sentry we trim tags instead of deleting them, so I wonder which way we should go here: https://github.com/getsentry/sentry/blob/3a7c72d622b1fe64fa26ef568e20493be7bbee9e/src/sentry/event_manager.py#L178
Especially since getsentry/sentry#40640 truncates instead of deleting, I feel like we shouldn't delete it in Relay?
This is a good point. Deleting it while annotating the error gives the user explicit information about what went wrong, which is probably better than silently truncating. Another option would be to truncate and annotate, like the
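As a rough illustration of the truncate-and-annotate alternative mentioned here, the following std-only sketch cuts a string to a character limit and records that it did so. `Truncated` and `truncate_chars` are hypothetical names chosen for this example and do not correspond to existing Relay or Sentry code.

```rust
/// Hypothetical record of what truncation did to a value.
#[derive(Debug, PartialEq)]
struct Truncated {
    value: String,
    /// Original length in characters, set only if something was cut off.
    original_chars: Option<usize>,
}

/// Keeps at most `max_chars` characters, always cutting on a char boundary.
fn truncate_chars(input: &str, max_chars: usize) -> Truncated {
    match input.char_indices().nth(max_chars) {
        // `byte_idx` is the start of the first character past the limit.
        Some((byte_idx, _)) => Truncated {
            value: input[..byte_idx].to_string(),
            original_chars: Some(input.chars().count()),
        },
        // At most `max_chars` characters: nothing to cut.
        None => Truncated {
            value: input.to_string(),
            original_chars: None,
        },
    }
}
```

Whether the recorded length would surface to the user as an error or as a remark is a design decision left open in the discussion above.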
Also related: getsentry/sentry#40640