fix(feedback): enforce a max message size from all sources #79326

Merged
merged 13 commits into from
Oct 22, 2024

Conversation

aliu39
Member

@aliu39 aliu39 commented Oct 17, 2024

Closes #76298
Closes SENTRY-3B86

Decided on a limit of 4096, which is comfortably below the LLM request, Postgres, and Kafka size limits described in the ticket. Messages over the limit are truncated (or rejected, in the case of the crash report modal), skip spam detection, and are automatically marked as spam.

Includes a small refactor of `create_feedback.py`: some code was moved around and a few comments were added. Renamed `auto_ignore_spam_feedback` to `set_feedback_ignored`.
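
A minimal sketch of the behaviour described above, not the code from this PR. All names here (`MAX_MESSAGE_SIZE`, `MessageTooLarge`, `PreparedMessage`, `prepare_message`, and the `reject_oversized` flag) are hypothetical and chosen for illustration; only the 4096 limit and the truncate/reject/skip-detection rules come from the description.

```python
from dataclasses import dataclass
from typing import Callable

# 4096 is the limit chosen in this PR; the constant name is hypothetical.
MAX_MESSAGE_SIZE = 4096


class MessageTooLarge(ValueError):
    """Raised by sources that reject oversized messages instead of truncating."""


@dataclass
class PreparedMessage:
    text: str
    is_spam: bool
    was_truncated: bool


def prepare_message(
    message: str,
    detect_spam: Callable[[str], bool],
    *,
    max_size: int = MAX_MESSAGE_SIZE,
    reject_oversized: bool = False,
) -> PreparedMessage:
    """Enforce the size limit before any spam detection runs."""
    if len(message) <= max_size:
        # Normal path: the message fits, so spam detection runs as usual.
        return PreparedMessage(message, detect_spam(message), False)
    if reject_oversized:
        # Assumed to model the crash report modal, which rejects the input.
        raise MessageTooLarge(f"message exceeds {max_size} characters")
    # Oversized: truncate, skip the detector, and mark as spam directly.
    return PreparedMessage(message[:max_size], True, True)
```

Because the detector is never called on the oversized path, an expensive or size-limited spam check (such as an LLM request) is not exposed to arbitrarily large input.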

@aliu39 aliu39 changed the title from "Truncate large msgs in create_feedback and skip spam detection. +some refactoring" to "fix(feedback): enforce a max message size from all sources" Oct 17, 2024
@github-actions github-actions bot added the "Scope: Backend" label (automatically applied to PRs that change backend components) Oct 17, 2024
@aliu39 aliu39 marked this pull request as ready for review October 17, 2024 23:09
@aliu39 aliu39 requested review from a team as code owners October 17, 2024 23:09
@aliu39 aliu39 requested a review from JoshFerge October 17, 2024 23:10
tags={
    "is_spam": is_message_spam,
    "pow2_size_bucket": 2 ** math.ceil(math.log2(len(feedback_message))),
Member Author

This limits the granularity of the tag. I'd rather bucket sizes like this for viewing in Datadog than log each message size.

Member

@JoshFerge JoshFerge Oct 18, 2024

Can't we just use a distribution metric type?

Member Author

Oh, yes!

Member Author

Actually I'll add a log too so we can see the org/project id.
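
To illustrate the trade-off in this thread, here is a small standalone sketch. `pow2_size_bucket` is a hypothetical helper that rounds a length up to the next power of two using integer arithmetic; for lengths of 1 or more it should give the same value as the tag expression under review. Note that `math.log2(0)` raises `ValueError`, so an empty message needs a guard either way. The summary at the end only stands in for what a distribution metric makes possible (aggregating raw values) and uses the standard library, not any metrics backend API.

```python
import statistics


def pow2_size_bucket(size: int) -> int:
    """Round size up to the next power of two; sizes below 1 map to 0."""
    if size < 1:
        return 0
    return 1 << (size - 1).bit_length()


# Example message lengths, made up for this sketch.
sizes = [12, 300, 301, 4096, 5000]

# Tag approach: each length collapses into one of a few coarse buckets.
buckets = sorted({pow2_size_bucket(s) for s in sizes})
print(buckets)  # [16, 512, 4096, 8192]

# Distribution approach: raw values are kept, so summaries stay exact.
print(statistics.median(sizes))  # 301
print(max(sizes))  # 5000
```

The bucketed tag keeps cardinality low but loses precision, while recording raw sizes leaves percentile and max calculations to the metrics backend.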

@aliu39 aliu39 requested a review from a team as a code owner October 17, 2024 23:34
Contributor

This PR has a migration; here is the generated SQL for src/sentry/migrations/0778_userreport_comments_max_length.py:

--
-- Alter field comments on userreport
--
-- (no-op)

Member

@wedamija wedamija left a comment

Migration lgtm

register(
    "feedback.message.max-size",
    type=Int,
    default=4096,
Member

Do you need an option if the max-length is also in the schema?

Member Author

@aliu39 aliu39 Oct 21, 2024

The max length in the schema applies only to legacy feedback, a.k.a. user reports. User reports are shimmed to feedback issues, but not vice versa. IMO it makes sense to have a separate option for new feedback.

In the long term it makes more sense to enforce a maximum on the upstream envelopes, but I need more time to look into that. This just plugs all the holes for now, resolving that Sentry issue. We can gather some metrics and keep the limit flexible for now by using the option. Wdyt?

Member

> In the long term it makes more sense to enforce a maximum on the upstream envelopes, but I need more time to look into that. This just plugs all the holes for now, resolving that Sentry issue.

That makes sense to me.
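
The "flexible limit" idea from this thread can be sketched with a toy in-memory option store. `OptionStore`, its methods, and `truncate_to_limit` are hypothetical stand-ins written for this example, not Sentry's options API; only the option key and the default of 4096 come from the snippet under review.

```python
class OptionStore:
    """Toy registry: typed options with a default and a runtime override."""

    def __init__(self) -> None:
        self._registered: dict[str, tuple[type, object]] = {}
        self._overrides: dict[str, object] = {}

    def register(self, key: str, *, value_type: type, default: object) -> None:
        if not isinstance(default, value_type):
            raise TypeError(f"default for {key!r} must be {value_type.__name__}")
        self._registered[key] = (value_type, default)

    def set(self, key: str, value: object) -> None:
        value_type, _ = self._registered[key]
        if not isinstance(value, value_type):
            raise TypeError(f"{key!r} must be {value_type.__name__}")
        self._overrides[key] = value

    def get(self, key: str) -> object:
        # An override wins; otherwise fall back to the registered default.
        if key in self._overrides:
            return self._overrides[key]
        return self._registered[key][1]


options = OptionStore()
options.register("feedback.message.max-size", value_type=int, default=4096)


def truncate_to_limit(message: str) -> str:
    """Truncate using whatever limit is configured at call time."""
    return message[: options.get("feedback.message.max-size")]
```

Reading the limit at call time means it can be tuned after metrics are gathered, without the schema change that a column-level max length would need.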

@aliu39 aliu39 requested review from markstory and JoshFerge and removed request for JoshFerge October 22, 2024 17:24
Member

@JoshFerge JoshFerge left a comment

some feedback: this PR could have been broken up for easier reviewing, but overall looks good. 👍🏼

@aliu39
Member Author

aliu39 commented Oct 22, 2024

> some feedback: this PR could have been broken up for easier reviewing, but overall looks good. 👍🏼

Got it, thanks for letting me know!

codecov bot commented Oct 22, 2024

Codecov Report

Attention: Patch coverage is 80.48780% with 8 lines in your changes missing coverage. Please review.

✅ All tests successful. No failed tests found.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| src/sentry/feedback/usecases/create_feedback.py | 72.41% | 7 Missing and 1 partial ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #79326      +/-   ##
==========================================
+ Coverage   78.33%   78.35%   +0.01%     
==========================================
  Files        7125     7123       -2     
  Lines      314677   314900     +223     
  Branches    51431    51464      +33     
==========================================
+ Hits       246515   246739     +224     
+ Misses      61698    61690       -8     
- Partials     6464     6471       +7     

@aliu39 aliu39 merged commit 477e69f into master Oct 22, 2024
49 of 50 checks passed
@aliu39 aliu39 deleted the aliu/limit-feedback-size branch October 22, 2024 20:21
cmanallen pushed a commit that referenced this pull request Oct 23, 2024
Closes #76298
Closes [SENTRY-3B86](https://sentry.sentry.io/issues/5552524761/)

Decided on a limit of 4096, which is comfortably below the LLM request,
Postgres, and Kafka size limits described in the ticket. Messages over
the limit are truncated (or rejected, in the case of the crash report
modal), skip spam detection, and are automatically marked as spam.

Includes a small refactor of `create_feedback.py`, moving stuff around
and commenting a bit. ~~Renamed `auto_ignore_spam_feedback` to
`set_feedback_ignored`.~~

sentry-io bot commented Oct 30, 2024

Suspect Issues

This pull request was deployed and Sentry observed the following issues:

  • ‼️ VertexRequestFailed: Response 429: { sentry.tasks.store.save_event_feedback View Issue

@github-actions github-actions bot locked and limited conversation to collaborators Nov 14, 2024
Labels
Scope: Backend Automatically applied to PRs that change backend components
Projects
None yet
Development

Successfully merging this pull request may close these issues.

UF Backend: Limit feedback message size
4 participants