Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rate Limiting seemingly disabled as of 8.13.0 #5227

Closed
HakShak opened this issue Apr 11, 2017 · 42 comments
Closed

Rate Limiting seemingly disabled as of 8.13.0 #5227

HakShak opened this issue Apr 11, 2017 · 42 comments
Assignees

Comments

@HakShak
Copy link

HakShak commented Apr 11, 2017

Details are here: https://forum.sentry.io/t/no-rate-limiting-after-upgrade/1114

Highlights:

  • Two confirmed cases of updating to or beyond 8.13.0 and rate limiting not having an effect.
  • Tried setting it in the UI and forcing it in the conf.
  • Confirmed in 8.14.1, haven't had a chance to try 8.15.0 yet.
@timfeirg
Copy link

timfeirg commented May 25, 2017

I'm on 8.16.0 and tried tweaking various configurations according to the docs, rate-limiting is obviously not working @mattrobenolt @dcramer

@boboli
Copy link

boboli commented May 30, 2017

Also seeing it on 8.14.1. This bug is causing us to lag behind on new events for hours during spikes of high error rates, as our sentry servers can't keep up with the firehose of inbound events. This is really frustrating, as it is during these times that getting visibility into the latest errors is critical to reducing downtime. Can we get please this prioritized?

@dcramer
Copy link
Member

dcramer commented May 30, 2017

@boboli we will try to look at this, but at the end of the day if you're using sentry for free (open source) you have to realize that these requests may slip for us. Only so many resources, and if we dont see issues on our SaaS service things become a lower priority (unless its a core brekaing issue). In this case we use different rate limiting code in production, so its likely theres a flaw in the builtin that the tests aren't catching.

@bqbn
Copy link

bqbn commented Jun 7, 2017

We have been experiencing this issue as well since we upgraded Sentry from v8.10 to v8.16.1. The bug is definitely annoying because it causes hours of lag behind on new events from time to time.

I think that with people upgrading to newer versions, more and more will realize they lose rate limiting function and google-find this issue. Or, if they see this issue first, they may decided to delay the upgrade (I know I'd have delayed if I knew it).

Hopefully the bug can be fixed soon. We have done the db migration, I don't think we can downgrade at this point.

@dcramer
Copy link
Member

dcramer commented Jun 7, 2017

I'm planning to look into this soon (hopefully this week), just have been spending my free cycles on random smaller features.

@HakShak
Copy link
Author

HakShak commented Jun 29, 2017

@HakShak HakShak closed this as completed Jun 29, 2017
@dcramer
Copy link
Member

dcramer commented Jun 29, 2017

I actually had looked at this recently and couldn't find a bug, so maybe I was just overlooking things and people were asking for a feature that we had simply removed

@HakShak
Copy link
Author

HakShak commented Jun 30, 2017

Yeah, the DSN rate limiting works, but I still don't see the "organization level project percentage" working at all.

@dcramer dcramer reopened this Jun 30, 2017
@dcramer
Copy link
Member

dcramer commented Jun 30, 2017

@HakShak will take a look -- we're likely going to redo those features, so I would suggest not relying on the current behavior too heavily, but they SHOULD still work.

@tkaemming tkaemming self-assigned this Jul 7, 2017
@dcramer
Copy link
Member

dcramer commented Jul 7, 2017

FYI investigation (via tests) is here https://github.com/getsentry/sentry/pull/5688/files

@JarenGlover
Copy link

@dcramer / @HakShak

Is the take away here that with version 8.17 the rate limiting feature per key is working but the "organization level project percentage" is not?

It looks like you can set the system wide rate limit via the UI or via config. Does one value supersede the other?

generally speaking looks like rate limiting per project isn't working for me. Could I be missing something. Thanks in advance.

@dcramer
Copy link
Member

dcramer commented Jul 21, 2017

@JarenGlover ive not identified anything as not working, but if something truly is it would help to have more detail on what isn't working, and what configuration is setup to cause it.

@boboli
Copy link

boboli commented Jul 21, 2017

@dcramer: We notice 2 things that are not working.

Global Rate Limit

We've set the global rate limit to 3000 using the UI under the admin settings:
image

But this is clearly not working just by doing some simple math from our stats page:
screen shot 2017-07-21 at 1 22 54 pm
2.6E6 events in 1 hour ~= 43K events/minute

We've tried setting this in the conf too, but not sure if we're doing it right:

SENTRY_QUOTAS_OPTIONS = {
    "cluster": "quotas",
    "system.rate-limit": 3000
}

This setting definitely used to work in the past, I believe in version 8.12, as the stats page used to show bars of red to show dropped events.

Per-DSN Rate Limit

We then tried setting the per-DSN rate limiting (set to 1000/min) in the UI, and we can see from this screenshot that this isn't working either:
screen shot 2017-07-21 at 1 30 01 pm
Simple math: 9.7E6 events in 1 day ~= 6736 events/minute

Setup

We're using Sentry 8.17.0 on Python 2.7.6/Ubuntu 14.04 in AWS EC2 (not using Docker), with Redis for our buffer, ratelimiter store, quota store, TSDB, digests; rabbitmq for our queue; memcached for cache; and AWS RDS PostgreSQL for node storage. I can give you more details on our setup if needed.

@dcramer
Copy link
Member

dcramer commented Jul 25, 2017

@boboli are you using single organization mode (the default)?

@boboli
Copy link

boboli commented Jul 25, 2017

@dcramer: yup, have had it on since we first provisioned sentry (v8.7.0).

@dcramer
Copy link
Member

dcramer commented Jul 25, 2017

Can you confirm what SENTRY_QUOTAS is set to?

@boboli
Copy link

boboli commented Jul 25, 2017

SENTRY_QUOTAS = 'sentry.quotas.redis.RedisQuota'
SENTRY_QUOTAS_OPTIONS = {
    "cluster": "quotas",
    "system.rate-limit": 1000
}

@dcramer
Copy link
Member

dcramer commented Jul 25, 2017

@boboli and quotas exists in the output of:

$ sentry config get redis.clusters
        type: DICTIONARY
 from config: {'default': {'hosts': {0: {'host': '127.0.0.1', 'port': 6379}}}}
     current: {'default': {'hosts': {0: {'host': '127.0.0.1', 'port': 6379}}}}

?

@boboli
Copy link

boboli commented Jul 25, 2017

Yup:

(sentry_venv) robinhood@ip-10-1-30-102:~/sentry$ sentry config get redis.clusters
        type: DICTIONARY
 from config: {'default': {'hosts': {0: {'host': 'sentry-buffer.uon5mu.ng.0001.use1.cache.amazonaws.com', 'db': 5, 'port': 6379}}}, 'ratelimiter': {'hosts': {0: {'host': 'sentry-buffer.uon5mu.ng.0001.use1.cache.amazonaws.com', 'db': 1, 'port': 6379}}}, 'buffer': {'hosts': {0: {'host': 'sentry-buffer.uon5mu.ng.0001.use1.cache.amazonaws.com', 'db': 2, 'port': 6379}}}, 'cache': {'hosts': {0: {'host': 'sentry-buffer.uon5mu.ng.0001.use1.cache.amazonaws.com', 'db': 0, 'port': 6379}}}, 'digests': {'hosts': {0: {'host': 'sentry-buffer.uon5mu.ng.0001.use1.cache.amazonaws.com', 'db': 4, 'port': 6379}}}, 'quotas': {'hosts': {0: {'host': 'sentry-buffer.uon5mu.ng.0001.use1.cache.amazonaws.com', 'db': 3, 'port': 6379}}}, 'tsdb': {'hosts': {0: {'host': 'sentry-tsdb.uon5mu.ng.0001.use1.cache.amazonaws.com', 'db': 0, 'port': 6379}}}}
     current: {'default': {'hosts': {0: {'host': 'sentry-buffer.uon5mu.ng.0001.use1.cache.amazonaws.com', 'db': 5, 'port': 6379}}}, 'ratelimiter': {'hosts': {0: {'host': 'sentry-buffer.uon5mu.ng.0001.use1.cache.amazonaws.com', 'db': 1, 'port': 6379}}}, 'buffer': {'hosts': {0: {'host': 'sentry-buffer.uon5mu.ng.0001.use1.cache.amazonaws.com', 'db': 2, 'port': 6379}}}, 'cache': {'hosts': {0: {'host': 'sentry-buffer.uon5mu.ng.0001.use1.cache.amazonaws.com', 'db': 0, 'port': 6379}}}, 'digests': {'hosts': {0: {'host': 'sentry-buffer.uon5mu.ng.0001.use1.cache.amazonaws.com', 'db': 4, 'port': 6379}}}, 'quotas': {'hosts': {0: {'host': 'sentry-buffer.uon5mu.ng.0001.use1.cache.amazonaws.com', 'db': 3, 'port': 6379}}}, 'tsdb': {'hosts': {0: {'host': 'sentry-tsdb.uon5mu.ng.0001.use1.cache.amazonaws.com', 'db': 0, 'port': 6379}}}}

@dcramer
Copy link
Member

dcramer commented Jul 25, 2017

Lastly, can you confirm the version of Redis? Given no rate limits are working, it makes me believe that this is something thats not core app code causing the issue.

@boboli
Copy link

boboli commented Jul 25, 2017

sentry-buffer.uon5mu.ng.0001.use1.cache.amazonaws.com:6379> INFO
# Server
redis_version:2.8.24

For the record, we are using AWS Elasticache for this redis cluster, so it might be an Amazon-flavored version of redis as opposed to the open-source one.

Maybe this helps for debugging but I ran KEYS * on the db number we use for quotas (#3), and it came up with nothing but:

sentry-buffer.uon5mu.ng.0001.use1.cache.amazonaws.com:6379> KEYS *
1) "ElastiCacheMasterReplicationTimestamp"

@dcramer
Copy link
Member

dcramer commented Jul 25, 2017

Can anyone else here also confirm Redis version? I'm wondering if somehow the LUA isnt working as expected here for a specific runtime.

@bqbn
Copy link

bqbn commented Jul 25, 2017

We use AWS Elasticache too, and our redis version is v3.2.4.

$ echo INFO | nc $REDIS 6379 | egrep redis_version
redis_version:3.2.4

@HakShak
Copy link
Author

HakShak commented Jul 25, 2017

Using this docker image for our redis: redis:3.2-alpine

@bqbn
Copy link

bqbn commented Jul 25, 2017

Oh, sorry and correction: our prod environment, which is experiencing the no-rate-limiting issue, uses redis v2.8.23. Our stage environment uses redis v3.2.4. However, we can't tell if the later has the same issue or not because it has almost no incoming traffic.

@boboli
Copy link

boboli commented Aug 7, 2017

@dcramer: any updates on the investigation? I'd be willing to provide additional details if it helps to debug.

@dcramer
Copy link
Member

dcramer commented Aug 7, 2017

@boboli i haven't had time. We'd like to resolve this, but its not affecting everyone, and its definitely not affecting us, so it'll happen when there's an obvious reason it's broken or someone has spare cycles to look into it

@JarenGlover
Copy link

@HakShak are you saying the issue no longer exist when you use Redis 3.2?

@HakShak
Copy link
Author

HakShak commented Aug 7, 2017 via email

@dcramer
Copy link
Member

dcramer commented Aug 7, 2017

@HakShak what do you mean by its not very accurate? It's using the same system as core rate limits.

@HakShak
Copy link
Author

HakShak commented Aug 7, 2017

We have a DSN right now set to 100 per minute, that projects has about 7k an hour on the low end and gets up to 15k an hour on the peak. Checking it now, I actually don't see any events being throttle by it.

I still subscribe to your theory @dcramer that something outside the Sentry codebase is causing this.

I saw DSN throttling working when we first tried it, but something has either lingered around in the DB or built up in the TSDB.

One thing which might be "unique" about our setup is that we use different Redis instances for the TSDB and the queue.

@tkaemming
Copy link
Contributor

Here are details on the issues with the DSN rate limits: GH-6007

@boboli
Copy link

boboli commented Aug 31, 2017

@tkaemming: Does the bug also affect the global rate limit logic as well?

@tkaemming
Copy link
Contributor

The specific issue in the ticket linked above wouldn't affect global rate limits (the code involved is isolated to the rate limits associated with per-project keys.) We haven't ruled out any other issues related to global limits though — we just haven't been able to reproduce them ourselves.

@tkaemming
Copy link
Contributor

Following up: the issue with per-project key rate limits has been fixed on sentry.io and will be part of the 8.20 release.

@timfeirg
Copy link

I've upgrade to 8.20 and rate limits aren't working. From the settings page I've confirmed that my account is limited to a maximum of 20 events per 60 seconds, but sentry stats page shows 1191 EVENTS PER MINUTE and none are dropped. @tkaemming

@jinzhao1994
Copy link

I've upgrade sentry from 8.6.0 to 8.22.0, then the rate limit seems not work. The code here returns the min value of system_limit and account_limit. In single organization mode, account_limit is zero (which means no limit.). So this method always return zero which means no limit.

After run the code below to set quota manually in python console, rate limit works.

OrganizationOption.objects.set_value(
    organization=Organization.objects.first(),
    key='sentry:account-rate-limit',
    value=50000)

@alexef
Copy link
Contributor

alexef commented Mar 4, 2018

I can confirm the same behaviour as reported by @jinzhao1994 - also running sentry 8.22.0.

@flyingmachine
Copy link

Just want to chime in and add that system.rate-limit does not work on my install of 8.20

@boboli
Copy link

boboli commented Mar 15, 2018

@tkaemming : it seems like @jinzhao1994's investigation found the problem. Can we take another look at solving this issue for the next release?

@dcramer
Copy link
Member

dcramer commented Mar 15, 2018

@boboli et all -- we'd like to fix this, but its not a high priority for us. You have to realize that most people contributing to Sentry work for Sentry, and Sentry's goal is our SaaS. If you're using Sentry, especially for free (which is likely true if you're not using sentry.io), you might consider asking your employer for some time to help accomplish the goals, and we're happy to provide guidance around it.

dcramer added a commit that referenced this issue Jul 2, 2018
@dcramer
Copy link
Member

dcramer commented Apr 4, 2019

Rate limits have been entirely pushed towards per-key limits at this point. I'm closing this out as we dont intend to continue to invest in other kinds of rate limits. I would suggest using nginx or similar to have more granular control over how you handle load-based rate limit concerns. You can find examples of this in our docs: https://docs.sentry.io/server/throttling/

We should confirm this is disabled in future versions of Sentry, but I believe it already should be.

@dcramer dcramer closed this as completed Apr 4, 2019
@github-actions github-actions bot locked and limited conversation to collaborators Dec 18, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

10 participants