Oathkeeper (v0.14.2_oryOS.10) returning empty reply on slow/long distance database calls #178
Comments
Could you show the (truncated) DB connection string?
Also, we might not fix this because there are some changes incoming with the next release: #177
Nothing fancy: postgres://:@haproxy.default.svc.cluster.local:5432/auth?sslmode=verify-ca. I tried raising max_conns but didn't see any improvement. I dug through the code and it doesn't look like that would help much anyway. Once work starts on the referenced issue, I may be able to dedicate some time to helping out with it.
Ok, damn, that was what I thought could work. Maybe there is a timeout based on the transaction size? I don't really know... What do you think of the proposal laid out in #177? Would it suit your use case?
I can keep poking around and add some logging when I get a chance. I assume part of the problem is that it's making multiple queries, and 70ms times N adds up. Rewriting those as one nested query would probably be faster overall, but... #177 is definitely a big step in the right direction. Localized reads stored in memory will be a lot faster than constant DB reads. Things would get a little trickier with hot reloads because you'd have to read repeatedly, but the odds of a KV store being faster than DB lookups are pretty strong. I have some other thoughts about #177 that probably aren't appropriate for here. If I get time, I'll hop on Discord to rant.
Please do! Also feel free to post them in #177!
I was having the same problem with 15 rules, caused by the same kind of cross-region latency (europe-east and asia-south), and traced it to the sequential queries in the ListRules logic.
I changed the query to use SQL joins, which worked for me. We needed the fix because we were running Oathkeeper in production, so it may be a workaround for you too; after #177 this should no longer be an issue.
@ridhamtarpara Would you mind posting the query? I'd like to take a look at it.
Describe the bug
I have Oathkeeper running in an East US Kubernetes cluster, and the DB currently lives in West US. There's about 70ms of latency between the two sites.
When running the following curl from the same k8s cluster as Oathkeeper, I see the following:
curl:
root@shellpod-bd-55c5b5cc74-nlmrh:/# curl "http://oathkeeper-api.hydra.svc.cluster.local:4456/rules?limit=50000&offset=0"
curl: (52) Empty reply from server
[oathkeeper-api-7f4cc5cb9f-dbmr6] time="2019-04-24T18:50:58Z" level=info msg="started handling request" method=GET remote="10.39.0.47:60820" request="/rules?limit=50000&offset=0"
[oathkeeper-api-7f4cc5cb9f-dbmr6] time="2019-04-24T18:51:24Z" level=info msg="completed handling request" measure#oathkeeper-api.latency=26073727174 method=GET remote="10.39.0.47:60820" request="/rules?limit=50000&offset=0" status=200 text_status=OK took=26.073727174s
Running what I believe to be the same query from the postgres logs and the code (https://github.com/ory/oathkeeper/blob/master/rule/manager_sql.go#L79) from the same pod issuing the curl requests returns within a few ms.
There are currently around 50 rules configured. When I poll with a limit below 20 (specifically, 19), it works. Once I try 20, it times out at just over 10 seconds.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
For things to not time out :-)
Version:
Additional context
I can reproduce this pretty easily if needed.