ReadTimeoutErrors on 169.254.169.254 /latest/api/token #880
Comments
Can confirm this happens. Currently attempting to work around it by setting a config variable. Update: set the above variable and I'm still seeing this Read Timeout.
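The exact variable was stripped from the comment above. A plausible candidate, assuming a recent elastic-apm release, is the agent's cloud_provider setting, which can skip the metadata calls entirely. A minimal Django sketch:

```python
# Hypothetical workaround, assuming the variable in question is the agent's
# cloud_provider setting: "none" tells elastic-apm not to query
# 169.254.169.254 at all.
ELASTIC_APM = {
    "SERVICE_NAME": "my-service",   # placeholder service name
    "CLOUD_PROVIDER": "none",       # disable cloud metadata collection
}
```

The same setting can be supplied as the ELASTIC_APM_CLOUD_PROVIDER environment variable.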
Thanks for the report! This is certainly related to the new cloud metadata we collect, but I'll need to investigate more to figure out what's going wrong.
@michaelhelmick Can you show me the ReadTimeouts that you're getting? I cannot see a way that we should be hitting any of the cloud metadata collection.
@michaelhelmick Also, are you in AWS, or can you reproduce this outside of a cloud provider?
It's on the same /latest/api/token endpoint. I curl'd it inside my container and it wasn't available there either, but other metadata endpoints were. I am on AWS and don't have time to try to reproduce outside of a cloud provider, sorry!
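For anyone repeating that check without curl, here is a minimal stdlib sketch of the same request. The PUT method and TTL header are how AWS's IMDSv2 issues tokens; nothing below is specific to the agent:

```python
# Probe the IMDSv2 token endpoint the comment above refers to.
# IMDSv2 tokens must be requested with PUT plus a TTL header; a plain GET
# (what many metadata probes use) is rejected instead.
import urllib.request

req = urllib.request.Request(
    "http://169.254.169.254/latest/api/token",
    method="PUT",
    headers={"X-aws-ec2-metadata-token-ttl-seconds": "21600"},
)
try:
    with urllib.request.urlopen(req, timeout=1.0) as resp:
        print("token endpoint reachable, status", resp.status)
except Exception as exc:  # a timeout here matches the reported behavior
    print("token endpoint unavailable:", exc)
```

A common reason the token call times out inside containers while other metadata endpoints work is IMDSv2's default hop limit of 1: the PUT response is dropped after one network hop, and Docker's bridge network adds that extra hop.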
@michaelhelmick That's fine. I haven't been able to reproduce the hang, though I can reproduce the ReadTimeoutError.
@ivasic I've never been able to reproduce your full hang. However, I fixed the other errors you were seeing in #884. It was a combination of a missing
It's still slower than I'd like, due to taking the whole 1.0 second timeout on the token call, but I think it's reasonable now. I would love it if you could test this for me, to see if you are still hanging. You can drop this into your requirements.txt:
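The exact requirement line wasn't preserved here. As an illustration only, installing the agent straight from a branch of the GitHub repo looks like this (the branch name below is hypothetical):

```
# hypothetical branch; pip installs the agent directly from GitHub
git+https://github.com/elastic/apm-agent-python.git@fix-cloud-metadata#egg=elastic-apm
```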
If it helps, mine appeared in the Celery logs in ECS. It did not appear in migrate or any web logs.
It seems to have helped! 🎉 I'm not sure how your change stopped the hangs, but I'm glad it worked. I wonder if it had something to do with the retries, maybe?
That's my best guess. But it still shouldn't have hung more than about 10-12 seconds, so I'm definitely confused... In any case, I'm glad the fix is working! |
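The 10-12 second figure is consistent with the retry settings visible in the warning quoted below (my arithmetic, not a statement of the agent's internals):

```python
# Worst case implied by "Retry(total=2)" and "read timeout=3.0" in the log:
attempts = 1 + 2                 # first try plus total=2 retries
read_timeout = 3.0               # seconds per attempt, from the log line
print(attempts * read_timeout)   # 9.0 s, plus connect overhead: roughly 10-12 s
```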
After updating the agent to version 5.8.0, all my Django manage.py commands seem to hang indefinitely and never finish. I'm first seeing:

WARNING urllib3.connectionpool Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ReadTimeoutError("HTTPConnectionPool(host='169.254.169.254', port=80): Read timed out. (read timeout=3.0)")': /latest/api/token connectionpool.py:749
and then (after some time):
ERROR elasticapm.transport Closing the transport connection timed out. base.py:261
After this last error message the process just hangs indefinitely and has to be killed manually.
To Reproduce
I'm not able to reproduce this locally, it seems to happen exclusively in my staging/production environment on AWS.
There, it really only takes a Django management command to make it hang. For example:
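The author's example command did not survive extraction. As an illustration, any command that initializes the agent at startup would do; here is a hypothetical invocation using Django's built-in check command:

```
python manage.py check
```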
Environment
Linux 6094537dd0b9 4.14.181-108.257.amzn1.x86_64 #1 SMP Wed May 27 02:43:03 UTC 2020 x86_64 GNU/Linux
(Docker container - python:3.8 image, docker running on Amazon Linux/2.15.2)

Additional context
The app runs on an AWS Elastic Beanstalk Docker environment with the python:3.8 image; the setup is pretty straightforward.
requirements.txt: