If getaddrinfo fails, we back off on an exponential curve. We should limit the back-off to a maximum period of 5 minutes #216
Comments
Hi guys, EAI_AGAIN errors indicate an intermittent connectivity failure with the DNS servers themselves (https://stackoverflow.com/questions/40182121/error-getaddrinfo-eai-again), normally due to an intermittent connection. I have now tested this in some detail: the back-off is limited to 3:00 minutes by default, which I can verify. The back-off factor is 2, so back-off times should be doubling, which they are not here. It looks like an intermittent DNS lookup issue, as the interval between these failures contracts and expands arbitrarily instead of doubling. Are we absolutely sure this is not a red herring?
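To make the expected doubling concrete, here is a minimal sketch of a capped exponential back-off with a factor of 2 and a 3-minute ceiling. The helper name `backoffDelay` and the 500 ms starting delay are assumptions for illustration only; primus's own reconnect logic may differ (for example, by adding jitter).

```javascript
// Hypothetical helper, not primus's actual implementation.
// attempt: zero-based retry counter.
// minDelay: assumed starting delay in ms (not a value taken from this thread).
// maxDelay: ceiling in ms (3 minutes, as reported above).
// factor: growth factor per attempt (2, as reported above).
function backoffDelay(attempt, { minDelay = 500, maxDelay = 180000, factor = 2 } = {}) {
  const delay = minDelay * Math.pow(factor, attempt);
  // Clamp so the wait never exceeds the configured ceiling.
  return Math.min(delay, maxDelay);
}

// Delays for the first ten attempts: each doubles until the cap is reached.
const delays = Array.from({ length: 10 }, (_, attempt) => backoffDelay(attempt));
console.log(delays);
```

With these assumed values, the delay doubles on each attempt and then stays at the ceiling. Intervals that shrink and grow arbitrarily in a log would not match this pattern.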
The captured log in the issue is filtered to show only how often we see EAI_AGAIN. Please ignore the delta time and review the date stamps in the log. On the 14th, we only tried to connect 3 times all day, so the back-off is not working as expected.
I still don't think this is primus's back-off, which I can verify as being limited to 3 minutes; rather, it is an emergent issue resulting from the event loop being blocked. I found this: nodejs/node#8436, so Johan was on the right track working around this by doing a DNS resolve first. I am looking at doing the same on the happn client and have made some headway testing for the issue. What is nice is that this may also be linked to happner/happner-2#229, so solving this may kill two birds with one stone.
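As a rough illustration of the "DNS resolve first" workaround, the sketch below looks up the hostname before attempting to connect and reports a lookup failure as retryable. The function `resolveThenConnect`, the `connect` callback, and the shape of the returned object are all hypothetical; this is not the happn client's API or Johan's actual change.

```javascript
const dns = require('dns');

// Hypothetical helper: resolve `hostname` first, and only call `connect`
// once a lookup has succeeded. `connect` stands in for whatever opens the
// client connection. `lookup` is injectable so the helper can be tested
// without network access; it defaults to Node's promise-based dns lookup.
async function resolveThenConnect(hostname, connect, lookup = dns.promises.lookup) {
  let address;
  try {
    ({ address } = await lookup(hostname));
  } catch (err) {
    // EAI_AGAIN and ENOTFOUND are treated here as "no usable DNS right now",
    // so the caller can schedule another attempt instead of connecting.
    if (err.code === 'EAI_AGAIN' || err.code === 'ENOTFOUND') {
      return { connected: false, code: err.code };
    }
    throw err;
  }
  return { connected: true, result: await connect(address) };
}
```

A caller could check `connected` and, when it is `false`, wait for the next back-off interval before trying again.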
Hi @southbite, I've discussed this with Craig: the problem occurs when we try to connect at startup and there is no internet connection. Once we are connected and the connection breaks, we don't have this problem, which means primus and happn are behaving as expected. I think this ticket can be closed.
TR4248
If getaddrinfo fails, we back off in an exponential curve. We should be able to configure the maximum back-off time.
The following was extracted from the attached AE log.