Allow TCP and UDP sender creation for unresolvable hostnames #252
Comments
Apparently at least one of the potential GELF sender providers performs a synchronous host lookup at creation time, which results in an UnknownHostException being reported from within GelfSenderFactory. The appender is subsequently left without a GELF sender and reports "Could not send GELF message" from then on. The code clearly did not anticipate modern cloud environments with very dynamic DNS entries ;)
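To illustrate the failure mode being described, here is a minimal Java sketch of an eager lookup at sender creation time. The class and method names are hypothetical and are not the actual logstash-gelf internals:

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical sketch of the problem described above, not the library's real code.
class EagerLookupExample {

    /** Resolves the GELF host eagerly, once, when the sender is created. */
    static InetAddress resolveAtCreationTime(String host) {
        try {
            // Synchronous DNS lookup at sender creation time. If the name is not
            // yet resolvable (e.g. the logstash service is still starting),
            // this throws and the appender is left without a sender.
            return InetAddress.getByName(host);
        } catch (UnknownHostException e) {
            // The failure is reported once from the factory; every later log event
            // then fails with "Could not send GELF message" because no sender exists.
            return null;
        }
    }
}
```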
DNS lookups are expensive, which is why we decided to do the lookup once during startup. I'm not sure that "modern" describes something that causes more problems than it solves. Do you have a suggestion for how to enable dynamic DNS lookups without introducing a performance penalty for users that don't require such functionality?
Well, as far as I see it, there is no going backwards with regard to cloud environments, no matter our personal opinions on that ;) As far as I understand the code right now, there are actually several causes that can leave the appender without a sender, in which case it will never recover from the failure. The first decision is between two choices:
I am afraid I am not really qualified to come up with a quick solution however :(
I experienced a similar problem. We had an entire virtual machine cluster in one of our DCs go down, and apparently the app+gelf logger was started before some other components had fully recovered. The name lookup failed and was never retried, and we got a continuous stream of "Could not send GELF message" instead, requiring the service to be restarted.
We can turn the hostname lookup into a warning when it fails.
We now allow sender creation even if the hostname cannot be resolved during application startup, in anticipation that the name will resolve eventually.
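A minimal sketch of that idea, assuming the lookup failure is demoted to a warning and resolution is retried when a message is actually sent. The class and method names are illustrative, not the library's actual API:

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical sketch of the change described above, not the real logstash-gelf sender.
class LazyResolvingSender {

    private final String host;
    private volatile InetAddress resolved;

    LazyResolvingSender(String host) {
        this.host = host;
        try {
            this.resolved = InetAddress.getByName(host);
        } catch (UnknownHostException e) {
            // Sender creation still succeeds; we only warn and resolve later.
            System.err.println("WARN: cannot resolve " + host + " yet, will retry on send");
        }
    }

    void sendMessage(byte[] payload) throws UnknownHostException {
        if (resolved == null) {
            // Retry the lookup when the appender actually tries to deliver a message,
            // so the sender recovers once DNS becomes available.
            resolved = InetAddress.getByName(host);
        }
        // ... open the TCP/UDP connection to 'resolved' and write the payload ...
    }
}
```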
I am running a bunch of microservices in a kubernetes cluster. The microservices are using the log4j2 logstash-gelf appender to emit log events to a remote logstash service.
When the microservices are started before the logstash service is available, the appender never actually tries to reconnect to logstash.
I get one error on stdout at service start:
main ERROR Unknown GELF server hostname:tcp:shared-logstash?readTimeout=10s&connectionTimeout=2000ms&deliveryAttempts=5&keepAlive=true
From then on, I get the following error for each individual log event:
Log4j2-TF-1-AsyncLoggerConfig-1 ERROR Could not send GELF message
When the logstash service is available within the cluster and I restart the microservice at this point, the appender connects and everything works as expected.
However, I would expect the appender to eventually retry the connection without having to restart the whole microservice container. What is the intended/specified behavior here?
I am using 1.14.1.