This repository has been archived by the owner on Jun 29, 2023. It is now read-only.

Allow TCP and UDP sender creation for unresolvable hostnames #252

Closed
MarkLehmacherInvestify opened this issue Oct 28, 2020 · 5 comments

Comments

@MarkLehmacherInvestify

MarkLehmacherInvestify commented Oct 28, 2020

I am running a bunch of microservices in a kubernetes cluster. The microservices are using the log4j2 logstash-gelf appender to emit log events to a remote logstash service.

When the microservices are started before the logstash service is available, the appender never actually tries to reconnect to logstash.

I get one error on stdout at service start:
main ERROR Unknown GELF server hostname:tcp:shared-logstash?readTimeout=10s&connectionTimeout=2000ms&deliveryAttempts=5&keepAlive=true

From then on I get the following error for each individual log event:
Log4j2-TF-1-AsyncLoggerConfig-1 ERROR Could not send GELF message

When logstash service is available within the cluster and I restart the microservice at this point, the appender connects and everything works as expected.

However, I would expect the appender to eventually retry connections without having to restart the whole microservice container. What is the expected implemented/specified behavior here?

I am using 1.14.1.

@MarkLehmacherInvestify
Author

Apparently at least one of the potential GELF sender providers performs a synchronous host lookup at creation time, which in turn results in an UnknownHostException that is reported from within GelfSenderFactory. The appender is subsequently left without a GELF sender and reports "Could not send GELF message" for every event.

The code clearly did not anticipate modern cloud environments with very dynamic DNS entries ;)
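
To illustrate the failure mode described above, here is a minimal sketch using only the JDK. It is not the library's code: the class name `EagerLookupDemo`, its helper methods, the host `shared-logstash.invalid`, and port 12201 are assumptions made up for this example. It contrasts a lookup performed at creation time with `InetSocketAddress.createUnresolved`, which records host and port without touching DNS.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;

public class EagerLookupDemo {

    /** Resolves the host immediately; throws if DNS has no entry for it yet. */
    public static InetSocketAddress resolveEagerly(String host, int port) throws UnknownHostException {
        InetAddress address = InetAddress.getByName(host);
        return new InetSocketAddress(address, port);
    }

    /** Records host and port without performing any DNS lookup. */
    public static InetSocketAddress deferResolution(String host, int port) {
        return InetSocketAddress.createUnresolved(host, port);
    }

    public static void main(String[] args) {
        // Hypothetical host and port, chosen only for the demonstration.
        String host = "shared-logstash.invalid";
        int port = 12201;

        try {
            resolveEagerly(host, port);
            System.out.println("eager lookup succeeded");
        } catch (UnknownHostException e) {
            // A component that gives up here is left without a usable address.
            System.out.println("eager lookup failed: " + e.getMessage());
        }

        InetSocketAddress deferred = deferResolution(host, port);
        System.out.println("deferred address created, unresolved=" + deferred.isUnresolved());
    }
}
```

If the eager variant runs once at startup and its exception is only logged, nothing ever retries the lookup, which would match the symptoms reported in this issue.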

@mp911de
Owner

mp911de commented Oct 28, 2020

DNS lookups are expensive, which is why we decided to do the lookup once during startup. I'm not sure that "modern" describes something that causes more problems than it solves.

Do you have a suggestion for how to enable dynamic DNS lookups without introducing a performance penalty for users who don't require such functionality?

@MarkLehmacherInvestify
Author

Well, as far as I see it, there is no going backwards with regard to cloud environments, no matter our personal opinions on that ;)

As far as I understand the code right now, there are actually several causes that can leave the appender without a sender, in which case it will never recover from the failure. The first decision is between two choices:

  1. Avoid having the appender end up without a sender. This probably means deferring the creation of the socket (I am looking at the TCP code) until "later".
  2. Make sure the appender has some way to recover from a missing sender.
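
Option 1 could be sketched roughly as follows. This is a hypothetical illustration, not code from the library: `LazyAddressHolder`, its `Resolver` hook, and `addressOrNull` are names invented here. The lookup is retried only while it keeps failing and the result is cached after the first success, so users with stable DNS would pay for a single lookup.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;

public class LazyAddressHolder {

    /** Hypothetical lookup hook so the sketch can be exercised without real DNS. */
    public interface Resolver {
        InetAddress resolve(String host) throws UnknownHostException;
    }

    private final String host;
    private final int port;
    private final Resolver resolver;
    private InetSocketAddress resolved; // cached after the first successful lookup

    public LazyAddressHolder(String host, int port, Resolver resolver) {
        this.host = host;
        this.port = port;
        this.resolver = resolver;
    }

    /**
     * Returns the cached address, or attempts a lookup if none is cached yet.
     * Returns null while the host is still unknown, so the caller can skip
     * this message and try again on the next one.
     */
    public synchronized InetSocketAddress addressOrNull() {
        if (resolved == null) {
            try {
                resolved = new InetSocketAddress(resolver.resolve(host), port);
            } catch (UnknownHostException e) {
                return null;
            }
        }
        return resolved;
    }

    public static void main(String[] args) {
        // Uses the JDK resolver; host and port are assumptions for the demo.
        LazyAddressHolder holder =
                new LazyAddressHolder("shared-logstash.invalid", 12201, InetAddress::getByName);
        InetSocketAddress address = holder.addressOrNull();
        System.out.println(address == null ? "host not resolvable yet" : "resolved: " + address);
    }
}
```

A sender built on such a holder would open its socket only once `addressOrNull()` returns a non-null address. A real implementation would probably also want to limit how often a failing lookup is retried.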

I am afraid I am not really qualified to come up with a quick solution however :(

@hartman

hartman commented Jun 16, 2021

I experienced a similar problem. We had an entire virtual machine cluster in one of our DCs go down, and apparently the app with the GELF logger was started before some other elements had fully recovered. The name lookup failed and was never tried again, so we got a continuous stream of "Could not send GELF message" errors until the service was restarted.

@mp911de mp911de changed the title log4j2 appender does not seem to recover connection Failed hostname lookup prevents sender creation and leaves appender in a state without a GelfSender Jan 21, 2022
@mp911de mp911de added the type: enhancement A general enhancement label Jan 21, 2022
@mp911de
Owner

mp911de commented Jan 21, 2022

We can turn the hostname lookup into a warning when it fails.

@mp911de mp911de added this to the 1.15.0 milestone Jan 21, 2022
@mp911de mp911de changed the title Failed hostname lookup prevents sender creation and leaves appender in a state without a GelfSender Allow TCP and UDP sender creation for unresolvable hostnames Jan 21, 2022
mp911de added a commit that referenced this issue Jan 21, 2022
We now allow sender creation even if the hostname cannot be resolved during application startup in anticipation that things resolve eventually.
@mp911de mp911de closed this as completed Jan 21, 2022