[receiver/kafkametricsreceiver] collector crashes if Kafka is unavailable at startup #8349
Comments
I believe this is a duplicate of #4752 and I'm therefore closing this one. If you don't think this is the same issue, feel free to reopen.
These issues are for different components. #4752 appears to be an issue filed on the Kafka Exporter. This report is for the Kafka Metrics Receiver.
After looking into this further, there are two solutions that I see (and there may be others).
Background
The kafkametricsreceiver defines three scrapers (broker scraper, consumer scraper, topic scraper) and uses the scraperhelper (from the collector core repo) to manage them. The scraperhelper constructs a scraper controller that manages multiple scrapers. Its start method calls start on each of the individual scrapers. If a scraper returns an error from its start method, the error bubbles up and the collector fails to start. We can fix this by logging errors instead of returning them, but ideally the receivers would periodically try to start until the services they monitor are up.
Fix in the Kafka Metrics receiver
A scraper only needs to define a scrape method. Scrapers can also define a start method but are not required to. The three scrapers provided in the kafka metrics receiver currently define start methods, but we could rename them to something like
Fix in the Scraper Helper
We could fix this by changing the behavior of the
This change would have an impact on the receivers that use it. Currently that includes: apachereceiver, couchdbreceiver, dockerstatsreceiver, elasticsearchreceiver, googlecloudspannereceiver, hostmetricsreceiver, kafkametricsreceiver, kubeletstatsreceiver, memcachedreceiver, mongodbatlasreceiver, mongodbreceiver, mysqlreceiver, nginxreceiver, podmanreceiver, postgresqlreceiver, rabbitmqreceiver, redisreceiver, windowsperfcountersreceiver, zookeeperreceiver. We'd want to ensure that the new behavior would be acceptable for the receivers that currently use the scraper helper.
Next steps?
I wanted to get a discussion going to see what approach we should take to improve the behavior of the kafkametricsreceiver, and potentially other receivers, depending on which route we take. There might be other options worth considering, so if anyone has other ideas, please feel free to suggest them.
P.S
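The controller behavior described above can be sketched in Go. The `Scraper` interface and both start functions below are illustrative stand-ins, not the actual scraperhelper API from collector core:

```go
package main

import (
	"errors"
	"fmt"
)

// Scraper mirrors the shape of a scraperhelper scraper: an optional
// start hook plus a scrape method (omitted here for brevity).
type Scraper interface {
	Start() error
	Name() string
}

// failFastStart models the current behavior: the first Start error
// bubbles up, and the whole collector fails to start.
func failFastStart(scrapers []Scraper) error {
	for _, s := range scrapers {
		if err := s.Start(); err != nil {
			return fmt.Errorf("scraper %s failed to start: %w", s.Name(), err)
		}
	}
	return nil
}

// tolerantStart models the proposed fix: log the error (printed here)
// and keep going, so a temporarily unreachable Kafka broker does not
// kill the collector.
func tolerantStart(scrapers []Scraper) {
	for _, s := range scrapers {
		if err := s.Start(); err != nil {
			fmt.Printf("warn: scraper %s failed to start, will retry later: %v\n", s.Name(), err)
		}
	}
}

// brokerScraper is a toy scraper whose Start fails when Kafka is down.
type brokerScraper struct{ reachable bool }

func (b brokerScraper) Name() string { return "brokers" }
func (b brokerScraper) Start() error {
	if !b.reachable {
		return errors.New("kafka: client has run out of available brokers")
	}
	return nil
}

func main() {
	down := []Scraper{brokerScraper{reachable: false}}
	fmt.Println(failFastStart(down) != nil) // current behavior: startup error (prints true)
	tolerantStart(down)                     // proposed: warn and continue
}
```

The difference between the two proposed fixes is essentially where `tolerantStart`'s loop would live: inside each receiver, or once in the shared scraper controller.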
Sorry for missing this notification. Thanks @dmitryax for reopening!
This was addressed in #8817.
Describe the bug
If Kafka is not available when the kafka metrics receiver attempts to start, the collector fails to start.
Steps to reproduce
Configure the kafka metrics receiver and start the collector without a running Kafka instance. Alternatively, you can use this docker-compose example.
If using the example, do the following:
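For reference, a minimal collector configuration that reproduces the crash might look like the following. The broker address is illustrative, and the keys are based on the receiver's documented options for this era of the collector (the `logging` exporter existed in v0.46.0):

```yaml
receivers:
  kafkametrics:
    brokers: [localhost:9092]   # no broker listening here -> startup fails
    protocol_version: 2.0.0
    scrapers:
      - brokers
      - topics
      - consumers

exporters:
  logging:

service:
  pipelines:
    metrics:
      receivers: [kafkametrics]
      exporters: [logging]
```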
What did you expect to see?
I expected a warning at a bare minimum; ideally, the receiver would retry the connection with a backoff strategy.
What did you see instead?
The collector exits with an error. Specifically, I saw this:
What version did you use?
v0.46.0
What config did you use?
Environment
The "official" collector contrib docker image