Network failures (?) cause *excessive* logging #2628
Comments
This is the problem described in elastic/elastic-agent-client#66. We have seen this happen rarely but haven't isolated what causes it (besides knowing what causes the excessive logging).
The logs here indicate the Beat subprocesses are failing their checkin/health check back to the Elastic Agent parent process because of a timeout dialing the agent's gRPC server on port 6789.
Upgraded Elasticsearch, Kibana and elastic-agent from 8.7.0 to 8.7.1 and got the same errors in the elastic-agent logs:
Agent is also failing to send any message to Elasticsearch.
Restarting the agent fixed this in one of the other reports of this problem we've had.
We have an internal support case for this problem now where we've been able to obtain a stack trace of the agent when this is happening. So far it looks like a deadlock in the agent; we are still trying to determine exactly where.
Issue
I noticed I wasn't getting any data from my elastic-agent deployment in my k8s cluster (monitoring GKE, GCP infra, billing, logs from GCP Pub/Sub). When I looked at the logs, they were full of messages:
I manually restarted the deployment and it all went back to normal. Unfortunately I cannot provide more details because I had to act quickly: the agent was spamming more than 17k (!!) messages like these into the log every second, so I don't have any other logs.
For confirmed bugs, please report:
Binary: 8.7.0 (build: fc4a15b8a56c2ac2d7b878a706937c60e93816f9 at 2023-03-28 04:02:11 +0000 UTC)
Definition of done