[dev.icinga.com #9295] 100 Million Records in Log file: Read from event FD failed #3029
Comments
Updated by TechIsCool on 2015-05-19 23:50:16 +00:00: The server rebooted and this was cleared, so it's not a production issue for me. I just want it tracked as a bug, since logs should not consume the whole drive.
Updated by mfriedrich on 2015-06-18 11:47:35 +00:00
Updated by TechIsCool on 2015-08-19 06:47:52 +00:00: Consumed another 20 GB on a different host, so this is still a problem: 242,603,597 critical/SocketEvents: Read from event FD failed.
Updated by TechIsCool on 2015-08-19 06:48:26 +00:00: Both are still Windows hosts, and the one mentioned is running the latest version.
Updated by gbeutner on 2015-11-14 18:19:16 +00:00
Updated by rafael.voss on 2016-02-25 09:03:38 +00:00: This bug appears on my Windows server after the update from 2.4.1 to 2.4.3, when I start "icinga2.exe daemon" from the command line and kill it with Ctrl+C. It never happened on 2.4.1.
Updated by ZianAtFirstWatch on 2016-07-26 22:51:06 +00:00: I also experienced this problem on two Windows Server 2012 R2 hosts. They both have the Icinga 2 version 2.4.8 client installed. The main Icinga 2 instance runs on a Debian server. My log file says something like this:
The log is named icinga02.log and I found it at C:\ProgramData\icinga2\var\log\icinga2\icinga2.log. I tried restarting the Icinga service on both Windows computers and the problem went away on one of them. After restarting the remaining computer, the problem no longer recurred.
This problem also occurred on some of my Windows Server 2008 R2 hosts with the icinga2 agent version 2.8.0 installed: [2018-08-13 09:13:38 +0200] critical/SocketEvents: Read from event FD failed. One log was about 7 GB, and on another server it was about 13 GB. As a result, the C: partition ran out of disk space.
I could silence the logging, but this is actually a real error when the FD is gone inside the socket IO thread. I haven't found out yet why this only happens on Windows with
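To illustrate the failure mode described above, here is a minimal sketch, not the actual Icinga 2 implementation, of a socket event thread that wakes up via a self-pipe style event FD. The function name and the POSIX poll()/read() calls are illustrative assumptions (on Windows the wake-up channel would typically be a socket pair serviced with recv()). The point is that once the wake-up FD becomes invalid, the read fails on every loop iteration and the same critical message is written each time, so the log grows without bound.

```cpp
// Minimal sketch of a self-pipe style socket event thread -- NOT the actual
// Icinga 2 code. Shows how a dead wake-up FD turns into millions of
// identical "Read from event FD failed." lines.
#include <poll.h>
#include <unistd.h>
#include <cerrno>
#include <iostream>

void EventThreadProc(int wakeupFd)
{
    struct pollfd pfd;
    pfd.fd = wakeupFd;
    pfd.events = POLLIN;

    for (;;) {
        if (poll(&pfd, 1, -1) < 0)
            continue;

        // POLLNVAL is reported when the FD is no longer valid.
        if (pfd.revents & (POLLIN | POLLERR | POLLHUP | POLLNVAL)) {
            char buffer[512];

            if (read(wakeupFd, buffer, sizeof(buffer)) < 0) {
                // Emitted on every iteration while the FD stays broken,
                // which is what floods the log file.
                std::cerr << "critical/SocketEvents: Read from event FD failed."
                          << " (errno " << errno << ")\n";
                // A defensive fix would stop servicing the dead FD here,
                // e.g. break out and re-create the wake-up pipe, or at
                // least rate-limit the message.
            }
        }
    }
}
```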
Hello @t-rex2! Do I understand you right that your Windows agent produces gigabytes of log, with most messages being the ones you posted? Best,
Hi Al2Klimov, yes that's right. Kind regards
This issue seems to have been addressed by #7005.
Hi @t-rex2, snapshot packages for Windows are available for testing at https://packages.icinga.com. Thanks,
I consider this resolved. Please test the snapshot packages either way prior to the release, to avoid running into any other pitfalls.
This issue has been migrated from Redmine: https://dev.icinga.com/issues/9295
Created by TechIsCool on 2015-05-19 23:49:12 +00:00
Assignee: (none)
Status: New
Target Version: (none)
Last Update: 2016-07-26 22:51:06 +00:00 (in Redmine)
So today I had a virtual NFS disk lag due to external issues not related to Icinga. On one of my Windows servers I experienced an issue where I almost ran out of disk space. Icinga notified me correctly, but it was Icinga itself that was consuming all the space with its log file.
I have a copy of the data, but I don't think anyone wants the 10+ GB files, so I have trimmed it down to just the errors that occurred, with their counts. If you want to see a section, let me know and I will pull it from the file.
This is from icinga2.log, not debug.log, which is where I would have expected to see a file of this size.
2 critical/TcpSocket: getaddrinfo() failed with error code 11001, "No such host is known. "
2 information/Application: Received request to shut down.
2 information/Application: Shutting down...
2 information/ConfigItem: Activated all objects.
3 critical/ApiListener: Cannot connect to host 'abydos.domainname.com' on port '5665'
3 information/ApiClient: No messages for identity 'vault.domainname.com' have been received in the last 60 seconds.
8 critical/ThreadPool: Exception thrown in event handler:
8 'C:\Program Files (x86)\ICINGA2\var/lib/icinga2/icinga2.state.tmp' failed with error code 13, 'Permission denied'
9 warning/ApiClient: Error while sending JSON-RPC message for identity 'vault.domainname.com'
11 critical/TcpSocket: Invalid socket: 10061, "No connection could be made because the target machine actively refused it."
11 warning/ApiListener: Removing API client for endpoint 'vault.domainname.com'. 0 API clients left.
12 critical/ApiListener: Cannot connect to host 'vault.domainname.com' on port '5665'
12 information/ApiListener: New client connection for identity 'vault.domainname.com'
12 warning/ApiClient: API client disconnected for identity 'vault.domainname.com'
14 information/ApiClient: Not sending heartbeat for endpoint 'vault.domainname.com' because we're replaying the log for it.
19 warning/ApiClient: API client disconnected for identity 'abydos.domainname.com'
19 warning/ApiClient: Error while sending JSON-RPC message for identity 'abydos.domainname.com'
19 warning/ApiListener: Removing API client for endpoint 'abydos.domainname.com'. 0 API clients left.
23 information/ApiClient: Reconnecting to API endpoint 'vault.domainname.com' via host 'vault.domainname.com' and port '5665'
24 information/ApiListener: New client connection for identity 'abydos.domainname.com'
25 information/ApiClient: Reconnecting to API endpoint 'abydos.domainname.com' via host 'abydos.domainname.com' and port '5665'
825 information/ApiClient: Not sending heartbeat for endpoint 'abydos.domainname.com' because we're replaying the log for it.
2,178 information/DynamicObject: Dumping program state to file 'C:\Program Files (x86)\ICINGA2\var/lib/icinga2/icinga2.state'
108,616,317 critical/SocketEvents: Read from event FD failed.
This log file starts on 2015-05-12 around 2 PM and ends on 2015-05-19 at 3 PM.
The error that appears continually started on 2015-05-19 at 11:45 and ends on 2015-05-19 at 3 PM.
So in just about 3 hours, around 10 GB were consumed.
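A quick back-of-the-envelope check of those figures; this is only a sketch using the values reported above (the ~3.25 hour window, the 10 GB size, and the 108,616,317 line count are taken from this report, not measured independently):

```cpp
// Back-of-the-envelope check of the reported figures (assumed values from
// this report: ~108.6 million identical lines, roughly 10 GB, in about
// 3.25 hours between 11:45 and 15:00).
#include <iostream>

int main()
{
    const double lines = 108616317.0;                 // repeated message count
    const double bytes = 10.0 * 1024 * 1024 * 1024;   // ~10 GB of log
    const double seconds = 3.25 * 3600;               // ~3 hours 15 minutes

    std::cout << "bytes per line:   " << bytes / lines   << "\n"; // ~99
    std::cout << "lines per second: " << lines / seconds << "\n"; // ~9,300
    return 0;
}
```

Roughly 99 bytes per line and on the order of 9,000 messages per second is consistent with the error being emitted on every pass of a tight loop rather than once per failed connection.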
Relations: