loki.process of logs from loki.source.windowsevent with event ID 4364 lead to crash #2616
Comments
based on this investigation: |
Can you confirm that without the loki.process component (with loki.source.windowsevent connected directly to the loki.relabel component) there is no crash? |
Hello @wildum, I have now removed all metrics collection, because whenever Alloy crashed it caused malformed labels and metrics. This is what I use now. To collect the Windows event log:
And this for the windows_event forwarded logs:
The last step is loki.write.loki.
I STILL have these crashes. We have several computers (AD servers) that forward their events to my server, so some logs may arrive out of order. That's why I asked you in another post how Alloy handles this. Loki itself should ingest them if out-of-order ingestion and ingestion of old logs are enabled. As a test, I restored my default config with all metrics, logs and relabeling, BUT I removed all servers except one from my subscription. With only one server, the logs should arrive in the correct order, which tests whether the unordered logs could have caused these problems. @l-freund
---- edit 2025-02-05 22:56 -----
--- edit 2025-02-05 23:47 ---------
|
Hello! My environment looks as follows:
Especially during setup, timestamps were mixed up because the machines didn't all apply the policy at the same time and started pushing logs one by one. Also, Windows clients are sometimes offline (e.g. notebooks out of the office) and push their logs as soon as they have connectivity to our network again. In my experience, Alloy does not get confused by this and ingests the logs as they arrive at the collector. When use_incoming_timestamp is set, the timestamp is taken from the log entry, so Loki provides the logs in chronological order, not in the order they came in. |
Hi, I found something interesting: influxdata/telegraf#12328. That's a Telegraf issue, but in Alloy we use code from Promtail, which forked the win_eventlog code from Telegraf. The crash described in the issue looks a lot like what you encountered, and it seems to have been solved by these changes: influxdata/telegraf#12375. "The root cause is that Windows' EvtFormatMessage syscall is expecting a handle to the publisher (i.e. the machine that sent the event) which is becoming invalid if that publisher is down. As a consequence Windows throws an exception (read Golang panic) instead of returning a simple error." This correlates with what I saw in the crash logs that you sent me. The code in Telegraf and Alloy is now quite different, but I will try to incorporate the same changes and build a version that you can test. (I don't have the setup to replicate it, though.) |
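The general idea of "return a simple error instead of panicking" can be sketched in Go as below. This is only an illustration, not the Telegraf change and not Alloy's code: `formatMessage` is a hypothetical stand-in that panics on an invalid handle, and `safeFormatMessage` is a made-up wrapper name. A fault raised inside a real Windows syscall may not be recoverable with `recover`, in which case validating the publisher handle before the call would be the safer route.

```go
package main

import "fmt"

// formatMessage is a hypothetical stand-in for the code path that renders
// an event message. It panics on an invalid publisher handle to imitate
// the failure mode quoted above; it does not call any Windows API.
func formatMessage(publisherHandle uintptr, eventID int) string {
	if publisherHandle == 0 {
		panic("invalid publisher handle")
	}
	return fmt.Sprintf("rendered message for event %d", eventID)
}

// safeFormatMessage (hypothetical name) converts a panic raised inside
// formatMessage into an ordinary error so the caller can skip the event
// instead of crashing the whole process.
func safeFormatMessage(publisherHandle uintptr, eventID int) (msg string, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("formatting event %d failed: %v", eventID, r)
		}
	}()
	return formatMessage(publisherHandle, eventID), nil
}

func main() {
	// Handle 1 simulates a live publisher, handle 0 a publisher that is down.
	for _, handle := range []uintptr{1, 0} {
		msg, err := safeFormatMessage(handle, 4364)
		if err != nil {
			fmt.Println("skipping message rendering:", err)
			continue
		}
		fmt.Println(msg)
	}
}
```

With a wrapper like this, an event whose message cannot be rendered is skipped or forwarded without its formatted text, and the collector keeps running.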
Hello, 12 hours later: no crash of Alloy. I am forwarding all log channels and additional channels to my ForwardedEvents log. The volume is probably more than 20 million logs per 12 hours. I will monitor for some more time, collect pprof profiles from time to time, and share them. PS: eventmessage= could be from my loki.process step. I will let you know later. |
Hello, the version has been working for more than 24 hours. No crash. |
What's wrong?
Hello,
I am using Alloy 1.5.1 and the 1.7.0 preview.
Both versions crash in my environment on Windows Server 2022.
I receive ForwardedEvents on this server and collect them with Alloy's loki.source.windowsevent.
I collect the local logs, too.
There is at least one processing problem with events of ID 4364.
These events lead to a crash of Alloy, no matter whether they are in the local "Security" log or in "ForwardedEvents".
Here is my loki.process
and as an attachment an anonymized event log:
4634.txt
Steps to reproduce
Try to process this type of message using my loki.process stage with ForwardedEvents logs.
System information
Windows Server 2022
Software version
alloy 1.5.1 and alloy 1.7.0-preview
Configuration
Logs