Logstash Exporter keeps crashing #314

Open
luisfelipegarcia opened this issue Mar 12, 2024 · 13 comments
Labels
bug Something isn't working

Comments

@luisfelipegarcia

Description of the Issue

Running v1.6.3 (ARM)
Logstash 8.12.0
Ubuntu 22.04

I have noticed that logstash exporter keeps crashing. The logs show a "fatal error: schedule: holding locks" message.
Restarting the service brings the application back up, but it eventually runs into the same issue.

Version of logstash-exporter, or logstash-exporter Image

v1.6.3

Version of Chart (if applicable)

No response

Operating System/Environment

Ubuntu 22.04

Logs

Mar 05 23:55:55 ip-172-31-0-158 logstash-exporter[1360]: time=2024-03-05T23:55:55.645Z level=INFO msg="starting server on" host="" port=9>
Mar 06 00:56:16 ip-172-31-0-158 logstash-exporter[1360]: fatal error: schedule: holding locks
Mar 06 00:56:16 ip-172-31-0-158 logstash-exporter[1360]: panic during panic
Mar 06 00:56:16 ip-172-31-0-158 logstash-exporter[1360]: runtime stack:
Mar 06 00:56:16 ip-172-31-0-158 logstash-exporter[1360]: runtime.throw({0x49a888, 0x17})
Mar 06 00:56:16 ip-172-31-0-158 logstash-exporter[1360]:         /opt/hostedtoolcache/go/1.22.0/x64/src/runtime/panic.go:1023 +0x4c fp=0x>
Mar 06 00:56:16 ip-172-31-0-158 logstash-exporter[1360]: runtime.schedule()
Mar 06 00:56:16 ip-172-31-0-158 logstash-exporter[1360]:         /opt/hostedtoolcache/go/1.22.0/x64/src/runtime/proc.go:3843 +0x2fc fp=0x>
Mar 06 00:56:16 ip-172-31-0-158 logstash-exporter[1360]: runtime.goschedImpl(0x1484c68, 0x1)
Mar 06 00:56:16 ip-172-31-0-158 logstash-exporter[1360]:         /opt/hostedtoolcache/go/1.22.0/x64/src/runtime/proc.go:4065 +0x198 fp=0x>
Mar 06 00:56:16 ip-172-31-0-158 logstash-exporter[1360]: runtime.gopreempt_m(...)
Mar 06 00:56:16 ip-172-31-0-158 logstash-exporter[1360]:         /opt/hostedtoolcache/go/1.22.0/x64/src/runtime/proc.go:4082
Mar 06 00:56:16 ip-172-31-0-158 logstash-exporter[1360]: runtime.newstack()
Mar 06 00:56:16 ip-172-31-0-158 logstash-exporter[1360]:         /opt/hostedtoolcache/go/1.22.0/x64/src/runtime/stack.go:1070 +0x3b0 fp=0>
Mar 06 00:56:16 ip-172-31-0-158 logstash-exporter[1360]: runtime.morestack()
Mar 06 00:56:16 ip-172-31-0-158 logstash-exporter[1360]:         /opt/hostedtoolcache/go/1.22.0/x64/src/runtime/asm_arm.s:383 +0x60 fp=0x>
Mar 06 00:56:16 ip-172-31-0-158 logstash-exporter[1360]: goroutine 1 gp=0x1402128 m=nil [IO wait, 59 minutes]:
Mar 06 00:56:16 ip-172-31-0-158 logstash-exporter[1360]: runtime.gopark(0x4b7118, 0xf7870f08, 0x2, 0x2, 0x5)
Mar 06 00:56:16 ip-172-31-0-158 logstash-exporter[1360]:         /opt/hostedtoolcache/go/1.22.0/x64/src/runtime/proc.go:402 +0x104 fp=0x1>
Mar 06 00:56:16 ip-172-31-0-158 logstash-exporter[1360]: runtime.netpollblock(0xf7870ef8, 0x72, 0x0)
Mar 06 00:56:16 ip-172-31-0-158 logstash-exporter[1360]:         /opt/hostedtoolcache/go/1.22.0/x64/src/runtime/netpoll.go:573 +0x100 fp=>
Mar 06 00:56:16 ip-172-31-0-158 logstash-exporter[1360]: internal/poll.runtime_pollWait(0xf7870ef8, 0x72)
Mar 06 00:56:16 ip-172-31-0-158 logstash-exporter[1360]:         /opt/hostedtoolcache/go/1.22.0/x64/src/runtime/netpoll.go:345 +0x54 fp=0>
Mar 06 00:56:16 ip-172-31-0-158 logstash-exporter[1360]: internal/poll.(*pollDesc).wait(0x14181a8, 0x72, 0x0)
Mar 06 00:56:16 ip-172-31-0-158 logstash-exporter[1360]:         /opt/hostedtoolcache/go/1.22.0/x64/src/internal/poll/fd_poll_runtime.go:>
Mar 06 00:56:16 ip-172-31-0-158 logstash-exporter[1360]: internal/poll.(*pollDesc).waitRead(...)
Mar 06 00:56:16 ip-172-31-0-158 logstash-exporter[1360]:         /opt/hostedtoolcache/go/1.22.0/x64/src/internal/poll/fd_poll_runtime.go:>
Mar 06 00:56:16 ip-172-31-0-158 logstash-exporter[1360]: internal/poll.(*FD).Accept(0x1418190)
Mar 06 00:56:16 ip-172-31-0-158 logstash-exporter[1360]:         /opt/hostedtoolcache/go/1.22.0/x64/src/internal/poll/fd_unix.go:611 +0x2>
@luisfelipegarcia luisfelipegarcia added the bug Something isn't working label Mar 12, 2024
@kuskoman
Owner

hello @luisfelipegarcia
are you able to provide any kind of example replicating this behaviour?
since the error is concurrency-related it may be very hard to debug

@luisfelipegarcia
Author

@kuskoman , I don't have any example to replicate it. It just dies for some reason. This is on a low volume instance of logstash at that. Are there any other logs I can provide to help pinpoint the issue?

@kuskoman
Owner

@luisfelipegarcia i wonder if this error is more likely to happen in the arm linux build. i will try to test that

for now, since it does not seem like an easy fix it may be open for a while.
could you please check if the error is the same when using the latest prerelease?

@luisfelipegarcia
Author

I can do that. I'll test and post an update here.

@luisfelipegarcia
Author

@kuskoman , I tried running the latest prerelease but am running into the same issue as #300

time=2024-03-13T19:14:28.458Z level=ERROR msg="executor failed" name=nodestats duration=55.712598ms err="json: cannot unmarshal number 17179869184 into Go struct field .jvm.mem.heap_committed_in_bytes of type int"
time=2024-03-13T19:15:28.456Z level=ERROR msg="executor failed" name=nodestats duration=54.499492ms err="json: cannot unmarshal number 17179869184 into Go struct field .jvm.mem.heap_committed_in_bytes of type int"
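
For readers following along: this is the error encoding/json returns when a JSON number does not fit the target field. Go's `int` is 32 bits wide on 32-bit builds such as linux/arm, and 17179869184 (2^34) is larger than the 32-bit maximum of 2147483647. Below is a minimal sketch of that behaviour. The struct names are hypothetical and are not the exporter's real types, and `int32` is used in place of `int` so the failure reproduces on any platform.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// memStats32 is a hypothetical struct, not the exporter's real type.
// int32 stands in for `int` as it behaves on a 32-bit build (e.g. linux/arm).
type memStats32 struct {
	HeapCommittedInBytes int32 `json:"heap_committed_in_bytes"`
}

// memStats64 is the same hypothetical struct with an explicit 64-bit field.
type memStats64 struct {
	HeapCommittedInBytes int64 `json:"heap_committed_in_bytes"`
}

// decode32 reports whether decoding into the 32-bit struct failed
// with a *json.UnmarshalTypeError, and returns the error itself.
func decode32(payload []byte) (bool, error) {
	var s memStats32
	err := json.Unmarshal(payload, &s)
	var typeErr *json.UnmarshalTypeError
	return errors.As(err, &typeErr), err
}

// decode64 decodes the same payload into the 64-bit struct.
func decode64(payload []byte) (int64, error) {
	var s memStats64
	err := json.Unmarshal(payload, &s)
	return s.HeapCommittedInBytes, err
}

func main() {
	// Value taken from the log line above.
	payload := []byte(`{"heap_committed_in_bytes": 17179869184}`)

	isTypeErr, err := decode32(payload)
	fmt.Println("int32 field: type error =", isTypeErr, "err =", err)

	v, err := decode64(payload)
	fmt.Println("int64 field: value =", v, "err =", err)
}
```

Declaring such fields as `int64` rather than `int` makes the width independent of the build architecture.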

@kuskoman
Owner

@luisfelipegarcia seems like i missed it in v2. i will fix v2 and tell you to check it again later on

@luisfelipegarcia
Author

@kuskoman , any update on this?

@kuskoman
Owner

Thanks for the reminder, I created a PR to fix the datatypes
I decided to drop handling uint values, because some values may hold -1 and I don't want to check each and every one
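
The trade-off mentioned here can be sketched roughly as follows: encoding/json refuses to decode a negative number into an unsigned field, while a signed 64-bit field accepts both -1 and the large counters seen in this thread. The struct and function names are hypothetical, and which Logstash fields can actually hold -1 is not stated in this thread.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// signedStat and unsignedStat are hypothetical types for illustration only.
type signedStat struct {
	Value int64 `json:"value"`
}

type unsignedStat struct {
	Value uint64 `json:"value"`
}

// decodeSigned decodes {"value": N} into an int64 field.
func decodeSigned(payload string) (int64, error) {
	var s signedStat
	err := json.Unmarshal([]byte(payload), &s)
	return s.Value, err
}

// decodeUnsigned decodes {"value": N} into a uint64 field.
func decodeUnsigned(payload string) (uint64, error) {
	var s unsignedStat
	err := json.Unmarshal([]byte(payload), &s)
	return s.Value, err
}

func main() {
	for _, payload := range []string{`{"value": -1}`, `{"value": 8035345869}`} {
		sv, serr := decodeSigned(payload)
		uv, uerr := decodeUnsigned(payload)
		fmt.Println(payload, "int64:", sv, serr, "uint64:", uv, uerr)
	}
}
```

Using `int64` everywhere gives up the upper half of the unsigned range, but avoids having to decide field by field whether a negative sentinel can appear.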

@kuskoman
Owner

@luisfelipegarcia check v2.0.0-pre6

@luisfelipegarcia
Author

I just tested v2.0.0-pre6. Unfortunately I am still getting the "cannot unmarshal number..." errors

time=2024-03-19T18:34:47.192Z level=ERROR msg="executor failed" name=nodestats duration=198.623779ms err="json: cannot unmarshal number 2923200586 into Go struct field .pipelines.events.out of type int"
time=2024-03-19T18:35:47.299Z level=ERROR msg="executor failed" name=nodestats duration=306.281444ms err="json: cannot unmarshal number 2924965461 into Go struct field .pipelines.events.out of type int"
time=2024-03-19T18:36:47.150Z level=ERROR msg="executor failed" name=nodestats duration=157.919356ms err="json: cannot unmarshal number 2926774586 into Go struct field .pipelines.events.out of type int"
time=2024-03-19T18:37:47.169Z level=ERROR msg="executor failed" name=nodestats duration=176.531401ms err="json: cannot unmarshal number 2928573836 into Go struct field .pipelines.events.out of type int"

@kuskoman
Owner

@luisfelipegarcia well, i did not expect this particular stat to be that big
could you check v2.0.0-pre7?

@luisfelipegarcia
Author

@kuskoman , same issue: "cannot unmarshal...", with pre7

...# ./logstash-exporter-linux-arm --config config.yml
time=2024-03-21T15:31:41.329Z level=WARN msg="failed to load .env file" error="open .env: no such file or directory"
time=2024-03-21T15:31:41.330Z level=INFO msg="Version: unknown, SemanticVersion: unknown, GitCommit: unknown, GoVersion: go1.22.1, BuildArch: arm, BuildOS: linux, BuildDate: unknown"
time=2024-03-21T15:31:41.330Z level=INFO msg="starting server on" host=0.0.0.0 port=9198
time=2024-03-21T15:31:47.071Z level=ERROR msg="executor failed" name=nodestats duration=76.210261ms err="json: cannot unmarshal number 8035345869 into Go struct field .pipelines.plugins.inputs.events.out of type int"
time=2024-03-21T15:32:47.072Z level=ERROR msg="executor failed" name=nodestats duration=78.437605ms err="json: cannot unmarshal number 8037152956 into Go struct field .pipelines.plugins.inputs.events.out of type int"
time=2024-03-21T15:33:47.134Z level=ERROR msg="executor failed" name=nodestats duration=140.536317ms err="json: cannot unmarshal number 8039006157 into Go struct field .pipelines.plugins.inputs.events.out of type int"

@kuskoman
Owner

@luisfelipegarcia would you be able to provide a censored dump from the nodestats endpoint of logstash, so I can see which metrics can potentially grow fast enough to overflow a 32-bit integer?
