fluent-bit can sometimes hang indefinitely on read(...)/getaddrinfo(...) #6329

Closed
ShelbyZ opened this issue Nov 2, 2022 · 2 comments

ShelbyZ commented Nov 2, 2022

Bug Report

Describe the bug
fluent-bit can sometimes hang indefinitely on read(...)/getaddrinfo(...)

Steps to reproduce the problem:

  • Use firelens-datajet to simulate input, sending data to 4 cloudwatch_logs output plugins
  • Build a debug version of fluent-bit and enable core dumps
  • Wait anywhere from 30-180 min
  • Observe that CloudWatch logs are no longer received
  • docker exec into the container and confirm that at least 1 thread is waiting on read(...) or getaddrinfo(...)
  • Re-check CloudWatch to confirm that no further logs arrive
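
For the thread inspection step, a rough sketch (not part of the original report) of a first check without a debugger: on Linux, the per-thread `stat` files under `/proc/<pid>/task/` expose each thread's scheduler state, and a thread parked in a blocking call normally shows `S` (interruptible sleep). The helper names below are hypothetical. The script only shows that threads are sleeping, not which call they are in, so a backtrace is still needed to confirm read(...) or getaddrinfo(...).

```python
from __future__ import annotations

import sys
from pathlib import Path


def parse_stat_line(line: str) -> tuple[str, str]:
    """Return (thread name, state letter) from a /proc/<pid>/task/<tid>/stat line.

    The name is wrapped in parentheses and may itself contain spaces or
    parentheses, so locate the LAST ')' instead of splitting on whitespace.
    """
    open_paren = line.index("(")
    close_paren = line.rindex(")")
    name = line[open_paren + 1:close_paren]
    state = line[close_paren + 1:].split()[0]
    return name, state


def thread_states(pid: int) -> dict[int, tuple[str, str]]:
    """Map each thread ID of `pid` to (name, state). Linux only."""
    states = {}
    for task in Path(f"/proc/{pid}/task").iterdir():
        try:
            states[int(task.name)] = parse_stat_line((task / "stat").read_text())
        except (FileNotFoundError, ProcessLookupError):
            continue  # the thread exited while we were iterating
    return states


if __name__ == "__main__":
    # Usage (hypothetical script name): python thread_states.py <pid>
    for tid, (name, state) in sorted(thread_states(int(sys.argv[1])).items()):
        print(tid, name, state)
```

Running it twice a few minutes apart and seeing the same output-plugin threads still in `S` is only a hint of a hang; idle threads look the same.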

Expected behavior
Network calls should time out or be closed on error, allowing the application to proceed/resume.
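
To illustrate the expected behavior, here is a minimal sketch of a socket read that gives up after a deadline instead of blocking forever. It is written in Python for brevity and does not reflect how fluent-bit (a C program) implements its I/O. The function name `recv_with_timeout` is hypothetical.

```python
import socket


def recv_with_timeout(sock: socket.socket, nbytes: int, timeout_s: float):
    """Return the received bytes, or None if nothing arrived within timeout_s."""
    sock.settimeout(timeout_s)  # bound every blocking operation on this socket
    try:
        return sock.recv(nbytes)
    except socket.timeout:
        # The peer stayed silent: report it so the caller can close the
        # connection and retry rather than waiting indefinitely.
        return None


if __name__ == "__main__":
    a, b = socket.socketpair()
    print(recv_with_timeout(a, 1024, 0.2))  # peer has sent nothing yet
    b.sendall(b"ok")
    print(recv_with_timeout(a, 1024, 0.2))  # data is now available
    a.close()
    b.close()
```

The point is that the caller always regains control: a silent peer produces a timeout result that can be handled, rather than a thread that never returns.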

Screenshots
N/A

Your Environment

  • Version used: 2.0.2 (also tested 1.8.15, 1.9.8)
  • Configuration:
    fb-hang.conf
    firelens-datajet.json
  • Environment name and version (e.g. Kubernetes? What version?): Docker
  • Server type and version: EC2
  • Operating System and version:
    x86_64 amzn2-ami-kernel-5.10-hvm-2.0.20221004.0-x86_64-gp2
    ami-0d593311db5abb72b
    arm64 amzn2-ami-kernel-5.10-hvm-2.0.20221004.0-arm64-gp2
    MacOSX - Darwin Kernel Version 21.6.0
    ami-0efabcf945ffd8831
  • Filters and plugins:
    tcp input plugin
    cloudwatch_logs output plugin

Additional context
arm64

  • backtrace1 - Thread4/6/7 blocked on read(...) with SSL_shutdown(...) higher on the stack
  • backtrace2 - Thread5/6 blocked on read(...) with SSL_shutdown(...) higher on the stack, Thread4/Thread7 blocked on getaddrinfo(...)

x86_64

  • backtrace1 - Thread4 blocked on read(...) with SSL_shutdown(...) higher on the stack

@github-actions (Contributor)

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days. Maintainers can add the exempt-stale label.

github-actions bot added the Stale label Jan 31, 2023

github-actions bot commented Feb 5, 2023

This issue was closed because it has been stalled for 5 days with no activity.

github-actions bot closed this as not planned (won't fix, can't repro, duplicate, stale) Feb 5, 2023