
aws-for-fluent-bit runs into endless "mem buf overlimit" after Kubernetes Upgrade 1.24 #863

Open
VictorW96 opened this issue Dec 19, 2022 · 1 comment · May be fixed by #1168
Labels
bug Something isn't working

Comments

@VictorW96

Describe the bug
aws-for-fluent-bit runs into an endless "mem buf overlimit" loop after upgrading Kubernetes from 1.21 to 1.24.

[2022/12/19 07:54:37] [debug] [input:tail:tail.0] inode=32524322 events: IN_MODIFY
[2022/12/19 07:54:37] [ warn] [input] tail.0 paused (mem buf overlimit) 
[2022/12/19 07:54:38] [debug] [input:tail:tail.0] inode=32524322 events: IN_MODIFY

Steps to reproduce

  • Install chart version 0.1.11 in EKS 1.21. This worked for me.
  • Upgrade the Kubernetes version.
  • Try to upgrade to a chart version >0.1.15 on EKS >1.22 (earlier chart versions won't work because of old Kubernetes APIs); a sketch of the upgrade command follows the log below.
  • See the error:
[2022/12/19 07:54:37] [debug] [input:tail:tail.0] inode=32524322 events: IN_MODIFY
[2022/12/19 07:54:37] [ warn] [input] tail.0 paused (mem buf overlimit) 
[2022/12/19 07:54:38] [debug] [input:tail:tail.0] inode=32524322 events: IN_MODIFY
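
For reference, a minimal sketch of the upgrade in the last step, assuming the chart comes from the eks-charts Helm repository and the custom values shown under "Additional Context" live in values.yaml:

    # add the AWS eks-charts repo if not already present
    helm repo add eks https://aws.github.io/eks-charts
    helm upgrade --install aws-for-fluent-bit eks/aws-for-fluent-bit \
      --namespace kube-system \
      --version 0.1.15 \
      -f values.yaml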

Expected outcome
Fluent Bit should send logs to CloudWatch as it did before the upgrade.
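
A "mem buf overlimit" pause usually means the output side is not flushing fast enough (or is failing outright), so the CloudWatch output and its IAM permissions are worth checking alongside the input. For context, a generic cloudwatch_logs output of the kind the chart renders; the region, group, and stream names below are placeholders, not the chart's actual defaults:

    [OUTPUT]
        # placeholder names; the chart fills these in from its values
        Name                cloudwatch_logs
        Match               application.*
        region              eu-central-1
        log_group_name      /aws/eks/my-cluster/application
        log_stream_prefix   from-fluent-bit-
        auto_create_group   On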

Environment

  • Chart name: aws-for-fluent-bit
  • Chart version: 0.1.15
  • Kubernetes version: 1.24
  • Using EKS (yes/no), if so version? yes

Additional Context:

I already tried to increase the mem buf limit (Mem_Buf_Limit) as described in fluent/fluent-bit#1903:

    resources:
      requests:
        cpu: "250m"
        memory: "1000Mi"
      limits:
        cpu: "250m"
        memory: "1000Mi"
    env:
      - name: FLB_LOG_LEVEL
        value: "debug"
      - name: HOST_NAME
        valueFrom:
          fieldRef:
            fieldPath: spec.nodeName
    extraInputs: |
      [INPUT]
          Name                tail
          Tag                 application.*
          Exclude_Path        /var/log/containers/cloudwatch-agent*, /var/log/containers/aws-for-fluent-bit*, /var/log/containers/aws-node*, /var/log/containers/kube-proxy*
          Path                /var/log/containers/*.log
          Docker_Mode         On
          Docker_Mode_Flush   5
          Docker_Mode_Parser  container_firstline
          Parser              docker
          DB                  /var/fluent-bit/state/flb_container.db
          Mem_Buf_Limit       1000MB
          Skip_Long_Lines     On
          Refresh_Interval    10
          Rotate_Wait         30
          Read_from_Head      Off

but this does not seem to change anything.
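Presumably raising Mem_Buf_Limit only postpones the pause if the output never catches up. A variant I could still try is filesystem buffering, which lets tail keep reading while chunks queue on disk; a sketch, assuming a writable storage path on the node:

    [SERVICE]
        # assumed example path for on-disk chunk storage
        storage.path              /var/fluent-bit/state/flb-storage/
        storage.sync              normal
        storage.backlog.mem_limit 5M

    [INPUT]
        Name              tail
        Tag               application.*
        Path              /var/log/containers/*.log
        DB                /var/fluent-bit/state/flb_container.db
        # buffer chunks to disk instead of pausing when memory fills up
        storage.type      filesystem
        Mem_Buf_Limit     50MB
        Skip_Long_Lines   On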

VictorW96 added the bug label on Dec 19, 2022
@VictorW96 (Author)

For anyone finding this thread: I fixed it by removing the Helm chart and using this DaemonSet deployment instead: https://github.com/aws-samples/amazon-cloudwatch-container-insights/blob/master/k8s-deployment-manifest-templates/deployment-mode/daemonset/container-insights-monitoring/fluent-bit/fluent-bit.yaml. It is the one from the official AWS documentation: https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/Container-Insights-setup-logs-FluentBit.html
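
In case it helps others, the setup flow from that AWS doc, condensed; cluster name and region are placeholders that the DaemonSet reads from the fluent-bit-cluster-info ConfigMap:

    # namespace used by the Container Insights manifests
    kubectl create namespace amazon-cloudwatch

    # the DaemonSet reads cluster name, region, and tail settings from here
    kubectl create configmap fluent-bit-cluster-info \
      --from-literal=cluster.name=my-cluster \
      --from-literal=http.server=On \
      --from-literal=http.port=2020 \
      --from-literal=read.head=Off \
      --from-literal=read.tail=On \
      --from-literal=logs.region=eu-central-1 \
      -n amazon-cloudwatch

    # deploy the Fluent Bit DaemonSet manifest linked above
    kubectl apply -f https://raw.githubusercontent.com/aws-samples/amazon-cloudwatch-container-insights/latest/k8s-deployment-manifest-templates/deployment-mode/daemonset/container-insights-monitoring/fluent-bit/fluent-bit.yaml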
