Migrate from grafana fluent-bit-plugin-loki to official fluent-bit #3435

Closed
TeddyAndrieux opened this issue Jul 1, 2021 · 0 comments · Fixed by #3709
Labels
kind:dependencies Pull requests that update a dependency file state:blocked Something prevents this from being worked on

Comments

@TeddyAndrieux
Collaborator

Component:

'logging'

Why this is needed:

The Grafana Helm chart for fluent-bit-plugin-loki has been deprecated, as fluent-bit now has a built-in Loki output plugin.

What should be done:

Migrate to the fluent-bit Helm chart from the fluent repository (https://github.com/fluent/helm-charts/tree/main/charts/fluent-bit) and use the fluent-bit image from fluent instead of the fluent-bit-plugin-loki image from Grafana.

Implementation proposal (strongly recommended):

Implementation started on this branch: https://github.com/scality/metalk8s/compare/improvement/bump-fluentbit

However, it does not work as-is: fluent-bit gets stuck and sends nothing to Loki until we restart the fluent-bit pods.

We are not 100% sure, but this seems linked to Loki not being available when fluent-bit is initially deployed, and it may be related to fluent/fluent-bit#3328.

It may need more investigation, or we may have to wait for a new fluent-bit version that fixes the issue we face here.

The fluent-bit logs show a few messages about the Loki output and then nothing: it is stuck, and no logs at all reach Loki.

2021-07-01T13:01:14.265497095Z stderr F Fluent Bit v1.7.9
2021-07-01T13:01:14.265516347Z stderr F * Copyright (C) 2019-2021 The Fluent Bit Authors
2021-07-01T13:01:14.265521254Z stderr F * Copyright (C) 2015-2018 Treasure Data
2021-07-01T13:01:14.265524531Z stderr F * Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
2021-07-01T13:01:14.265526969Z stderr F * https://fluentbit.io
2021-07-01T13:01:14.265529769Z stderr F 
2021-07-01T13:01:19.657567147Z stderr F [2021/07/01 13:01:19] [ warn] [net] getaddrinfo(host='loki-0'): Unknown error
2021-07-01T13:01:19.657607354Z stderr F [2021/07/01 13:01:19] [error] [output:loki:loki.0] no upstream connections available
2021-07-01T13:01:19.65772706Z stderr F [2021/07/01 13:01:19] [ warn] [net] getaddrinfo(host='loki-0'): Unknown error
2021-07-01T13:01:19.657761059Z stderr F [2021/07/01 13:01:19] [error] [output:loki:loki.1] no upstream connections available
2021-07-01T13:01:19.657768023Z stderr F [2021/07/01 13:01:19] [ warn] [engine] failed to flush chunk '1-1625144474.575399374.flb', retry in 10 seconds: task_id=0, input=tail.0 > output=loki.0 (out_id=0)
2021-07-01T13:01:19.657779313Z stderr F [2021/07/01 13:01:19] [ warn] [engine] failed to flush chunk '1-1625144474.297368732.flb', retry in 11 seconds: task_id=2, input=systemd.1 > output=loki.1 (out_id=1)
2021-07-01T13:01:34.609454146Z stderr F [2021/07/01 13:01:34] [ warn] [net] getaddrinfo(host='loki-0'): Unknown error
2021-07-01T13:01:34.609479541Z stderr F [2021/07/01 13:01:34] [error] [output:loki:loki.0] no upstream connections available
2021-07-01T13:01:34.609485844Z stderr F [2021/07/01 13:01:34] [ warn] [engine] chunk '1-1625144474.575399374.flb' cannot be retried: task_id=0, input=tail.0 > output=loki.0
2021-07-01T13:01:35.601963127Z stderr F [2021/07/01 13:01:35] [ warn] [net] getaddrinfo(host='loki-0'): Unknown error
2021-07-01T13:01:35.602010031Z stderr F [2021/07/01 13:01:35] [error] [output:loki:loki.1] no upstream connections available
2021-07-01T13:01:35.602017377Z stderr F [2021/07/01 13:01:35] [ warn] [engine] chunk '1-1625144474.297368732.flb' cannot be retried: task_id=2, input=systemd.1 > output=loki.1
2021-07-01T13:01:40.601264774Z stderr F [2021/07/01 13:01:40] [ warn] [net] getaddrinfo(host='loki-0'): Unknown error
2021-07-01T13:01:40.601292559Z stderr F [2021/07/01 13:01:40] [error] [output:loki:loki.0] no upstream connections available
2021-07-01T13:01:40.601307303Z stderr F [2021/07/01 13:01:40] [ warn] [engine] failed to flush chunk '1-1625144474.585288604.flb', retry in 8 seconds: task_id=1, input=tail.0 > output=loki.0 (out_id=0)
2021-07-01T13:01:41.608707363Z stderr F [2021/07/01 13:01:41] [ warn] [net] getaddrinfo(host='loki-0'): Unknown error
2021-07-01T13:01:41.608745217Z stderr F [2021/07/01 13:01:41] [error] [output:loki:loki.1] no upstream connections available
2021-07-01T13:01:41.608830107Z stderr F [2021/07/01 13:01:41] [ warn] [engine] failed to flush chunk '1-1625144474.298708935.flb', retry in 6 seconds: task_id=3, input=systemd.1 > output=loki.1 (out_id=1)

If we restart the fluent-bit pods, logs start being sent to Loki.
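The restart workaround described above could be scripted roughly as follows. This is only a sketch: it assumes fluent-bit runs as a DaemonSet named `fluent-bit` (the name is a guess, not taken from this issue), and the `restart_fluent_bit` helper and the `KUBECTL` override are hypothetical names introduced for illustration.

```shell
#!/bin/sh
# KUBECTL can be overridden (for example with "echo") to do a dry run.
KUBECTL="${KUBECTL:-kubectl}"

# Hypothetical helper: restart a DaemonSet and wait for its pods to come back.
# $1 = namespace, $2 = DaemonSet name (both depend on your deployment).
restart_fluent_bit() {
    namespace="$1"
    daemonset="$2"
    # Trigger a rolling restart, then block until it completes or times out.
    "$KUBECTL" rollout restart "daemonset/$daemonset" --namespace "$namespace" &&
    "$KUBECTL" rollout status "daemonset/$daemonset" --namespace "$namespace" --timeout=120s
}
```

Under those assumptions, it would be called as `restart_fluent_bit metalk8s-logging fluent-bit` once Loki is reachable. This only works around the symptom; it does not address the underlying retry behaviour.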

@TeddyAndrieux TeddyAndrieux added state:blocked Something prevents this from being worked on kind:dependencies Pull requests that update a dependency file labels Jul 1, 2021
TeddyAndrieux added a commit that referenced this issue Feb 21, 2022
Migrate from the deprecated Grafana fluent-bit Helm chart to the fluent-bit
Helm chart from fluent:

```
rm -rf charts/fluent-bit
helm repo add fluent https://fluent.github.io/helm-charts
helm repo update
helm fetch -d charts --untar fluent/fluent-bit
```

Change the image from `fluent-bit-plugin-loki` (Grafana repository) to
`fluent-bit` 1.8.12 (fluent repository), as the Loki output plugin is now
built-in.

Migrate the fluent-bit options file to work with the new fluent-bit Helm
template.

Rewrite the fluent-bit output configuration, as the options are not exactly
the same between the `grafana-loki` and `loki` output plugins.

We also re-enable HTTP_Server, as it seems to work properly with this
version.

Render the chart to a Salt state using:

```
./charts/render.py fluent-bit --namespace metalk8s-logging \
  charts/fluent-bit.yaml charts/fluent-bit/ \
  > salt/metalk8s/addons/logging/fluent-bit/deployed/chart.sls
```

Fixes: #3435
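
To illustrate the difference between the `grafana-loki` and `loki` outputs mentioned in the commit message above, here is a minimal sketch that writes an example output section to a scratch file. It is not the configuration used in this repository: the `Match` pattern, the `Host` value, the label, and the `OUT_FILE` path are all placeholders chosen for illustration.

```shell
#!/bin/sh
# Hypothetical scratch file; not a path used by the chart.
OUT_FILE="${OUT_FILE:-./loki-output.conf}"

cat > "$OUT_FILE" <<'EOF'
# Old Grafana Go plugin: a single Url option, labels in Prometheus syntax.
# [OUTPUT]
#     Name    grafana-loki
#     Match   *
#     Url     http://loki:3100/loki/api/v1/push
#     Labels  {job="fluent-bit"}

# Built-in plugin: separate Host/Port options, labels as key=value pairs.
[OUTPUT]
    Name    loki
    Match   *
    Host    loki
    Port    3100
    Labels  job=fluent-bit
EOF
```

The main point is that the built-in plugin takes `Host` and `Port` rather than a full push URL, and its `Labels` option uses comma-separated `key=value` pairs, so an existing `grafana-loki` section cannot simply be renamed.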
@bert-e bert-e closed this as completed in f1c0ec0 Feb 22, 2022