Metricbeat k8s pod crash on 502 from Prometheus + Linkerd #18897

Closed
psoderberg opened this issue Jun 2, 2020 · 4 comments · Fixed by #19103
Labels: bug, Metricbeat, Team:Platforms

Comments

@psoderberg

I have a fairly built-out k8s POC environment on 7.7.0, using the Beats DaemonSet to pull Prometheus data from the endpoints. The issue occurs when one of those pods is crashing. If a pod does not answer an HTTP request, the Linkerd service mesh replies with a 502, causing the Metricbeat pod to crash with:

2020-05-29T15:56:22.219Z    INFO    module/wrapper.go:259    Error fetching data for metricset prometheus.collector: unable to decode response from prometheus endpoint: unexpected status code 502 from server
panic: close of nil channel

goroutine 2633 [running]:
github.com/elastic/beats/v7/libbeat/common.(*Cache).StopJanitor(...)
    /go/src/github.com/elastic/beats/libbeat/common/cache.go:258
github.com/elastic/beats/v7/x-pack/metricbeat/module/prometheus/collector.(*counterCache).Stop(0xc002735780)
    /go/src/github.com/elastic/beats/x-pack/metricbeat/module/prometheus/collector/counter.go:94 +0x32
github.com/elastic/beats/v7/x-pack/metricbeat/module/prometheus/collector.(*typedGenerator).Stop(0xc0027357a0)
    /go/src/github.com/elastic/beats/x-pack/metricbeat/module/prometheus/collector/data.go:57 +0x33
github.com/elastic/beats/v7/metricbeat/module/prometheus/collector.(*MetricSet).Close(0xc0017a9080, 0x5781c20, 0xc0017a9080)
    /go/src/github.com/elastic/beats/metricbeat/module/prometheus/collector/collector.go:189 +0x3a
github.com/elastic/beats/v7/metricbeat/mb/module.(*metricSetWrapper).close(0xc001ab51d0, 0x1, 0xc001d547c0)
    /go/src/github.com/elastic/beats/metricbeat/mb/module/wrapper.go:295 +0x5e
github.com/elastic/beats/v7/metricbeat/mb/module.(*Wrapper).Start.func1(0xc002b4dee0, 0xc00298dc80, 0xc001ba34a0, 0xc001ab51d0)
    /go/src/github.com/elastic/beats/metricbeat/mb/module/wrapper.go:148 +0x2e9
created by github.com/elastic/beats/v7/metricbeat/mb/module.(*Wrapper).Start
    /go/src/github.com/elastic/beats/metricbeat/mb/module/wrapper.go:135 +0x140
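
The panic message in the trace above is the Go runtime's error for calling `close` on a channel that was declared but never created with `make`. A minimal sketch that reproduces just that message in isolation follows; `closeNilChannel` is a hypothetical helper written for illustration and is not taken from the Beats code base.

```go
package main

import "fmt"

// closeNilChannel closes a channel that was never initialized with make
// and returns the message of the resulting runtime panic.
// Hypothetical helper for illustration only; not Beats code.
func closeNilChannel() (msg string) {
	defer func() {
		if r := recover(); r != nil {
			msg = fmt.Sprint(r)
		}
	}()

	var done chan struct{} // zero value of a channel type is nil
	close(done)            // the runtime panics here
	return "no panic"
}

func main() {
	fmt.Println(closeNilChannel()) // prints "close of nil channel"
}
```

Without the `recover`, the panic would terminate the whole process, which matches the pod crash reported here.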

Autodiscover config:

metricbeat.autodiscover:
  providers:
    - type: kubernetes
      host: "${NODE_NAME}"
      hosts: ["https://${NODE_NAME}:10250"]
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      ssl.verification_mode: "none"
      include_annotations: ["prometheus.io.scrape"]
      templates:
        - condition:
            contains:
              kubernetes.annotations.prometheus.io/scrape: "true"
          config:
            - module: prometheus
              metricsets: ["collector"]
              hosts: "${data.host}:${data.port}"
              use_types: true
              rate_counters: true
@psoderberg added the Metricbeat and Team:Platforms labels on Jun 2, 2020
@elasticmachine
Collaborator

Pinging @elastic/integrations-platforms (Team:Platforms)

@ChrsMark
Member

ChrsMark commented Jun 9, 2020

This is probably related to rates, since I was not able to reproduce it with a plain configuration:

# Metrics collected from a Prometheus endpoint
- module: prometheus
  period: 10s
  metricsets: ["collector"]
  hosts: ["xxxx:9100"]
  metrics_path: /metrics

Output

2020-06-09T17:52:46.860+0300	DEBUG	[module]	module/wrapper.go:189	Starting metricSetWrapper[module=prometheus, name=collector, host=xxxxx:9100]
2020-06-09T17:52:47.022+0300	DEBUG	[prometheus.collector]	prometheus/prometheus.go:79	error received from prometheus endpoint: {"cluster_name":"79e30963ce8741e9ac762383cef53410","cluster_uuid":"S94d7ZqQQ-icMFPxuRU3aA","version":{"build_date":"2020-05-12T02:01:37.602180Z","minimum_wire_compatibility_version":"6.8.0","build_hash":"81a1e9eda8e6183f5237786246f6dced26a10eaf","number":"7.7.0","lucene_version":"8.5.1","minimum_index_compatibility_version":"6.0.0-beta1","build_flavor":"default","build_snapshot":false,"build_type":"docker"},"name":"instance-0000000001","tagline":"You Know, for Search"}
2020-06-09T17:52:47.022+0300	INFO	module/wrapper.go:259	Error fetching data for metricset prometheus.collector: unable to decode response from prometheus endpoint: unexpected status code 502 from server
2020-06-09T17:52:47.022+0300	DEBUG	[kubernetes]	add_kubernetes_metadata/kubernetes.go:214	Using the following index key xxxxx:9100	{"libbeat.processor": "add_kubernetes_metadata"}
2020-06-09T17:52:47.022+0300	DEBUG	[kubernetes]	add_kubernetes_metadata/kubernetes.go:217	Index key xxxxx:9100 did not match any of the cached resources	{"libbeat.processor": "add_kubernetes_metadata"}
2020-06-09T17:52:47.023+0300	DEBUG	[processors]	processing/processors.go:187	Publish event: {
  "@timestamp": "2020-06-09T14:52:46.860Z",
  "@metadata": {
    "beat": "metricbeat",
    "type": "_doc",
    "version": "8.0.0"
  },
  "host": {
    "architecture": "x86_64",
    "os": {
      "family": "darwin",
      "name": "Mac OS X",
      "kernel": "18.7.0",
      "build": "18G95",
      "platform": "darwin",
      "version": "10.14.6"
    },
    "name": "Christoss-MacBook-Pro.local",
    "id": "883134FF-0EC4-5E1B-9F9E-FD06FB681D84",
    "ip": [
      "fe80::aede:48ff:fe00:1122",
      "fe80::1c5c:c09b:123a:9188",
      "192.168.1.2",
      "fe80::e0d4:97ff:fe23:c650",
      "fe80::e30f:9605:c8d4:dcb2"
    ],
    "mac": [
      "00:e0:4c:68:47:ed",
      "ac:de:48:00:11:22",
      "a6:83:e7:8c:c8:7b",
      "a4:83:e7:8c:c8:7b",
      "06:83:e7:8c:c8:7b",
      "e2:d4:97:23:c6:50",
      "ca:00:5c:62:81:01",
      "ca:00:5c:62:81:00",
      "ca:00:5c:62:81:05",
      "ca:00:5c:62:81:04",
      "ca:00:5c:62:81:01"
    ],
    "hostname": "Christoss-MacBook-Pro.local"
  },
  "prometheus": {
    "metrics": {
      "up": 0
    },
    "labels": {
      "instance": "xxxx:9100",
      "job": "prometheus"
    }
  },
  "metricset": {
    "period": 10000,
    "name": "collector"
  },
  "service": {
    "address": "xxxxx:9100",
    "type": "prometheus"
  },
  "event": {
    "dataset": "prometheus.collector",
    "module": "prometheus",
    "duration": 162508208
  },
  "agent": {
    "version": "8.0.0",
    "ephemeral_id": "a92760a4-6593-4ddf-b101-22ca972b2269",
    "id": "a97be4e7-e982-40db-98ee-84275dbf506c",
    "name": "Christoss-MacBook-Pro.local",
    "type": "metricbeat"
  },
  "ecs": {
    "version": "1.5.0"
  }
}

@ChrsMark
Member

I ran Metricbeat in debug mode with Autodiscover and rates enabled, using a mock server that always returns a 502 status code as the target container. It seems to me that this is mostly an issue with Autodiscover and how the metricset is being closed. It should not be related to the 502 response but to the stop event when the pod goes down. Here is my output at debug level:

2020-06-10T09:20:09.557Z	DEBUG	[autodiscover]	autodiscover/autodiscover.go:237	Stopping 1 configs
2020-06-10T09:20:09.557Z	DEBUG	[autodiscover]	cfgfile/list.go:62	Starting reload procedure, current runners: 1
2020-06-10T09:20:09.557Z	DEBUG	[autodiscover]	cfgfile/list.go:80	Start list: 0, Stop list: 1
2020-06-10T09:20:09.557Z	DEBUG	[autodiscover]	cfgfile/list.go:84	Stopping runner: RunnerGroup{prometheus [metricsets=1]}
2020-06-10T09:20:09.557Z	DEBUG	[autodiscover]	autodiscover/autodiscover.go:165	Got a start event: map[config:[0xc000da8750] host:172.17.0.6 id:66aa54ee-2804-4b6f-94e8-959952f858d3.natsqueue kubernetes:{"annotations":{"kubectl":{"kubernetes":{"io/last-applied-configuration":"{\"apiVersion\":\"v1\",\"kind\":\"Pod\",\"metadata\":{\"annotations\":{\"prometheus.io/scrape2\":\"true\"},\"labels\":{\"app\":\"name\",\"role\":\"main\",\"some\":\"somelabel\"},\"name\":\"nats\",\"namespace\":\"default\"},\"spec\":{\"containers\":[{\"args\":[\"-m\",\"8222\"],\"command\":[\"/nats-server\"],\"image\":\"nats\",\"name\":\"natsqueue\",\"ports\":[{\"containerPort\":8222,\"name\":\"web\",\"protocol\":\"TCP\"}]}]}}\n"}},"prometheus":{"io/scrape2":"true"}},"container":{"id":"24533f9a8cfe16e3c6bddb46498b9f8a517f1c1c29a84e6ed429752cec511c70","image":"nats","name":"natsqueue","runtime":"docker"},"labels":{"app":"name","role":"main","some":"somelabel"},"namespace":"default","node":{"name":"minikube"},"pod":{"name":"nats","uid":"66aa54ee-2804-4b6f-94e8-959952f858d3"}} meta:{"kubernetes":{"container":{"image":"nats","name":"natsqueue"},"labels":{"app":"name","role":"main","some":"somelabel"},"namespace":"default","node":{"name":"minikube"},"pod":{"name":"nats","uid":"66aa54ee-2804-4b6f-94e8-959952f858d3"}}} port:8222 provider:bb8744ce-a8cd-4e51-92ce-e2dda9d5ce39 start:true]
2020-06-10T09:20:09.557Z	DEBUG	[autodiscover]	autodiscover/autodiscover.go:190	Generated config: map[hosts:[34.78.17.139:9100] metrics_path:/metrics module:prometheus period:10s rate_counters:true use_types:true]
2020-06-10T09:20:09.557Z	DEBUG	[autodiscover]	autodiscover/autodiscover.go:252	Got a meta field in the event
2020-06-10T09:20:09.559Z	DEBUG	[autodiscover]	cfgfile/list.go:62	Starting reload procedure, current runners: 0
2020-06-10T09:20:09.559Z	DEBUG	[autodiscover]	cfgfile/list.go:80	Start list: 1, Stop list: 0
2020-06-10T09:20:09.559Z	DEBUG	[autodiscover]	cfgfile/list.go:101	Starting runner: RunnerGroup{prometheus [metricsets=1]}
2020-06-10T09:20:09.559Z	DEBUG	[module]	module/wrapper.go:127	Starting Wrapper[name=prometheus, len(metricSetWrappers)=1]
2020-06-10T09:20:09.559Z	DEBUG	[publisher]	pipeline/client.go:162	client: closing acker
2020-06-10T09:20:09.559Z	DEBUG	[publisher]	pipeline/client.go:173	client: done closing acker
2020-06-10T09:20:09.559Z	DEBUG	[publisher]	pipeline/client.go:176	client: cancelled 0 events
2020-06-10T09:20:09.559Z	DEBUG	[publisher]	pipeline/client.go:147	client: wait for acker to finish
2020-06-10T09:20:09.559Z	DEBUG	[publisher]	pipeline/client.go:149	client: acker shut down
2020-06-10T09:20:09.559Z	DEBUG	[module]	module/wrapper.go:214	Stopped metricSetWrapper[module=prometheus, name=collector, host=34.78.17.139:9100]
2020-06-10T09:20:09.559Z	DEBUG	[module]	module/wrapper.go:155	Stopped Wrapper[name=prometheus, len(metricSetWrappers)=1]
2020-06-10T09:20:09.559Z	DEBUG	[module]	module/wrapper.go:181	prometheus/collector will start after 1.627066252s
2020-06-10T09:20:10.612Z	DEBUG	[autodiscover.pod]	kubernetes/pod.go:145	Watcher Pod update: &Pod{ObjectMeta:k8s_io_apimachinery_pkg_apis_meta_v1.ObjectMeta{Name:nats,GenerateName:,Namespace:default,SelfLink:/api/v1/namespaces/default/pods/nats,UID:66aa54ee-2804-4b6f-94e8-959952f858d3,ResourceVersion:3245,Generation:0,CreationTimestamp:2020-06-10 09:19:18 +0000 UTC,DeletionTimestamp:2020-06-10 09:20:39 +0000 UTC,DeletionGracePeriodSeconds:*30,Labels:map[string]string{app: name,role: main,some: somelabel,},Annotations:map[string]string{kubectl.kubernetes.io/last-applied-configuration: {"apiVersion":"v1","kind":"Pod","metadata":{"annotations":{"prometheus.io/scrape2":"true"},"labels":{"app":"name","role":"main","some":"somelabel"},"name":"nats","namespace":"default"},"spec":{"containers":[{"args":["-m","8222"],"command":["/nats-server"],"image":"nats","name":"natsqueue","ports":[{"containerPort":8222,"name":"web","protocol":"TCP"}]}]}}
,prometheus.io/scrape2: true,},OwnerReferences:[],Finalizers:[],ClusterName:,ManagedFields:[],},Spec:PodSpec{Volumes:[{default-token-5slmg {nil nil nil nil nil SecretVolumeSource{SecretName:default-token-5slmg,Items:[],DefaultMode:*420,Optional:nil,} nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil}}],Containers:[{natsqueue nats [/nats-server] [-m 8222]  [{web 0 8222 TCP }] [] [] {map[] map[]} [{default-token-5slmg true /var/run/secrets/kubernetes.io/serviceaccount  <nil> }] [] nil nil nil /dev/termination-log File Always nil false false false}],RestartPolicy:Always,TerminationGracePeriodSeconds:*30,ActiveDeadlineSeconds:nil,DNSPolicy:ClusterFirst,NodeSelector:map[string]string{},ServiceAccountName:default,DeprecatedServiceAccount:default,NodeName:minikube,HostNetwork:false,HostPID:false,HostIPC:false,SecurityContext:&PodSecurityContext{SELinuxOptions:nil,RunAsUser:nil,RunAsNonRoot:nil,SupplementalGroups:[],FSGroup:nil,RunAsGroup:nil,Sysctls:[],WindowsOptions:nil,},ImagePullSecrets:[],Hostname:,Subdomain:,Affinity:nil,SchedulerName:default-scheduler,InitContainers:[],AutomountServiceAccountToken:nil,Tolerations:[{node.kubernetes.io/not-ready Exists  NoExecute 0xc000165ef0} {node.kubernetes.io/unreachable Exists  NoExecute 0xc000165f10}],HostAliases:[],PriorityClassName:,Priority:*0,DNSConfig:nil,ShareProcessNamespace:nil,ReadinessGates:[],RuntimeClassName:nil,EnableServiceLinks:*true,PreemptionPolicy:nil,Overhead:ResourceList{},TopologySpreadConstraints:[],EphemeralContainers:[],},Status:PodStatus{Phase:Running,Conditions:[{Initialized True 0001-01-01 00:00:00 +0000 UTC 2020-06-10 09:19:18 +0000 UTC  } {Ready False 0001-01-01 00:00:00 +0000 UTC 2020-06-10 09:20:10 +0000 UTC ContainersNotReady containers with unready status: [natsqueue]} {ContainersReady False 0001-01-01 00:00:00 +0000 UTC 2020-06-10 09:20:10 +0000 UTC ContainersNotReady containers with unready status: [natsqueue]} {PodScheduled True 0001-01-01 00:00:00 +0000 
UTC 2020-06-10 09:19:18 +0000 UTC  }],Message:,Reason:,HostIP:192.168.64.10,PodIP:172.17.0.6,StartTime:2020-06-10 09:19:18 +0000 UTC,ContainerStatuses:[{natsqueue {nil nil ContainerStateTerminated{ExitCode:2,Signal:0,Reason:Error,Message:,StartedAt:2020-06-10 09:19:22 +0000 UTC,FinishedAt:2020-06-10 09:20:09 +0000 UTC,ContainerID:docker://24533f9a8cfe16e3c6bddb46498b9f8a517f1c1c29a84e6ed429752cec511c70,}} {nil nil nil} false 0 nats:latest docker-pullable://nats@sha256:f73ca674fc4d375c67d10577c68ab8317f3fabd1cd14c8c36b5bd99c7caee9f6 docker://24533f9a8cfe16e3c6bddb46498b9f8a517f1c1c29a84e6ed429752cec511c70}],QOSClass:BestEffort,InitContainerStatuses:[],NominatedNodeName:,PodIPs:[{172.17.0.6}],EphemeralContainerStatuses:[],},}
2020-06-10T09:20:10.612Z	DEBUG	[autodiscover.bus-metricbeat]	bus/bus.go:88	map[config:[0xc000e0f590] host:172.17.0.6 id:66aa54ee-2804-4b6f-94e8-959952f858d3.natsqueue kubernetes:{"annotations":{"kubectl":{"kubernetes":{"io/last-applied-configuration":"{\"apiVersion\":\"v1\",\"kind\":\"Pod\",\"metadata\":{\"annotations\":{\"prometheus.io/scrape2\":\"true\"},\"labels\":{\"app\":\"name\",\"role\":\"main\",\"some\":\"somelabel\"},\"name\":\"nats\",\"namespace\":\"default\"},\"spec\":{\"containers\":[{\"args\":[\"-m\",\"8222\"],\"command\":[\"/nats-server\"],\"image\":\"nats\",\"name\":\"natsqueue\",\"ports\":[{\"containerPort\":8222,\"name\":\"web\",\"protocol\":\"TCP\"}]}]}}\n"}},"prometheus":{"io/scrape2":"true"}},"container":{"id":"24533f9a8cfe16e3c6bddb46498b9f8a517f1c1c29a84e6ed429752cec511c70","image":"nats","name":"natsqueue","runtime":"docker"},"labels":{"app":"name","role":"main","some":"somelabel"},"namespace":"default","node":{"name":"minikube"},"pod":{"name":"nats","uid":"66aa54ee-2804-4b6f-94e8-959952f858d3"}} meta:{"kubernetes":{"container":{"image":"nats","name":"natsqueue"},"labels":{"app":"name","role":"main","some":"somelabel"},"namespace":"default","node":{"name":"minikube"},"pod":{"name":"nats","uid":"66aa54ee-2804-4b6f-94e8-959952f858d3"}}} port:8222 provider:bb8744ce-a8cd-4e51-92ce-e2dda9d5ce39 stop:true]	{"libbeat.bus": "metricbeat"}
2020-06-10T09:20:10.612Z	DEBUG	[autodiscover]	autodiscover/autodiscover.go:229	Got a stop event: map[config:[0xc000e0f590] host:172.17.0.6 id:66aa54ee-2804-4b6f-94e8-959952f858d3.natsqueue kubernetes:{"annotations":{"kubectl":{"kubernetes":{"io/last-applied-configuration":"{\"apiVersion\":\"v1\",\"kind\":\"Pod\",\"metadata\":{\"annotations\":{\"prometheus.io/scrape2\":\"true\"},\"labels\":{\"app\":\"name\",\"role\":\"main\",\"some\":\"somelabel\"},\"name\":\"nats\",\"namespace\":\"default\"},\"spec\":{\"containers\":[{\"args\":[\"-m\",\"8222\"],\"command\":[\"/nats-server\"],\"image\":\"nats\",\"name\":\"natsqueue\",\"ports\":[{\"containerPort\":8222,\"name\":\"web\",\"protocol\":\"TCP\"}]}]}}\n"}},"prometheus":{"io/scrape2":"true"}},"container":{"id":"24533f9a8cfe16e3c6bddb46498b9f8a517f1c1c29a84e6ed429752cec511c70","image":"nats","name":"natsqueue","runtime":"docker"},"labels":{"app":"name","role":"main","some":"somelabel"},"namespace":"default","node":{"name":"minikube"},"pod":{"name":"nats","uid":"66aa54ee-2804-4b6f-94e8-959952f858d3"}} meta:{"kubernetes":{"container":{"image":"nats","name":"natsqueue"},"labels":{"app":"name","role":"main","some":"somelabel"},"namespace":"default","node":{"name":"minikube"},"pod":{"name":"nats","uid":"66aa54ee-2804-4b6f-94e8-959952f858d3"}}} port:8222 provider:bb8744ce-a8cd-4e51-92ce-e2dda9d5ce39 stop:true]
2020-06-10T09:20:10.612Z	DEBUG	[autodiscover]	autodiscover/autodiscover.go:237	Stopping 1 configs
2020-06-10T09:20:10.612Z	DEBUG	[autodiscover]	cfgfile/list.go:62	Starting reload procedure, current runners: 1
2020-06-10T09:20:10.612Z	DEBUG	[autodiscover]	cfgfile/list.go:80	Start list: 0, Stop list: 1
2020-06-10T09:20:10.612Z	DEBUG	[autodiscover]	cfgfile/list.go:84	Stopping runner: RunnerGroup{prometheus [metricsets=1]}
2020-06-10T09:20:10.612Z	DEBUG	[publisher]	pipeline/client.go:162	client: closing acker
2020-06-10T09:20:10.613Z	DEBUG	[publisher]	pipeline/client.go:173	client: done closing acker
2020-06-10T09:20:10.613Z	DEBUG	[publisher]	pipeline/client.go:176	client: cancelled 0 events
2020-06-10T09:20:10.613Z	DEBUG	[publisher]	pipeline/client.go:147	client: wait for acker to finish
2020-06-10T09:20:10.613Z	DEBUG	[publisher]	pipeline/client.go:149	client: acker shut down
panic: close of nil channel

goroutine 1316 [running]:
github.com/elastic/beats/v7/libbeat/common.(*Cache).StopJanitor(...)
	/go/src/github.com/elastic/beats/libbeat/common/cache.go:258
github.com/elastic/beats/v7/x-pack/metricbeat/module/prometheus/collector.(*counterCache).Stop(0xc000dec100)
	/go/src/github.com/elastic/beats/x-pack/metricbeat/module/prometheus/collector/counter.go:94 +0x32
github.com/elastic/beats/v7/x-pack/metricbeat/module/prometheus/collector.(*typedGenerator).Stop(0xc000dec120)
	/go/src/github.com/elastic/beats/x-pack/metricbeat/module/prometheus/collector/data.go:57 +0x33
github.com/elastic/beats/v7/metricbeat/module/prometheus/collector.(*MetricSet).Close(0xc000df0000, 0x5781c20, 0xc000df0000)
	/go/src/github.com/elastic/beats/metricbeat/module/prometheus/collector/collector.go:189 +0x3a
github.com/elastic/beats/v7/metricbeat/mb/module.(*metricSetWrapper).close(0xc000dee180, 0x6c6c616369707974, 0x6469766f72702079)
	/go/src/github.com/elastic/beats/metricbeat/mb/module/wrapper.go:295 +0x5e
github.com/elastic/beats/v7/metricbeat/mb/module.(*Wrapper).Start.func1(0xc000d1dde0, 0xc000972f60, 0xc000daf500, 0xc000dee180)
	/go/src/github.com/elastic/beats/metricbeat/mb/module/wrapper.go:148 +0x2e9
created by github.com/elastic/beats/v7/metricbeat/mb/module.(*Wrapper).Start
	/go/src/github.com/elastic/beats/metricbeat/mb/module/wrapper.go:135 +0x140

@ChrsMark
Member

It seems we have a race condition here, related to Autodiscover stop/start events. To cut a long story short, the counterCache is being stopped before it has been started, possibly because of the delay in starting the runner (2020-06-10T10:58:28.791Z DEBUG [module] module/wrapper.go:181 prometheus/collector will start after 5.180689998s).

See the proposed fix for more details: https://github.com/elastic/beats/pull/19103/files
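
One general way to make a stop-before-start sequence harmless is to guard the channel close, so that stopping something that never started is a no-op. The sketch below illustrates that pattern only; the `janitor` type is hypothetical, and this is not necessarily how the linked pull request addresses the problem.

```go
package main

import (
	"fmt"
	"sync"
)

// janitor is a hypothetical stand-in for a cache with a background
// cleanup goroutine. It is not the Beats Cache type.
type janitor struct {
	mu   sync.Mutex
	quit chan struct{} // nil until Start has been called
}

// Start launches the background goroutine once.
func (j *janitor) Start() {
	j.mu.Lock()
	defer j.mu.Unlock()
	if j.quit != nil {
		return // already running
	}
	j.quit = make(chan struct{})
	go func(quit <-chan struct{}) {
		<-quit // periodic cleanup work would go here
	}(j.quit)
}

// Stop signals the goroutine to exit and reports whether anything was
// running. Calling it before Start, or twice, is safe.
func (j *janitor) Stop() bool {
	j.mu.Lock()
	defer j.mu.Unlock()
	if j.quit == nil {
		return false // never started: nothing to close
	}
	close(j.quit)
	j.quit = nil
	return true
}

func main() {
	j := &janitor{}
	fmt.Println(j.Stop()) // stop before start: false, no panic
	j.Start()
	fmt.Println(j.Stop()) // true
	fmt.Println(j.Stop()) // second stop: false
}
```

The mutex keeps `Start` and `Stop` from racing with each other, which matters when start and stop events arrive from different goroutines as they do with Autodiscover.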
