[Stack Monitoring] Support for error documents #4011

Closed
Tracked by #120415
klacabane opened this issue Aug 17, 2022 · 1 comment · Fixed by elastic/kibana#140102
Labels: Integration:elasticsearch, Integration:kibana, Integration:logstash, Team:Infra Monitoring UI - DEPRECATED, v8.5.0

Comments

@klacabane
Contributor

klacabane commented Aug 17, 2022

Summary

When the agent fails to collect metrics for one of the stack packages, it populates an error.message field with the reason and indexes the document in the corresponding data stream, for example metrics-elasticsearch.stack_monitoring.cluster_stats.

We should look into adding support for these documents so we can surface them in the Health API (already added), and also verify that error documents stored in the regular indices do not break queries.

Two initial options proposed:

  • Continue to store them in the regular data streams. We'll have to map error.message in all data streams. Note that error documents may then be fetched by a query and cause issues. We don't really know, because Metricbeat stores these error documents separately from the monitoring indices, so we never fetch them today.
  • Store the error documents in a separate data stream, possibly with an ingest pipeline. That would replicate the model we have with Metricbeat today: legitimate data in .monitoring-* and error data in metricbeat-*.

With the agent/package setup, error docs land in the same data streams so we'll work on adjusting queries to account for that (first option).
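As a rough illustration of what "adjusting queries" could mean for the first option, the sketch below wraps a query's existing filters in a bool query that skips any document carrying error.message. The helper name excludeErrorDocuments and the sample filters are hypothetical and not taken from the Kibana code base; only the Elasticsearch bool, must_not and exists query clauses are standard.

```typescript
// Minimal query DSL types, just enough for this sketch.
type QueryClause = Record<string, unknown>;

interface BoolQuery {
  bool: {
    filter: QueryClause[];
    must_not: QueryClause[];
  };
}

// Hypothetical helper (not an existing Kibana function): keeps the filters a
// monitoring query already uses and excludes documents that have error.message.
function excludeErrorDocuments(filters: QueryClause[]): BoolQuery {
  return {
    bool: {
      filter: [...filters],
      must_not: [{ exists: { field: "error.message" } }],
    },
  };
}

// Example filters are assumptions chosen to match the data stream shown in this issue.
const statusQuery = excludeErrorDocuments([
  { term: { "data_stream.dataset": "kibana.status" } },
  { range: { "@timestamp": { gte: "now-15m" } } },
]);

console.log(JSON.stringify(statusQuery, null, 2));
```

An exists clause only matches documents where the field is indexed, which is one reason error.message would need a mapping in every data stream that can receive these documents.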

Error document example:
{
  "_index": ".ds-metrics-kibana.status-default-2022.08.17-000001",
  "_id": "hDRJqYIBWbiCmF3CyhDZ",
  "_version": 1,
  "_score": 0,
  "_source": {
    "agent": {
      "name": "docker-fleet-agent",
      "id": "a8c3a385-b8d6-4538-aa4a-6ca467b710b6",
      "type": "metricbeat",
      "ephemeral_id": "9305a77e-685f-4706-8e1b-629e849665ff",
      "version": "8.5.0"
    },
    "@timestamp": "2022-08-17T00:52:41.373Z",
    "ecs": {
      "version": "8.0.0"
    },
    "data_stream": {
      "namespace": "default",
      "type": "metrics",
      "dataset": "kibana.status"
    },
    "service": {
      "address": "https://kibana:5603/api/status",
      "type": "kibana"
    },
    "host": {
      "hostname": "docker-fleet-agent",
      "os": {
        "kernel": "5.10.47-linuxkit",
        "codename": "focal",
        "name": "Ubuntu",
        "type": "linux",
        "family": "debian",
        "version": "20.04.4 LTS (Focal Fossa)",
        "platform": "ubuntu"
      },
      "ip": [
        "172.24.0.6"
      ],
      "containerized": true,
      "name": "docker-fleet-agent",
      "mac": [
        "02:42:ac:18:00:06"
      ],
      "architecture": "x86_64"
    },
    "elastic_agent": {
      "id": "a8c3a385-b8d6-4538-aa4a-6ca467b710b6",
      "version": "8.5.0",
      "snapshot": true
    },
    "metricset": {
      "period": 10000,
      "name": "status"
    },
    "error": {
      "message": "error making http request: Get \"https://kibana:5603/api/status\": dial tcp 172.24.0.5:5603: connect: connection refused"
    },
    "event": {
      "duration": 1657000,
      "agent_id_status": "verified",
      "ingested": "2022-08-17T00:52:41Z",
      "module": "kibana",
      "dataset": "kibana.status"
    }
  },
  "fields": {
    "elastic_agent.version": [
      "8.5.0"
    ],
    "host.hostname": [
      "docker-fleet-agent"
    ],
    "host.mac": [
      "02:42:ac:18:00:06"
    ],
    "service.type": [
      "kibana"
    ],
    "host.ip": [
      "172.24.0.6"
    ],
    "agent.type": [
      "metricbeat"
    ],
    "event.module": [
      "kibana"
    ],
    "host.os.version": [
      "20.04.4 LTS (Focal Fossa)"
    ],
    "host.os.kernel": [
      "5.10.47-linuxkit"
    ],
    "host.os.name": [
      "Ubuntu"
    ],
    "agent.name": [
      "docker-fleet-agent"
    ],
    "host.name": [
      "docker-fleet-agent"
    ],
    "elastic_agent.snapshot": [
      true
    ],
    "event.agent_id_status": [
      "verified"
    ],
    "host.os.type": [
      "linux"
    ],
    "elastic_agent.id": [
      "a8c3a385-b8d6-4538-aa4a-6ca467b710b6"
    ],
    "data_stream.namespace": [
      "default"
    ],
    "metricset.period": [
      10000
    ],
    "host.os.codename": [
      "focal"
    ],
    "data_stream.type": [
      "metrics"
    ],
    "host.architecture": [
      "x86_64"
    ],
    "metricset.name": [
      "status"
    ],
    "event.duration": [
      1657000
    ],
    "event.ingested": [
      "2022-08-17T00:52:41.000Z"
    ],
    "@timestamp": [
      "2022-08-17T00:52:41.373Z"
    ],
    "agent.id": [
      "a8c3a385-b8d6-4538-aa4a-6ca467b710b6"
    ],
    "ecs.version": [
      "8.0.0"
    ],
    "host.os.platform": [
      "ubuntu"
    ],
    "host.containerized": [
      true
    ],
    "service.address": [
      "https://kibana:5603/api/status"
    ],
    "error.message": [
      "error making http request: Get \"https://kibana:5603/api/status\": dial tcp 172.24.0.5:5603: connect: connection refused"
    ],
    "data_stream.dataset": [
      "kibana.status"
    ],
    "agent.ephemeral_id": [
      "9305a77e-685f-4706-8e1b-629e849665ff"
    ],
    "agent.version": [
      "8.5.0"
    ],
    "host.os.family": [
      "debian"
    ],
    "event.dataset": [
      "kibana.status"
    ]
  }
}
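For surfacing documents like the one above, a consumer could reduce each hit to the few fields that describe the failure. The sketch below is only an assumption about what such a summary might look like: the ErrorHit and CollectionError types and the toCollectionError function are hypothetical names, and the field paths are the ones visible in the example document.

```typescript
// Subset of the hit shape shown in the example document.
interface ErrorHit {
  _index: string;
  _source: {
    "@timestamp": string;
    data_stream?: { dataset?: string };
    service?: { address?: string };
    error?: { message?: string };
  };
}

// Hypothetical summary shape; not the actual Health API response.
interface CollectionError {
  dataset: string;
  timestamp: string;
  address?: string;
  message: string;
}

// Returns a summary for error documents and undefined for regular metric documents.
function toCollectionError(hit: ErrorHit): CollectionError | undefined {
  const message = hit._source.error?.message;
  if (message === undefined) {
    return undefined;
  }
  return {
    dataset: hit._source.data_stream?.dataset ?? "unknown",
    timestamp: hit._source["@timestamp"],
    address: hit._source.service?.address,
    message,
  };
}

// Trimmed copy of the example document from this issue.
const exampleHit: ErrorHit = {
  _index: ".ds-metrics-kibana.status-default-2022.08.17-000001",
  _source: {
    "@timestamp": "2022-08-17T00:52:41.373Z",
    data_stream: { dataset: "kibana.status" },
    service: { address: "https://kibana:5603/api/status" },
    error: {
      message:
        'error making http request: Get "https://kibana:5603/api/status": dial tcp 172.24.0.5:5603: connect: connection refused',
    },
  },
};

console.log(toCollectionError(exampleHit));
```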
@matschaffer
Contributor

matschaffer commented Sep 7, 2022

Considering this in progress since @klacabane and I started some work on it in the scope of package testing.
