[Alerting] Add more context in kibana rule execution traces #113506

Closed
cyrille-leclerc opened this issue Sep 30, 2021 · 1 comment · Fixed by #117504
Assignees
Labels
apm:rac Feature:Observability RAC Feature:RAC label obsolete Team:APM All issues that need APM UI Team support Theme: rac label obsolete

Comments

@cyrille-leclerc
Contributor

cyrille-leclerc commented Sep 30, 2021

Feature Description

Add more context to the traces and spans captured on alert rule executions.

  • Transaction name

    • Actual: rule type identifier (e.g. alerting:apm.transaction_duration)
    • Expected
      • Contains a human readable identifier of the rule
      • Prefixed by a verb referring to the concept of alerting rule execution
      • Idea: Execute Alerting Rule '((rule.name))'
        • e.g. Execute Alerting Rule 'Latency threshold 250ms | frontend'
  • Transaction attributes: include most of the kibana.alert.rule.* fields from the alert data stored in the .internal.alerts-* indices (e.g. .internal.alerts-observability.apm.alerts-default-000001)

     {
     "kibana.alert.rule.category": "Latency threshold",
     "kibana.alert.rule.consumer": "apm",
     "kibana.alert.rule.name": "Latency threshold 250ms | frontend",
     "kibana.alert.rule.producer": "apm",
     "kibana.alert.rule.rule_type_id": "apm.transaction_duration",
     "kibana.alert.rule.uuid": "c2538e30-210d-11ec-843e-b59bd8a73aeb",
     "tags": ["apm", "service.name:frontend"]
     }
    
  • Rule execution outcome: whether or not the execution resulted in a violation

  • If it is a violation, details on the associated alert

    {
    "kibana.alert.instance.id": "apm.transaction_duration_All",
    "kibana.alert.uuid": "c0548f62-d4b7-475c-9675-205b9d28be9e"
    }
    
  • If possible, some alert details, such as service.name: frontend, captured by the apm.transaction_duration rule type.
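
To make the request above more concrete, here is a minimal TypeScript sketch of how the proposed transaction name and attributes could be derived from a rule and, on violation, its alert. The `RuleContext` and `AlertContext` shapes, the function names, and the `violation` attribute are assumptions made for illustration. They are not Kibana's actual types or the way the fix was implemented.

```typescript
// Hypothetical shapes for illustration only; these are not Kibana's actual types.
interface RuleContext {
  name: string;
  uuid: string;
  ruleTypeId: string;
  consumer: string;
  producer: string;
  category: string;
  tags: string[];
}

interface AlertContext {
  instanceId: string;
  uuid: string;
}

// Transaction name in the form suggested in this issue.
function buildTransactionName(rule: RuleContext): string {
  return `Execute Alerting Rule '${rule.name}'`;
}

// Rule fields, the execution outcome and, on violation, the alert identifiers,
// flattened into string attributes.
function buildTransactionAttributes(
  rule: RuleContext,
  alert?: AlertContext
): Record<string, string> {
  const attributes: Record<string, string> = {
    "kibana.alert.rule.category": rule.category,
    "kibana.alert.rule.consumer": rule.consumer,
    "kibana.alert.rule.name": rule.name,
    "kibana.alert.rule.producer": rule.producer,
    "kibana.alert.rule.rule_type_id": rule.ruleTypeId,
    "kibana.alert.rule.uuid": rule.uuid,
    tags: rule.tags.join(","),
    // Hypothetical attribute name for the execution outcome.
    violation: String(alert !== undefined),
  };
  if (alert !== undefined) {
    attributes["kibana.alert.instance.id"] = alert.instanceId;
    attributes["kibana.alert.uuid"] = alert.uuid;
  }
  return attributes;
}

// Values copied from the alert data in this issue.
const exampleRule: RuleContext = {
  name: "Latency threshold 250ms | frontend",
  uuid: "c2538e30-210d-11ec-843e-b59bd8a73aeb",
  ruleTypeId: "apm.transaction_duration",
  consumer: "apm",
  producer: "apm",
  category: "Latency threshold",
  tags: ["apm", "service.name:frontend"],
};
const exampleAlert: AlertContext = {
  instanceId: "apm.transaction_duration_All",
  uuid: "c0548f62-d4b7-475c-9675-205b9d28be9e",
};

console.log(buildTransactionName(exampleRule));
// Execute Alerting Rule 'Latency threshold 250ms | frontend'
console.log(buildTransactionAttributes(exampleRule, exampleAlert));
```

Keeping the name and attribute construction in pure functions is only a design choice for this sketch. How the values would be attached to the APM transaction is left open here.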


Details:

Kibana Rule Execution Transaction Document
{
  "_index": "apm-7.15.0-transaction-000001",
  "_type": "_doc",
  "_id": "REwmMnwB5clgtAisemQi",
  "_version": 1,
  "_score": 1,
  "_source": {
    "parent": {
      "id": "05f1c08e9cba13ee"
    },
    "agent": {
      "name": "nodejs",
      "version": "3.16.0"
    },
    "process": {
      "args": [
        "/usr/local/Cellar/kibana-full/7.15.0/libexec/node/bin/node",
        "/usr/local/Cellar/kibana-full/7.15.0/libexec/src/cli/dist"
      ],
      "pid": 17727,
      "title": "/usr/local/Cellar/kibana-full/7.15.0/libexec/node/bin/node",
      "ppid": 17724
    },
    "processor": {
      "name": "transaction",
      "event": "transaction"
    },
    "labels": {
      "kibana_uuid": "3c2ea5bb-06f3-4aee-9d37-8f15f483e21e",
      "deploymentId": "cyrille-localhost",
      "git_rev": "add5d2c5ebeba1d8bcf6a79f8863cd78760e1b3e"
    },
    "observer": {
      "hostname": "MacBook-Pro.localdomain",
      "id": "c6806dda-7615-4d01-bc3f-aeb7ca9aa2f2",
      "type": "apm-server",
      "ephemeral_id": "0b6a14ed-c886-482b-a1fa-5a921a1d53d6",
      "version": "7.15.0",
      "version_major": 7
    },
    "trace": {
      "id": "0ee94daa373e8907b68d9b7bd449faa4"
    },
    "@timestamp": "2021-09-29T15:22:23.333Z",
    "ecs": {
      "version": "1.11.0"
    },
    "service": {
      "node": {
        "name": "MacBook-Pro.localdomain"
      },
      "environment": "production",
      "framework": {
        "name": "hapi",
        "version": "20.0.3"
      },
      "name": "kibana",
      "runtime": {
        "name": "node",
        "version": "14.17.6"
      },
      "language": {
        "name": "javascript"
      },
      "version": "7.15.0"
    },
    "host": {
      "hostname": "MacBook-Pro.localdomain",
      "os": {
        "platform": "darwin"
      },
      "ip": "127.0.0.1",
      "name": "MacBook-Pro.localdomain",
      "architecture": "x64"
    },
    "event": {
      "ingested": "2021-09-29T15:22:34.877174Z",
      "outcome": "unknown"
    },
    "transaction": {
      "duration": {
        "us": 4510436
      },
      "result": "success",
      "name": "alerting:apm.transaction_duration",
      "id": "0e83b50a27db8606",
      "span_count": {
        "started": 31
      },
      "type": "taskManager run",
      "sampled": true
    },
    "timestamp": {
      "us": 1632928943333000
    }
  },
  "fields": {
    "transaction.name.text": [
      "alerting:apm.transaction_duration"
    ],
    "service.framework.version": [
      "20.0.3"
    ],
    "labels.git_rev": [
      "add5d2c5ebeba1d8bcf6a79f8863cd78760e1b3e"
    ],
    "service.node.name": [
      "MacBook-Pro.localdomain"
    ],
    "host.hostname": [
      "MacBook-Pro.localdomain"
    ],
    "process.pid": [
      17727
    ],
    "service.language.name": [
      "javascript"
    ],
    "transaction.result": [
      "success"
    ],
    "transaction.sampled": [
      true
    ],
    "transaction.id": [
      "0e83b50a27db8606"
    ],
    "host.ip": [
      "127.0.0.1"
    ],
    "trace.id": [
      "0ee94daa373e8907b68d9b7bd449faa4"
    ],
    "labels.kibana_uuid": [
      "3c2ea5bb-06f3-4aee-9d37-8f15f483e21e"
    ],
    "processor.event": [
      "transaction"
    ],
    "agent.name": [
      "nodejs"
    ],
    "host.name": [
      "MacBook-Pro.localdomain"
    ],
    "labels.deploymentId": [
      "cyrille-localhost"
    ],
    "event.outcome": [
      "unknown"
    ],
    "service.environment": [
      "production"
    ],
    "service.name": [
      "kibana"
    ],
    "service.framework.name": [
      "hapi"
    ],
    "process.ppid": [
      17724
    ],
    "service.runtime.name": [
      "node"
    ],
    "processor.name": [
      "transaction"
    ],
    "transaction.duration.us": [
      4510436
    ],
    "process.args": [
      "/usr/local/Cellar/kibana-full/7.15.0/libexec/node/bin/node",
      "/usr/local/Cellar/kibana-full/7.15.0/libexec/src/cli/dist"
    ],
    "service.runtime.version": [
      "14.17.6"
    ],
    "observer.version_major": [
      7
    ],
    "observer.hostname": [
      "MacBook-Pro.localdomain"
    ],
    "transaction.type": [
      "taskManager run"
    ],
    "host.architecture": [
      "x64"
    ],
    "transaction.span_count.started": [
      31
    ],
    "observer.id": [
      "c6806dda-7615-4d01-bc3f-aeb7ca9aa2f2"
    ],
    "timestamp.us": [
      1632928943333000
    ],
    "event.ingested": [
      "2021-09-29T15:22:34.877Z"
    ],
    "@timestamp": [
      "2021-09-29T15:22:23.333Z"
    ],
    "service.version": [
      "7.15.0"
    ],
    "observer.ephemeral_id": [
      "0b6a14ed-c886-482b-a1fa-5a921a1d53d6"
    ],
    "observer.version": [
      "7.15.0"
    ],
    "host.os.platform": [
      "darwin"
    ],
    "ecs.version": [
      "1.11.0"
    ],
    "observer.type": [
      "apm-server"
    ],
    "transaction.name": [
      "alerting:apm.transaction_duration"
    ],
    "parent.id": [
      "05f1c08e9cba13ee"
    ],
    "agent.version": [
      "3.16.0"
    ],
    "process.title": [
      "/usr/local/Cellar/kibana-full/7.15.0/libexec/node/bin/node"
    ]
  }
}
Alert data
{
  "_index": ".internal.alerts-observability.apm.alerts-default-000001",
  "_type": "_doc",
  "_id": "c0548f62-d4b7-475c-9675-205b9d28be9e",
  "_version": 443,
  "_score": 1,
  "_source": {
    "kibana.alert.status": "recovered",
    "kibana.alert.rule.producer": "apm",
    "kibana.alert.rule.rule_type_id": "apm.transaction_duration",
    "kibana.alert.evaluation.value": 285006.13,
    "kibana.alert.instance.id": "apm.transaction_duration_All",
    "kibana.alert.rule.name": "Latency threshold 250ms | frontend",
    "kibana.alert.end": "2021-09-30T07:23:30.693Z",
    "event.kind": "signal",
    "kibana.alert.workflow_status": "open",
    "kibana.alert.rule.uuid": "c2538e30-210d-11ec-843e-b59bd8a73aeb",
    "kibana.alert.reason": "Latency is above 250 μs (current value is 285 ms) for frontend",
    "kibana.alert.rule.consumer": "apm",
    "tags": [
      "apm",
      "service.name:frontend"
    ],
    "kibana.alert.rule.category": "Latency threshold",
    "kibana.alert.start": "2021-09-29T10:12:37.592Z",
    "event.action": "close",
    "kibana.alert.duration.us": 76253101000,
    "@timestamp": "2021-09-30T07:23:30.693Z",
    "kibana.alert.uuid": "c0548f62-d4b7-475c-9675-205b9d28be9e",
    "kibana.space_ids": [
      "default"
    ],
    "kibana.version": "7.15.0",
    "kibana.alert.evaluation.threshold": 250,
    "processor.event": [
      "transaction"
    ],
    "service.name": [
      "frontend"
    ],
    "transaction.type": [
      "request"
    ]
  },
  "fields": {
    "kibana.alert.status": [
      "recovered"
    ],
    "kibana.alert.rule.producer": [
      "apm"
    ],
    "kibana.alert.rule.rule_type_id": [
      "apm.transaction_duration"
    ],
    "kibana.alert.evaluation.value": [
      285006.13
    ],
    "kibana.alert.instance.id": [
      "apm.transaction_duration_All"
    ],
    "processor.event": [
      "transaction"
    ],
    "kibana.alert.rule.name": [
      "Latency threshold 250ms | frontend"
    ],
    "kibana.alert.end": [
      "2021-09-30T07:23:30.693Z"
    ],
    "event.kind": [
      "signal"
    ],
    "kibana.alert.workflow_status": [
      "open"
    ],
    "service.name": [
      "frontend"
    ],
    "kibana.alert.rule.uuid": [
      "c2538e30-210d-11ec-843e-b59bd8a73aeb"
    ],
    "kibana.alert.reason": [
      "Latency is above 250 μs (current value is 285 ms) for frontend"
    ],
    "kibana.alert.rule.consumer": [
      "apm"
    ],
    "tags": [
      "apm",
      "service.name:frontend"
    ],
    "transaction.type": [
      "request"
    ],
    "kibana.alert.rule.category": [
      "Latency threshold"
    ],
    "kibana.alert.start": [
      "2021-09-29T10:12:37.592Z"
    ],
    "event.action": [
      "close"
    ],
    "kibana.alert.duration.us": [
      76253101000
    ],
    "@timestamp": [
      "2021-09-30T07:23:30.693Z"
    ],
    "kibana.alert.uuid": [
      "c0548f62-d4b7-475c-9675-205b9d28be9e"
    ],
    "kibana.space_ids": [
      "default"
    ],
    "kibana.version": [
      "7.15.0"
    ],
    "kibana.alert.evaluation.threshold": [
      250
    ]
  }
}

Describe a specific use case for the feature

Troubleshoot the alerting system by being able to slice and dice rule execution traces along any dimension.
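
As an illustration of this use case, the sketch below builds an Elasticsearch search body that averages rule execution duration per value of one chosen dimension. The filter fields and `transaction.duration.us` come from the transaction document above. The function name and the `labels.rule_name` field are assumptions, since the actual field name depends on how the extra context ends up being stored.

```typescript
// Builds a search body that groups Kibana rule execution transactions by
// `dimensionField` and averages their duration in each bucket.
function buildDurationByDimensionQuery(dimensionField: string, maxBuckets = 10) {
  return {
    size: 0, // only the aggregation results are needed, not the hits
    query: {
      bool: {
        filter: [
          { term: { "processor.event": "transaction" } },
          { term: { "service.name": "kibana" } },
          { term: { "transaction.type": "taskManager run" } },
        ],
      },
    },
    aggs: {
      by_dimension: {
        terms: { field: dimensionField, size: maxBuckets },
        aggs: {
          avg_duration_us: { avg: { field: "transaction.duration.us" } },
        },
      },
    },
  };
}

// Hypothetical label field; the real name depends on how the attributes are stored.
const durationByRuleQuery = buildDurationByDimensionQuery("labels.rule_name");
console.log(JSON.stringify(durationByRuleQuery, null, 2));
```

Swapping the dimension field for another attribute would give a different slice without changing the rest of the query.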

@botelastic botelastic bot added the needs-team Issues missing a team label label Sep 30, 2021
@dgieselaar dgieselaar added the Team:APM All issues that need APM UI Team support label Sep 30, 2021
@elasticmachine
Contributor

Pinging @elastic/apm-ui (Team:apm)

@dgieselaar dgieselaar self-assigned this Sep 30, 2021
@botelastic botelastic bot removed the needs-team Issues missing a team label label Sep 30, 2021
dgieselaar added a commit to dgieselaar/kibana that referenced this issue Nov 4, 2021
@zube zube bot added [zube]: 8.0 and removed [zube]: Backlog labels Nov 8, 2021
@cyrille-leclerc cyrille-leclerc changed the title Add more context in kibana rule execution traces [Alerting] Add more context in kibana rule execution traces Nov 10, 2021
@zube zube bot removed the [zube]: Done label Feb 27, 2022
4 participants