Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nested values in otel logs are dropped #5458

Closed
Yunzlez opened this issue Sep 27, 2024 · 1 comment · Fixed by #5485
Closed

Nested values in otel logs are dropped #5458

Yunzlez opened this issue Sep 27, 2024 · 1 comment · Fixed by #5485
Assignees
Labels
bug Something isn't working

Comments

@Yunzlez
Copy link

Yunzlez commented Sep 27, 2024

Describe the bug
When a field in a log entry from otel collector contains nested JSON, the field is omitted in quickwit v0.8.2

As a workaround for #5311 (which is not released yet), we've changed our app's logging to print JSON to stdout, e.g.:

{"timestamp":"2024-09-27T13:42:10.337702Z","level":"DEBUG","message":"No job retrieved","target":"tasks::job_task"}

We then use opentelemetry collector to parse this JSON and enrich the message by using the json_parser operator on the filelog receiver - here's a snippet from the otelcol config:

operators:             
- id: container-parser
  max_log_size: 102400
  type: container
- default: noop
  id: select-parser
  routes:
  - expr: body startsWith "{"
    output: parser-json
  type: router
- id: parser-json
  parse_to: body
  severity:
    parse_from: body.level
  timestamp:
    layout: '%Y-%m-%dT%H:%M:%S.%fZ'
    parse_from: body.timestamp
  type: json_parser
- type: noop

According to otelcol's debug exporter, this does what we want (only relevant line for brevity):

  • The body is replaced by a map containing all fields in the original JSON message.
...                                                                                                                                   
Body: Map({"level":"DEBUG","message":"No job retrieved","target":"tasks::job_task","timestamp":"2024-09-27T14:04:50.363544Z"})                                                         
...

Expected behavior
The body should show up when queried in quickwit, but this is the result:

{
  "attributes": {
    "log.file.path": "/var/log/pods/.../0.log",
    "log.iostream": "stdout",
    "logtag": "F"
  },
  "observed_timestamp_nanos": 1727438690431348000,
  "resource_attributes": {
    "k8s.container.name": "...",
    "k8s.container.restart_count": "0",
    "k8s.namespace.name": "...",
    "k8s.node.name": "...",
    "k8s.pod.name": "...",
    "k8s.pod.start_time": "...",
    "k8s.pod.uid": "...",
    "level": "DEBUG"
  },
  "service_name": "unknown_service",
  "severity_number": 5,
  "severity_text": "DEBUG",
  "timestamp_nanos": 1727438690366247000,
  "trace_flags": 0
}

The same thing happens when a nested JSON field appears in any of the other fields (like resource_attributes) - the entry is dropped.

Configuration:
Please provide:

  1. Output of quickwit --version:
    Quickwit 0.8.2 (x86_64-unknown-linux-gnu 2024-06-17T16:36:47Z 42766b8)

  2. The index_config.yaml:
    I tried with 2 distinct indexes: One that shares config with the otel-logs-v0_7 index:

{
  "field_mappings": [
    {
      "name": "timestamp_nanos",
      "type": "datetime",
      "fast": true,
      "fast_precision": "milliseconds",
      "indexed": false,
      "input_formats": [
        "unix_timestamp"
      ],
      "output_format": "unix_timestamp_nanos",
      "stored": true
    },
    {
      "name": "observed_timestamp_nanos",
      "type": "datetime",
      "fast": false,
      "fast_precision": "seconds",
      "indexed": true,
      "input_formats": [
        "unix_timestamp"
      ],
      "output_format": "unix_timestamp_nanos",
      "stored": true
    },
    {
      "name": "service_name",
      "type": "text",
      "fast": {
        "normalizer": "raw"
      },
      "fieldnorms": false,
      "indexed": true,
      "record": "basic",
      "stored": true,
      "tokenizer": "raw"
    },
    {
      "name": "severity_text",
      "type": "text",
      "fast": {
        "normalizer": "raw"
      },
      "fieldnorms": false,
      "indexed": true,
      "record": "basic",
      "stored": true,
      "tokenizer": "raw"
    },
    {
      "name": "severity_number",
      "type": "u64",
      "coerce": true,
      "fast": true,
      "indexed": true,
      "output_format": "number",
      "stored": true
    },
    {
      "name": "body",
      "type": "json",
      "expand_dots": true,
      "fast": false,
      "indexed": true,
      "record": "basic",
      "stored": true,
      "tokenizer": "default"
    },
    {
      "name": "attributes",
      "type": "json",
      "expand_dots": true,
      "fast": {
        "normalizer": "raw"
      },
      "indexed": true,
      "record": "basic",
      "stored": true,
      "tokenizer": "raw"
    },
    {
      "name": "dropped_attributes_count",
      "type": "u64",
      "coerce": true,
      "fast": false,
      "indexed": false,
      "output_format": "number",
      "stored": true
    },
    {
      "name": "trace_id",
      "type": "bytes",
      "fast": false,
      "indexed": true,
      "input_format": "hex",
      "output_format": "hex",
      "stored": true
    },
    {
      "name": "span_id",
      "type": "bytes",
      "fast": false,
      "indexed": true,
      "input_format": "hex",
      "output_format": "hex",
      "stored": true
    },
    {
      "name": "trace_flags",
      "type": "u64",
      "coerce": true,
      "fast": false,
      "indexed": false,
      "output_format": "number",
      "stored": true
    },
    {
      "name": "resource_attributes",
      "type": "json",
      "expand_dots": true,
      "fast": {
        "normalizer": "raw"
      },
      "indexed": true,
      "record": "basic",
      "stored": true,
      "tokenizer": "raw"
    },
    {
      "name": "resource_dropped_attributes_count",
      "type": "u64",
      "coerce": true,
      "fast": false,
      "indexed": false,
      "output_format": "number",
      "stored": true
    },
    {
      "name": "scope_name",
      "type": "text",
      "fast": false,
      "fieldnorms": false,
      "indexed": false,
      "stored": true
    },
    {
      "name": "scope_version",
      "type": "text",
      "fast": false,
      "fieldnorms": false,
      "indexed": false,
      "stored": true
    },
    {
      "name": "scope_attributes",
      "type": "json",
      "expand_dots": true,
      "fast": false,
      "indexed": false,
      "stored": true
    },
    {
      "name": "scope_dropped_attributes_count",
      "type": "u64",
      "coerce": true,
      "fast": false,
      "indexed": false,
      "output_format": "number",
      "stored": true
    }
  ],
  "tag_fields": [],
  "store_source": false,
  "index_field_presence": false,
  "timestamp_field": "timestamp_nanos",
  "mode": "strict",
  "max_num_partitions": 200,
  "tokenizers": []
}

And one with full dynamic mapping:

{
  "field_mappings": [],
  "tag_fields": [],
  "store_source": false,
  "index_field_presence": false,
  "timestamp_field": null,
  "mode": "dynamic",
  "dynamic_mapping": {
    "indexed": true,
    "tokenizer": "raw",
    "record": "basic",
    "stored": true,
    "expand_dots": true,
    "fast": {
      "normalizer": "raw"
    }
  },
  "max_num_partitions": 200,
  "tokenizers": []
}
@Yunzlez Yunzlez added the bug Something isn't working label Sep 27, 2024
@guilload guilload self-assigned this Oct 9, 2024
@guilload guilload moved this to In progress in Quickwit 0.9 Oct 9, 2024
@guilload
Copy link
Member

Thanks for flagging this bug. I'm working on a fix.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
Status: Done
Development

Successfully merging a pull request may close this issue.

2 participants