Meta fields are not handled consistently in different processors #25425

Closed
jsoriano opened this issue Apr 29, 2021 · 2 comments · Fixed by #30183
Labels: good first issue, Team:Elastic-Agent, Team:Elastic-Agent-Data-Plane, v8.1.0

Comments

@jsoriano (Member)

TL;DR: All processors should use event.PutValue when setting fields, so that they behave consistently.

Some fields set in events are intended to provide information to the pipelines, for example @timestamp to set the timestamp of the event or @metadata._id to set its id. These fields are not handled consistently across processors.

For example when the id is set using the fingerprint processor, it works as expected:

  - fingerprint:
      fields: ["id"]
      target_field: "@metadata._id"

It doesn't set @metadata._id in the fields of the event, but it uses it as the id of the generated document.

On the other hand, if add_fields is used to set one of these special fields, the field is set in the event and not handled by the pipeline:

  - add_fields:
      target: "@metadata"
      fields:
        op_type: "index"

In this case the index bulk operation should be used, but it isn't, and @metadata.op_type is set as a field in the event.

The cause of this difference is that some processors use the event.PutValue method, which handles these cases, while others use lower-level mechanisms such as event.Fields.Put, or assign directly with event.Fields[key] = value.
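The difference can be sketched in Go. This is a minimal illustration, not the libbeat implementation: the Event type below is a hypothetical, flattened stand-in for beat.Event, and its PutValue only approximates the routing described above (special keys go to dedicated slots, everything else goes to the regular fields).

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// Event is a simplified, hypothetical stand-in for libbeat's beat.Event.
// The real type uses nested maps; flat maps keep this sketch short.
type Event struct {
	Timestamp time.Time
	Meta      map[string]interface{}
	Fields    map[string]interface{}
}

const metaPrefix = "@metadata."

// PutValue routes special keys to their dedicated slots and stores
// everything else as a regular field.
func (e *Event) PutValue(key string, v interface{}) error {
	switch {
	case key == "@timestamp":
		ts, ok := v.(time.Time)
		if !ok {
			return fmt.Errorf("@timestamp must be a time.Time, got %T", v)
		}
		e.Timestamp = ts
	case strings.HasPrefix(key, metaPrefix):
		if e.Meta == nil {
			e.Meta = map[string]interface{}{}
		}
		e.Meta[strings.TrimPrefix(key, metaPrefix)] = v
	default:
		if e.Fields == nil {
			e.Fields = map[string]interface{}{}
		}
		e.Fields[key] = v
	}
	return nil
}

func main() {
	// Low-level write: the value ends up as an ordinary document field.
	raw := &Event{Fields: map[string]interface{}{}}
	raw.Fields["@metadata.op_type"] = "index"

	// Routed write: the value ends up in the event metadata.
	routed := &Event{}
	if err := routed.PutValue("@metadata.op_type", "index"); err != nil {
		panic(err)
	}

	fmt.Println("raw:   ", raw.Meta, raw.Fields)
	fmt.Println("routed:", routed.Meta, routed.Fields)
}
```

A processor that writes through the map directly bypasses this routing entirely, which is why the same key can end up either in the metadata or in the document depending on which processor set it.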

All processors should use event.PutValue when setting fields, so that they behave consistently.

For confirmed bugs, please report:

  • Version: All versions up to at least 7.12.
  • Steps to Reproduce: The following configuration should produce index bulk requests, so that documents with the same fingerprint are updated. Instead, create bulk requests are used and the documents are not updated:
processors:
  - fingerprint:
      fields: ["id"]
      target_field: "@metadata._id"
  - add_fields:
      target: "@metadata"
      fields:
        op_type: "index"
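To make the symptom concrete, here is a small Go sketch of how an output could choose the bulk action from event metadata. It is an assumption-laden model, not the Elasticsearch output's code: the Event type and bulkAction function are hypothetical, and the rule "use create when only an id is set" is inferred from the behaviour described in this issue.

```go
package main

import "fmt"

// Event is a hypothetical, simplified event: Meta is read by the output,
// Fields becomes the body of the indexed document.
type Event struct {
	Meta   map[string]interface{}
	Fields map[string]interface{}
}

// bulkAction models the selection of the bulk operation (assumed rules):
// an explicit op_type in the metadata wins; otherwise an event with an
// explicit id is sent as "create"; otherwise it is sent as "index".
func bulkAction(e Event) string {
	if op, ok := e.Meta["op_type"].(string); ok && op != "" {
		return op
	}
	if id, ok := e.Meta["_id"].(string); ok && id != "" {
		return "create"
	}
	return "index"
}

func main() {
	// fingerprint set the id in Meta, but add_fields wrote op_type into Fields.
	viaAddFields := Event{
		Meta: map[string]interface{}{"_id": "abc123"},
		Fields: map[string]interface{}{
			"@metadata": map[string]interface{}{"op_type": "index"},
		},
	}

	// Both values were written through a PutValue-style setter.
	viaPutValue := Event{
		Meta: map[string]interface{}{"_id": "abc123", "op_type": "index"},
	}

	fmt.Println(bulkAction(viaAddFields), bulkAction(viaPutValue))
}
```

Under these assumptions the first event resolves to create and the second to index, which matches the report: op_type stored in Fields is invisible to the output and is shipped as part of the document instead.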

As a workaround, the expected result can be achieved by setting @metadata.op_type with the convert processor, which uses PutValue:

processors:
  - fingerprint:
      fields: ["id"]
      target_field: "@metadata._id"
  - add_fields:
      target: "_tmp"
      fields:
        op_type: "index"
  - convert:
      fields:
      - from: "_tmp.op_type"
        to: "@metadata.op_type"
        type: "string"
  - drop_fields.fields:
    - "_tmp"

Related:

@jsoriano added the Team:Elastic-Agent label on Apr 29, 2021

@elasticmachine (Collaborator)

Pinging @elastic/agent (Team:Agent)

@jsoriano added the Team:Elastic-Agent-Data-Plane label on Oct 29, 2021

@elasticmachine (Collaborator)

Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)
