Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add replace_fields config option in add_host_metadata for replacing host fields #20490

Merged
merged 12 commits into from
Aug 14, 2020
Merged

Add replace_fields config option in add_host_metadata for replacing host fields #20490

merged 12 commits into from
Aug 14, 2020

Conversation

kaiyan-sheng
Copy link
Contributor

@kaiyan-sheng kaiyan-sheng commented Aug 7, 2020

What does this PR do?

This PR adds replace_fields config option in add_host_metadata for replacing host fields. By default, replace_fields equals to true to make sure this PR is not a breaking change.

Why is it important?

We are adding common host fields into several modules in Metricbeat, such as host.id and host.name. But add_host_metadata processor would overwrite host.id field without this PR.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

How to test this PR locally

This PR should work for any module/metricset. I'm using aws module as an example here:

  1. Make sure Add host inventory metrics to ec2 metricset #20171 is in the code and then enable aws module by
    ./metricbeat modules enable aws
  2. Change aws config modules.d/aws.yml to:
- module: aws
  period: 5m
  credential_profile_name: elastic-beats
  metricsets:
    - ec2
  1. Make sure in metricbeat.yml, add_host_metadata processor is enabled and replace_fields is set to false:
processors:
  - add_host_metadata:
      replace_fields: false
  1. Start Metricbeat with ./metricbeat -e
  2. Check metrics sent to ES from Kibana discover. Metrics should have host.* fields only from ec2 metricset but not any fields from add_host_metadata such as host.hostname, host.architecture, host.os.* etc.

Related issues

#20464

@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Aug 7, 2020
@kaiyan-sheng kaiyan-sheng self-assigned this Aug 7, 2020
@kaiyan-sheng kaiyan-sheng added in progress Pull request is currently in progress. Team:Platforms Label for the Integrations - Platforms team labels Aug 7, 2020
@botelastic botelastic bot removed the needs_team Indicates that the issue/PR needs a Team:* label label Aug 7, 2020
@elasticmachine
Copy link
Collaborator

elasticmachine commented Aug 7, 2020

💚 Build Succeeded

Pipeline View Test View Changes Artifacts preview

Expand to view the summary

Build stats

  • Build Cause: [Pull request #20490 updated]

  • Start Time: 2020-08-14T16:50:49.857+0000

  • Duration: 65 min 24 sec

Test stats 🧪

Test Results
Failed 0
Passed 18002
Skipped 1827
Total 19829

@kaiyan-sheng kaiyan-sheng marked this pull request as ready for review August 10, 2020 17:40
@elasticmachine
Copy link
Collaborator

Pinging @elastic/integrations-platforms (Team:Platforms)

@kaiyan-sheng kaiyan-sheng added needs_backport PR is waiting to be backported to other branches. review test-plan Add this PR to be manual test plan and removed in progress Pull request is currently in progress. labels Aug 10, 2020
@kaiyan-sheng kaiyan-sheng requested a review from ChrsMark August 11, 2020 18:12
@kaiyan-sheng
Copy link
Contributor Author

@jsoriano I consider this is a breaking change. Is it ok to add this in next release?

@jsoriano
Copy link
Member

@jsoriano I consider this is a breaking change. Is it ok to add this in next release?

@kaiyan-sheng I agree that this is a breaking change, but it seems needed for the inventory efforts, and I think that for the cases we will be breaking it will be more a fix than a breaking change 🙂

Apart of "breaking" the case of add_host_metadata overwriting fields added by modules, is there any other case that you think we would be affecting with this change?

Something else we can do to mitigate possible problems is to add a replace_fields option to add_host_metadata, that defaults to false, and can be set to true to have the previous behaviour.

We can also add, in the note you are adding in add_host_metadata documentation, a line about using add_observer_metadata instead of add_host_metadata if the beat is being used to monitor external systems.

@kaiyan-sheng
Copy link
Contributor Author

Apart of "breaking" the case of add_host_metadata overwriting fields added by modules, is there any other case that you think we would be affecting with this change?

Hmm not what I can think of. I will post it here if I see any other case.

Something else we can do to mitigate possible problems is to add a replace_fields option to add_host_metadata, that defaults to false, and can be set to true to have the previous behaviour.

Good point! I added replace_host_fields config parameter for this and set default as false. If we want to keep this change not a breaking change, we can set default as true. WDYT? @jsoriano

My thought is, we can keep it default as true right now to match the previous/original behavior and then switch the default to false in 8.0.

We can also add, in the note you are adding in add_host_metadata documentation, a line about using add_observer_metadata instead of add_host_metadata if the beat is being used to monitor external systems.

Done. Thank you!

@jsoriano
Copy link
Member

Good point! I added replace_host_fields config parameter for this and set default as false. If we want to keep this change not a breaking change, we can set default as true. WDYT? @jsoriano

My thought is, we can keep it default as true right now to match the previous/original behavior and then switch the default to false in 8.0.

Ok, let's avoid making it a breaking change by now, we can still reconsider this before 7.10. I was convinced that it was better to make the breaking change, and I still think it would be for Metricbeat now that we are adding more logic in modules for the host inventory. But then I have thought on Filebeat and I am not so sure. There are probably SIEM use cases where log parsing is adding host information, but additional network information is still wanted. And also, we are adding this processor by default in the Kubernetes manifests for Filebeat.

Btw (naming is complicated), I find having host in replace_host_fields a bit redundant. Wdyt about naming it replace_fields?

@kaiyan-sheng
Copy link
Contributor Author

Ok, let's avoid making it a breaking change by now, we can still reconsider this before 7.10. I was convinced that it was better to make the breaking change, and I still think it would be for Metricbeat now that we are adding more logic in modules for the host inventory. But then I have thought on Filebeat and I am not so sure. There are probably SIEM use cases where log parsing is adding host information, but additional network information is still wanted. And also, we are adding this processor by default in the Kubernetes manifests for Filebeat.

Done. replace_fields config defaults to true for now to keep the same behavior.
@andrewkroh Do you have concerns about this? TIA!

Btw (naming is complicated), I find having host in replace_host_fields a bit redundant. Wdyt about naming it replace_fields?

Done, yeah I was thinking replace_host_fields is more clear but it is a bit redundant.

Copy link
Member

@jsoriano jsoriano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, added some nitpicking comments.

We can decide to make it a breaking change later.

Copy link
Member

@jsoriano jsoriano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generator errors in CI have been fixed in master by #20586. The other errors are flaky ones.

@kaiyan-sheng kaiyan-sheng changed the title Change add_host_metadata to not overwrite if host fields already exist Add replace_fields config option in add_host_metadata for replacing host fields Aug 14, 2020
@kaiyan-sheng
Copy link
Contributor Author

Merging this PR. If we decide to change replace_fields default to false, we can create a separate PR later.

@kaiyan-sheng kaiyan-sheng merged commit 7bb31e6 into elastic:master Aug 14, 2020
@kaiyan-sheng kaiyan-sheng deleted the host_name branch August 14, 2020 18:19
@kaiyan-sheng kaiyan-sheng added v7.10.0 and removed needs_backport PR is waiting to be backported to other branches. labels Aug 14, 2020
kaiyan-sheng added a commit that referenced this pull request Aug 17, 2020
…ost fields (#20490) (#20617)

* add replace_host_fields config param

(cherry picked from commit 7bb31e6)
return false
}

hostFieldsMap := hostFields.(common.MapStr)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will panic if the host is not a common.MapStr. Since Beats like Filebeat can handle arbitrary data it should be defensive (for example see

func tryToMapStr(v interface{}) (common.MapStr, bool) {
switch m := v.(type) {
case common.MapStr:
return m, true
case map[string]interface{}:
return common.MapStr(m), true
default:
return nil, false
}
}
).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @andrewkroh for pointing this out! I will create a separate PR to fix this issue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the PR: #20791

@andresrc andresrc added the test-plan-added This PR has been added to the test plan label Oct 3, 2020
melchiormoulin pushed a commit to melchiormoulin/beats that referenced this pull request Oct 14, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
review Team:Platforms Label for the Integrations - Platforms team test-plan Add this PR to be manual test plan test-plan-added This PR has been added to the test plan v7.10.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants