Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Workaround for Beats issue with default_field growing too big #687

Merged
merged 9 commits into from
Dec 23, 2019

Conversation

adriansr
Copy link
Contributor

@adriansr adriansr commented Dec 10, 2019

This updates the Beats generator to undestand the default_field option and add it to the generated yaml, as well as update the fields added by 1.3.0 to include the default_field tag.

This change was discussed in elastic/beats#14844

Copy link
Member

@ruflin ruflin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 on this addition. This is a "breaking" change to users which used this for the template but not for the fields itself. So I think I agree that it is only an addition.

@webmat I didn't check the fields that were marked with false, you might have an opinion here. Otherwise good to go.

Copy link
Contributor

@webmat webmat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with this toggle being driven from the ECS side for now. I'd like to have a few adjustments made before merging, however.

  • Since this is Beats specific, I would like the attribute to reflect this. Perhaps simply beats_default_fields, or if there's a possibility of requiring more Beats-specific toggles in the future, perhaps simply nest this under beats:. I'm fine with both.
  • Please document the meaning of this new attribute in schemas/README.md

Also kind of a higher level question. If I understand this correctly, this is meant to limit the amount of fields that are added to the beat templates' default_field setting? If that's the case, wouldn't you want to adjust this per Beats, instead of overall? Concrete example: default_fields:false on TLS seems detrimental for Packetbeat and Filebeat (with Suricata and Zeek modules), but seems to make sense for Metricbeat. Same goes for default_fields:false on process, which seems detrimental for Auditbeat in particular.

CHANGELOG.next.md Outdated Show resolved Hide resolved
@webmat
Copy link
Contributor

webmat commented Dec 10, 2019

Note that I'm happy to adjust the generators a bit more around this as well. For example, generating 1 output file per Beat instead of a global one.

adriansr and others added 5 commits December 13, 2019 14:42
This updates the Beats generator to undestand the `default_field`
option and add it to the generated yaml, as well as update the fields
added by 1.3.0 to include the `default_field` tag.
Co-Authored-By: Mathieu Martin <[email protected]>
This is so that Beats' default_fields don't go above 1000 field limit.
@adriansr adriansr force-pushed the beats_default_fields branch from 4c514ac to 2bbebc0 Compare December 13, 2019 13:47
@adriansr adriansr requested review from ruflin and webmat December 13, 2019 14:27
@adriansr
Copy link
Contributor Author

@webmat I think I've addressed all your comments. Now it's using beats.default_field and the documentation can be found in schemas/README.md.

Your higher level suggestions make sense but those would require a bigger refactor in how Beats integrates fields from ECS.

@webmat
Copy link
Contributor

webmat commented Dec 16, 2019

Yeah I understand why this is easier for now.

I'm not clear on the semantics, however. This PR adds beats.default_fields: false on 25 fields or so. Is the expectation that all further fields added to ECS, after 1.2 always have beats.default_fields: false?

@andrewkroh
Copy link
Member

andrewkroh commented Dec 19, 2019

With the assumption that this is more or a temporary fix, I would like do this in a way that the files under schema/ never need to have changes. This way ECS contributors don't need to be burdened with understanding the effect of setting default_field or even what Beats are.

This could be achieved by having a snapshot of the current fields we allow for Beats to go into default_field. Then in the generator for the Beat's fields.yml we "whitelist" those fields and for any other fields not in that list we give them default_field: false.

@webmat
Copy link
Contributor

webmat commented Dec 20, 2019

Thanks @andrewkroh, picking this up now.

Are you ok with me wording this (e.g. changelog, schema/readme, etc) specifically as a temporary workaround for Beats 7.6?

@webmat
Copy link
Contributor

webmat commented Dec 20, 2019

Actually I'd put that wording in only for changelog. Because I agree with your assessment that this shouldn't burden ECS contributors, so I'll remove the readme mention.

Copy link
Contributor

@webmat webmat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After discussion with @andrewkroh, here's my thinking on this fix.

This is a temporary workaround. Leaving this as it is now would effectively mean that any field added after ECS 1.2 wouldn't be searchable from the search box until the end of the 7.x line. This is far from ideal. As such, Andrew reopened elastic/beats#14262 for further discussion. I'll chime in with ideas over there.

This situation is Beats-specific, and has no bearing on anyone else using ECS (Elastic Endpoint, partners, users building custom pipelines, etc.) So I've made this entirely self-contained in the generator for the Beats artifact. I want to avoid having the burden to always remember to add "beats.default_fields: false" explicitly for other field additions. This would likely have led to mistakes.

So this implementation is based on the YAML file scripts/generators/beats_default_fields_whitelist.yml. At this time it's populated only with ECS 1.2 fields, as was the original intention of this PR. You'll note that this also gives us the flexibility to strategically add a few more fields to the list, if needed.

I'll wait for a review by @andrewkroh or @ruflin before merging. But I'm good with this implementation now.

Note: the changes to "host" and "rule" definitions are simply trailing whitespaces getting nixed.

@webmat webmat changed the title Add default_field option for Beats generator Workaround for Beats issue with default_field growing too big Dec 20, 2019
Copy link
Member

@andrewkroh andrewkroh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@webmat webmat merged commit 59ff574 into elastic:master Dec 23, 2019
webmat pushed a commit to webmat/ecs that referenced this pull request Dec 23, 2019
…c#687)

This is so that Beats' default_fields don't go above 1024 field limit. See also elastic/beats#14262
webmat pushed a commit to webmat/ecs that referenced this pull request Dec 23, 2019
…c#687)

This is so that Beats' default_fields don't go above 1024 field limit. See also elastic/beats#14262
webmat pushed a commit that referenced this pull request Dec 23, 2019
…709)

This is so that Beats' default_fields don't go above 1024 field limit. See also elastic/beats#14262
webmat pushed a commit that referenced this pull request Dec 23, 2019
…710)

This is so that Beats' default_fields don't go above 1024 field limit. See also elastic/beats#14262
adriansr added a commit to elastic/beats that referenced this pull request Jan 10, 2020
- Update vendored elastic/ecs and fields.ecs.yml from ECS 1.4 branch

Includes elastic/ecs#717 and  elastic/ecs#687 not in released v1.4.0

- Remove field `process.exit_code` from Metricbeat, now in ECS.
adriansr added a commit to adriansr/beats that referenced this pull request Jan 13, 2020
- Update vendored elastic/ecs and fields.ecs.yml from ECS 1.4 branch

Includes elastic/ecs#717 and  elastic/ecs#687 not in released v1.4.0

- Remove field `process.exit_code` from Metricbeat, now in ECS.

(cherry picked from commit cab56a3)
adriansr added a commit to elastic/beats that referenced this pull request Jan 14, 2020
- Update vendored elastic/ecs and fields.ecs.yml from ECS 1.4 branch

Includes elastic/ecs#717 and  elastic/ecs#687 not in released v1.4.0

- Remove field `process.exit_code` from Metricbeat, now in ECS.

(cherry picked from commit cab56a3)
dcode pushed a commit to dcode/ecs that referenced this pull request Apr 15, 2020
…c#687)

This is so that Beats' default_fields don't go above 1024 field limit. See also elastic/beats#14262
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants