Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Finalize user_agent migration to ECS #10441

Merged
merged 7 commits into from
Feb 4, 2019

Conversation

webmat
Copy link
Contributor

@webmat webmat commented Jan 30, 2019

This PR is built on top of #10472 and should be merged after it.

In 7.0, the output of ingest node "user_agent" processor will match ECS by default (elastic/elasticsearch#37984, elastic/elasticsearch#38094). This means we can remove a lot of temporary field renames that were previously needed, to make its output follow ECS.

This PR will fail CI until the ES Docker image used in CI includes the changes from both issues linked above. Both of them are merged to master already, so we're just waiting for the next successful artifact build for ES.

Filebeat datasets affected:

  • apache.access
  • iis.access
  • nginx.access
  • suricata.eve (x-pack)
  • traefik.access
  • zeek.http (x-pack)

This PR also fixes a few mistakes that were made in ecs-migration.yml.

Caveat

This PR introduces multiple changes / removals of the @timestamp field in the test files because of an unrelated change in how Elasticsearch parses timestamps. This will be fixed in other PRs.

TODO

@webmat webmat requested review from a team as code owners January 30, 2019 20:19
@webmat webmat self-assigned this Jan 30, 2019
@webmat webmat added in progress Pull request is currently in progress. module Filebeat Filebeat ecs labels Jan 30, 2019
@ruflin ruflin mentioned this pull request Jan 30, 2019
@webmat webmat changed the title WIP Clean up user_agent field renames, now that its output matches ECS WIP Remove user_agent field renames Jan 30, 2019
@webmat webmat force-pushed the ecs-user-agent-cleanup branch from f0c91d4 to 78b36d4 Compare January 31, 2019 13:39
"ignore_failure": true
}
}, {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice to see that some much code can be removed.

@webmat
Copy link
Contributor Author

webmat commented Jan 31, 2019

I'm trying to make this PR work for the user_agent without this fix elastic/elasticsearch#38094.

Not sure if CI incorporates yesterday's change, which causes a mapping error. If it does, I want this PR to be merged to resolve the issue, and I'll do a follow-up PR once the final fix is in, to remove the last rename.

@webmat
Copy link
Contributor Author

webmat commented Jan 31, 2019

The fun meter raises by another level: I just noticed that the ECS definition of user_agent.device.name was imported, but then was later overridden by a "temporary" field definition, defining user_agent.device as keyword...

@webmat webmat changed the title WIP Remove user_agent field renames WIP Finalize user_agent migration to ECS Jan 31, 2019
@webmat webmat force-pushed the ecs-user-agent-cleanup branch from ab773d8 to 7942a0b Compare January 31, 2019 21:19
@webmat webmat requested review from a team as code owners January 31, 2019 21:19
@webmat webmat added review in progress Pull request is currently in progress. and removed in progress Pull request is currently in progress. review labels Feb 1, 2019
@webmat webmat force-pushed the ecs-user-agent-cleanup branch from 7942a0b to 3c0af2d Compare February 4, 2019 14:10
@webmat webmat requested a review from a team as a code owner February 4, 2019 14:10
@webmat webmat mentioned this pull request Feb 4, 2019
@webmat webmat changed the title WIP Finalize user_agent migration to ECS Finalize user_agent migration to ECS Feb 4, 2019
@webmat webmat added review and removed in progress Pull request is currently in progress. labels Feb 4, 2019
Copy link
Collaborator

@ruflin ruflin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Should we add a changelog entry for this?

For the timestamp changes I would suggest we ignore it for now to make sure CI is green as we already work on fixing it. @ycombinator objections?

"user_agent.name": "IE",
"user_agent.original": "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.3; WOW64; Trident/7.0; .NET4.0E; .NET4.0C; .NET CLR 3.5.30729; .NET CLR[ 2.0.50727](tel: 2050727); .NET CLR 3.0.30729)",
"user_agent.os.full_name": "Windows 8.1",
"user_agent.os.name": "Windows 8.1"
"user_agent.os.name": "Windows 8.1",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this expected? I see below this goes under full?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ES bug. Will file this soon.

@ruflin
Copy link
Collaborator

ruflin commented Feb 4, 2019

Canceled Windows jobs can be ignored.

@ycombinator
Copy link
Contributor

For the timestamp changes I would suggest we ignore it for now to make sure CI is green as we already work on fixing it. @ycombinator objections?

No objections from me. I'm +1 to deal with those changes in separate PR(s).

@webmat webmat merged commit 0fd59ee into elastic:master Feb 4, 2019
@webmat webmat deleted the ecs-user-agent-cleanup branch February 4, 2019 18:34
@ruflin
Copy link
Collaborator

ruflin commented Feb 4, 2019

@webmat @ycombinator Let's make sure that any follow up PR to fix the timestamp changes / fixes the timestamps we have remove / changed here.

@webmat
Copy link
Contributor Author

webmat commented Feb 4, 2019

It's mostly on the ES side, though.

We may proactively be able to circumvent some of parsing problems in our pipelines. But since they're regressions in ts parsing, they should also be fixed in ES.

@ycombinator
Copy link
Contributor

We may proactively be able to circumvent some of parsing problems in our pipelines. But since they're regressions in ts parsing, they should also be fixed in ES.

++ to trying to fix on the ES side. I've been trying to adapt our pipelines in a way that would work with . or , as the delimiter between seconds and milliseconds as well as with timestamps that do or do not contain time zone offsets in them. I haven't found a way to make all 4 of those cases work as expected yet. I'll put up a PR to demonstrate the challenges.

@webmat
Copy link
Contributor Author

webmat commented Feb 4, 2019

@ycombinator Yes, even if never merged, would make it easy for me to 👀 .

Otherwise I doubt I'll get to that in time :-\

@ycombinator
Copy link
Contributor

@webmat #10543

jsoriano added a commit to jsoriano/beats that referenced this pull request Feb 11, 2019
There was a regression in ES when parsing timezones and we adapted the
expected results to the incorrect values in elastic#10441. Revert these changes
when the fixed ES images are ready.
jsoriano added a commit that referenced this pull request Feb 11, 2019
There was a regression in ES when parsing timezones and we adapted the
expected results to the incorrect values in #10441. Revert these changes
when the fixed ES images are ready.
jsoriano added a commit to jsoriano/beats that referenced this pull request Feb 11, 2019
There was a regression in ES when parsing timezones and we adapted the
expected results to the incorrect values in elastic#10441. Revert these changes
when the fixed ES images are ready.

(cherry picked from commit f9be490)
jsoriano added a commit that referenced this pull request Feb 11, 2019
…0678)

There was a regression in ES when parsing timezones and we adapted the
expected results to the incorrect values in #10441. Revert these changes
when the fixed ES images are ready.

(cherry picked from commit f9be490)
DStape pushed a commit to DStape/beats that referenced this pull request Aug 20, 2019
This adjusts to the updated user_agent parser, which is now in line with ECS schema.

- major/minor/patch fields have been removed from the field definitions
- multiple field renames in Filebeat module ingest node pipelines are now removed
- remove Fb module test skipping which was added due to the latest ES build
- delete the `user_agent` field nested in the Suricata EVE events, as it's now an alias
- Caveat to this PR: the latest ES build has issues with timestamp parsing, so `@timestamp` is sometimes missing, or ignoring the timezone. This will be resolved in ES, and potentially a separate Beat PR.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants