[ML] Switch to ECS Grok patterns in text structure finder and categorization #77065

Closed
4 tasks done
droberts195 opened this issue Aug 31, 2021 · 2 comments
Labels
>enhancement :ml Machine learning Team:ML Meta label for the ML team

Comments

@droberts195
Contributor

droberts195 commented Aug 31, 2021

#76885 introduced the possibility of using ECS Grok patterns instead of the legacy ones.

We should switch to using these in the text structure plugin and for the Grok patterns we add to categorization results.

It looks like the latest set of Grok patterns includes some new date formats, certainly for newer versions of Tomcat and Catalina and possibly for others. We should add those to the timestamp format finder too.

  • Add an ecs_compatibility option to the _text_structure/find_structure endpoint, disabled by default, and change that endpoint to use ECS Grok patterns when the option is set to v1. This may also require making the timestamp format finder aware of two different Grok patterns per timestamp format, so that it can use the appropriate one depending on whether ECS Grok patterns are in use (investigation required).
  • Change the UI to set ecs_compatibility to v1 when calling _text_structure/find_structure. (See kibana#138428: [ML] Pass ecs_compatibility=v1 when calling ES find file structure API.)
  • Look through the ECS Grok patterns that were added in #76885 (ECS support for Grok processor) and check whether there are any new timestamp formats that didn't exist in the original Grok patterns. Tomcat and Catalina may have some new ones, and others might too. If any are found, add configs for them to the timestamp format finder in _text_structure/find_structure.
  • Change the Grok pattern creator for _ml/anomaly_detectors/<job_id>/results/categories to always use ECS Grok patterns. This change can be made unconditionally, without keeping a BWC option for the old Grok patterns, as the functionality is experimental. (See #89386: [ML] Get categories endpoint to use ECS Grok patterns.)
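
To make the first task more concrete, here is a rough Python sketch of the idea, not the actual Elasticsearch implementation. It assumes that ecs_compatibility is passed as a query parameter and that its only accepted values are "disabled" and "v1". The TimestampFormat class, its field names, and the two pattern names in the example are hypothetical placeholders, not real Grok pattern names.

```python
from dataclasses import dataclass
from urllib.parse import urlencode

# Assumption: these are the only accepted values, with "disabled" as the default.
VALID_ECS_COMPATIBILITY = ("disabled", "v1")


def _check_mode(ecs_compatibility: str) -> None:
    """Reject values outside the assumed set of modes."""
    if ecs_compatibility not in VALID_ECS_COMPATIBILITY:
        raise ValueError(f"unsupported ecs_compatibility: {ecs_compatibility!r}")


@dataclass(frozen=True)
class TimestampFormat:
    """Hypothetical config entry: one timestamp format, two Grok pattern names."""

    name: str
    legacy_grok_pattern: str
    ecs_grok_pattern: str

    def grok_pattern(self, ecs_compatibility: str = "disabled") -> str:
        # Return the pattern name that belongs to the pattern set in use.
        _check_mode(ecs_compatibility)
        if ecs_compatibility == "v1":
            return self.ecs_grok_pattern
        return self.legacy_grok_pattern


def find_structure_path(ecs_compatibility: str = "disabled") -> str:
    """Build the request path, assuming the option is a query parameter."""
    _check_mode(ecs_compatibility)
    query = urlencode({"ecs_compatibility": ecs_compatibility})
    return "/_text_structure/find_structure?" + query


# Placeholder pattern names for illustration only.
example = TimestampFormat(
    name="example",
    legacy_grok_pattern="LEGACY_EXAMPLE_DATESTAMP",
    ecs_grok_pattern="ECS_EXAMPLE_DATESTAMP",
)
```

With this shape, a caller such as the UI would request find_structure_path("v1"), and the timestamp format finder would call example.grok_pattern("v1") to report the ECS pattern name instead of the legacy one. Keeping both names on a single entry is one possible design; whether the real finder needs it is exactly the open question flagged as "investigation required".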
@droberts195 droberts195 added >enhancement :ml Machine learning labels Aug 31, 2021
@elasticmachine elasticmachine added the Team:ML Meta label for the ML team label Aug 31, 2021
@elasticmachine
Collaborator

Pinging @elastic/ml-core (Team:ML)

edsavage added a commit to edsavage/elasticsearch that referenced this issue Aug 1, 2022
…ure/find_structure endpoint

Also add support for new CATALINA/TOMCAT timestamp formats used by ECS Grok patterns

Relates elastic#77065
pull bot pushed a commit to NOUIY/elasticsearch that referenced this issue Aug 4, 2022
…elastic#88982)

Also add support for new CATALINA/TOMCAT timestamp formats used by ECS Grok patterns

Relates elastic#77065

Co-authored-by: David Roberts <[email protected]>
@droberts195
Contributor Author

All tasks complete now - closing

3 participants