Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ML] Adding v3 modules for Security_Linux and Security_Windows and Deprecating v1 + v2 #131166

Merged
merged 39 commits into from
May 18, 2022

Conversation

bfilar
Copy link
Contributor

@bfilar bfilar commented Apr 28, 2022

Summary

Files/Job Artifacts:


  • 2 updated manifest .json files - for both linux and windows

  • Updated/new ML Job configurations for 26 jobs - each with associated datafeed configuration files:

    • security_linux: 14 jobs

      • v3_linux_anomalous_network_activity
      • v3_linux_anomalous_network_port_activity_ecs
      • v3_linux_anomalous_process_all_hosts_ecs
      • v3_linux_anomalous_user_name_ecs
      • v3_linux_network_configuration_discovery
      • v3_linux_network_connection_discovery
      • v3_linux_rare_metadata_process
      • v3_linux_rare_metadata_user
      • v3_linux_rare_sudo_user
      • v3_linux_rare_user_compiler
      • v3_linux_system_information_discovery
      • v3_linux_system_process_discovery
      • v3_linux_system_user_discovery
      • v3_rare_process_by_host_linux_ecs
    • security_windows: 12 jobs

      • v3_rare_process_by_host_windows_ecs
      • v3_windows_anomalous_network_activity_ecs
      • v3_windows_anomalous_path_activity_ecs
      • v3_windows_anomalous_process_all_hosts_ecs
      • v3_windows_anomalous_process_creation
      • v3_windows_anomalous_script
      • v3_windows_anomalous_service
      • v3_windows_anomalous_user_name_ecs
      • v3_windows_rare_metadata_process
      • v3_windows_rare_metadata_user
      • v3_windows_rare_user_runas_event
      • v3_windows_rare_user_type10_remote_login

Tests:


Individual job test tracking stats available here: https://docs.google.com/spreadsheets/d/1JOUIVsitaMdEdhM3WT2Eag4ELI-rI2Jec7bXildJsdQ/edit#gid=0

@randomuserid to also post more updates as needed to this issue + regarding tests, thanks

@bfilar bfilar requested review from a team as code owners April 28, 2022 16:38
@bfilar
Copy link
Contributor Author

bfilar commented Apr 28, 2022

@peteharverson @pheyos, the SecML team will coordinate w/ @spong to identify the engineering stakeholders for failing tests. We will seek to resolve these in the coming weeks.

@elasticmachine
Copy link
Contributor

Pinging @elastic/ml-ui (:ml)

@spong
Copy link
Member

spong commented Apr 28, 2022

@peteharverson @pheyos, the SecML team will coordinate w/ @spong to identify the engineering stakeholders for failing tests. We will seek to resolve these in the coming weeks.

Sounds good! We have this issue for tracking the rework of these tests in support of the v3 modules: #128318

Ideally we'd like to coordinate that rework as part of this PR to ensure coverage, but if they need to happen as a follow-up as to not block you folks that's fine as well too. So feel free to skip any Security Solution tests (maybe add a TODO: PR 131166 comment above them) that are giving you trouble for the time being.

@randomuserid
Copy link
Contributor

@elasticmachine merge upstream

@peteharverson peteharverson changed the title [DRAFT][ML] Adding v3 modules for Security_Linux and Security_Windows and Deprecating v1 + v2 [ML] Adding v3 modules for Security_Linux and Security_Windows and Deprecating v1 + v2 May 17, 2022
the prefix was in the wrong place
Copy link
Contributor

@peteharverson peteharverson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested job creation works inside the ML job wizard. LGTM.

@kibana-ci
Copy link
Collaborator

💚 Build Succeeded

Metrics [docs]

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

id before after diff
securitySolution 5.0MB 5.0MB -74.0B

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

cc @randomuserid @bfilar

Copy link
Contributor

@lcawl lcawl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Text strings LGTM

Copy link
Member

@spong spong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checked out, tested locally, and verified functionality of the new v3 modules within the Security Solution app. Was able to install and start a job from the ML Job Settings UI, and then created a detection rule referencing the job without issue.

Note I: I did see one warning in the Kibana console after initiating the install, which hits the /api/ml/modules/setup/security_windows_v3 API with this payload:

Payload

image

Kibana Console Warning

[2022-05-17T14:50:44.805-06:00][WARN ][plugins.ml] Data recognizer could not estimate model memory limit {"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"[Overall] cardinality estimate required for [by_field_name] [winlog.event_data.ServiceName] but not supplied"}],"type":"illegal_argument_exception","reason":"[Overall] cardinality estimate required for [by_field_name] [winlog.event_data.ServiceName] but not supplied"},"status":400}

@peteharverson, is there something we need to update on our end with this request, or is this expected?


Note II: With regards to the skipped tests, we have this issue for tracking #128318 and @banderror will be following up with a PR to unskip these tests and move them to the new v3 modules.

All that said, LGTM from the Security Solution side of the house! 😀 Excited for the consolidation here -- thanks @bfilar & @randomuserid!

Copy link
Contributor

@yctercero yctercero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just looking at the exceptions cypress test (code owners) - if we can track that skip that'd be great.

Copy link
Contributor

@dhurley14 dhurley14 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Looks @elastic/security-solution-platform only needed to review a skipped test.

@peteharverson
Copy link
Contributor

is there something we need to update on our end with this request, or is this expected?

[2022-05-17T14:50:44.805-06:00][WARN ][plugins.ml] Data recognizer could not estimate model memory limit {"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"[Overall] cardinality estimate required for [by_field_name] [winlog.event_data.ServiceName] but not supplied"}],"type":"illegal_argument_exception","reason":"[Overall] cardinality estimate required for [by_field_name] [winlog.event_data.ServiceName] but not supplied"},"status":400}

@spong in the data set you are using for your tests, are there any docs with winlog.event_data.ServiceName used as the by_field_name for the v3_windows_anomalous_service job? The checks used to estimate the model memory limit for the job will not succeed if they cannot obtain an estimate for the cardinality of some of the fields used in the job. I get the same error using my test data set, where none of the docs contain winlog.event_data.ServiceName.

@bfilar bfilar merged commit f85c39e into main May 18, 2022
@bfilar bfilar deleted the bfilar.ml-refactor-2 branch May 18, 2022 14:33
@kibanamachine kibanamachine added the backport:skip This commit does not require backporting label May 18, 2022
@spong
Copy link
Member

spong commented May 18, 2022

@spong in the data set you are using for your tests, are there any docs with winlog.event_data.ServiceName used as the by_field_name for the v3_windows_anomalous_service job? The checks used to estimate the model memory limit for the job will not succeed if they cannot obtain an estimate for the cardinality of some of the fields used in the job. I get the same error using my test data set, where none of the docs contain winlog.event_data.ServiceName.

Ahh, that's indeed the case, my sample data does not include those fields either. Thanks for confirming Pete! 🙂

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants