[ML] Store anomaly detection config file and input on demand #110582

Merged

@valeriy42 valeriy42 commented Jul 8, 2024

DO NOT MERGE THIS INTO main!

This PR enables storing the data and configuration of an anomaly detection job in files, so that the job can be reproduced with the autodetect process alone, without Elasticsearch.

To enable storage, set the keep_job_data parameter in the custom_settings section of the job config:

  "custom_settings": {
    "keep_job_data": "true"
  }
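
As a rough illustration of the setting above, here is a minimal Python sketch that adds keep_job_data to an existing job config before it is submitted. The helper name `with_keep_job_data` and the example job body are hypothetical and are not part of this PR; only the `custom_settings` / `keep_job_data` pair and its string value "true" come from the description.

```python
import copy
import json


def with_keep_job_data(job_config: dict) -> dict:
    """Return a copy of job_config with keep_job_data enabled in custom_settings."""
    updated = copy.deepcopy(job_config)
    # Preserve any custom settings the job already carries.
    custom = updated.setdefault("custom_settings", {})
    # The PR description passes the value as the string "true".
    custom["keep_job_data"] = "true"
    return updated


# Illustrative job config; the field values are assumptions, not taken from the PR.
example_job = {
    "analysis_config": {
        "bucket_span": "15m",
        "detectors": [{"function": "count"}],
    },
    "data_description": {"time_field": "timestamp"},
}

if __name__ == "__main__":
    print(json.dumps(with_keep_job_data(example_job), indent=2))
```

The helper copies the config rather than mutating it, so the original job definition can still be used unchanged for a normal run.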

Now start the job and watch for a log message containing the autodetect command, similar to the following:

[2024-06-19T16:03:38,248][INFO ][o.e.x.m.j.p.a.NativeAutodetectProcessFactory] [Elastic-MBP.fritz.box] Autodetect process command: [./autodetect, --lengthEncodedInput, --maxAnomalyRecords=500, --validElasticLicenseKeyConfirmed=true, --config=/var/folders/_j/gcj6z4b950bdzpw7_fzrmpf40000gn/T/elasticsearch-12668972032551307591/config10764979302390040373.json, --logPipe=/var/folders/_j/gcj6z4b950bdzpw7_fzrmpf40000gn/T/elasticsearch-12668972032551307591/autodetect_test-2_log_45530, --input=/var/folders/_j/gcj6z4b950bdzpw7_fzrmpf40000gn/T/elasticsearch-12668972032551307591/autodetect_test-2_input_45530, --inputIsPipe, --output=/var/folders/_j/gcj6z4b950bdzpw7_fzrmpf40000gn/T/elasticsearch-12668972032551307591/autodetect_test-2_output_45530, --outputIsPipe, --persist=/var/folders/_j/gcj6z4b950bdzpw7_fzrmpf40000gn/T/elasticsearch-12668972032551307591/autodetect_test-2_persist_45530, --persistIsPipe, --namedPipeConnectTimeout=10]

and

[2024-06-19T15:29:08,640][INFO ][o.e.x.m.p.w.LengthEncodedWriter]  Opening file: /var/folders/_j/gcj6z4b950bdzpw7_fzrmpf40000gn/T/elasticsearch-12668972032551307591/autodetect_test-2_input_45530 for writing.

Copy the config file and the persist file named in the first message (the --config and --persist arguments), and the input file named in the second message.
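
The copy step can be sketched in Python as follows. This is only an illustration, not code from this PR: `extract_paths` and `copy_job_files` are hypothetical helpers. It assumes the command is logged as a bracketed list with arguments separated by ", " (as in the message above), that no path contains that separator, and that the referenced paths are regular files when storage is enabled.

```python
import re
import shutil
from pathlib import Path

# Flags whose values are the files the description says to copy.
# --config and --persist come from the "Autodetect process command" message;
# --input is the same path that the "Opening file" message reports.
WANTED_FLAGS = ("config", "persist", "input")


def extract_paths(log_line: str) -> dict:
    """Pull the --config/--persist/--input paths out of an autodetect command log line."""
    match = re.search(r"Autodetect process command: \[(.*)\]", log_line)
    if match is None:
        raise ValueError("not an autodetect command log line")
    paths = {}
    # Assumption: arguments are separated by ", " and paths do not contain it.
    for arg in match.group(1).split(", "):
        name, sep, value = arg.partition("=")
        flag = name.lstrip("-")
        if sep and flag in WANTED_FLAGS:
            paths[flag] = value
    return paths


def copy_job_files(paths: dict, dest_dir: str) -> list:
    """Copy each referenced regular file into dest_dir and return the new paths."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in paths.values():
        # Skip anything that is not a regular file (for example a named pipe).
        if not Path(src).is_file():
            continue
        target = dest / Path(src).name
        shutil.copy(src, target)
        copied.append(target)
    return copied
```

Calling `copy_job_files(extract_paths(line), "job-repro")` on the logged command line would gather the files into one directory; the directory name here is arbitrary.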

@valeriy42 valeriy42 requested a review from a team as a code owner July 8, 2024 11:27
@elasticsearchmachine elasticsearchmachine added the needs:triage Requires assignment of a team area label label Jul 8, 2024
@valeriy42 valeriy42 added >non-issue :ml Machine learning and removed needs:triage Requires assignment of a team area label labels Jul 8, 2024
@elasticsearchmachine elasticsearchmachine added the Team:ML Meta label for the ML team label Jul 8, 2024
@elasticsearchmachine

Pinging @elastic/ml-core (Team:ML)

@valeriy42 valeriy42 merged commit 411ddcc into elastic:record-anomaly-detection-messages Jul 8, 2024
2 of 15 checks passed
valeriy42 added a commit to valeriy42/elasticsearch that referenced this pull request Jul 24, 2024
valeriy42 added a commit to valeriy42/elasticsearch that referenced this pull request Jul 24, 2024
valeriy42 added a commit to valeriy42/elasticsearch that referenced this pull request Aug 6, 2024
valeriy42 added a commit to valeriy42/elasticsearch that referenced this pull request Aug 6, 2024
Labels
:ml Machine learning >non-issue Team:ML Meta label for the ML team v8.16.0
2 participants