Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ML] Exclude analysis fields with core field names from anomaly results #41093

Conversation

yana2301
Copy link
Contributor

@yana2301 yana2301 commented Apr 10, 2019

Added "_index", "_type", "_id" to list of reserved fields.

Closes #39406

@droberts195
Copy link
Contributor

Jenkins test this please

@droberts195 droberts195 added :ml Machine learning v7.2.0 v8.0.0 labels Apr 11, 2019
@elasticmachine
Copy link
Collaborator

Pinging @elastic/ml-core

@droberts195 droberts195 changed the title Can't create anomaly detector that uses "_index" as "partition_field_… [ML] Exclude analysis fields with core field names from anomaly results Apr 11, 2019
@droberts195
Copy link
Contributor

@yana2301 the elasticsearch-ci/2 CI check is showing that another change is required in the unit tests: https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+pull-request-2/11822/testReport/junit/org.elasticsearch.xpack.core.ml.job.persistence/ElasticsearchMappingsTests/testResultsMapppingReservedFields/

Try running this locally and I think you'll get the same failure:

./gradlew :x-pack:plugin:core:test --tests org.elasticsearch.xpack.core.ml.job.persistence.ElasticsearchMappingsTests

Then you can fix it and confirm the fix locally.

@yana2301
Copy link
Contributor Author

@droberts195 fixed the test

@droberts195
Copy link
Contributor

Jenkins test this please

@droberts195
Copy link
Contributor

@yana2301 now different tests are failing because it's not correct to add the core ES fields like _id and _index into index mappings. You can see which tests by clicking the "Details" links next to the various Jenkins jobs in the "Some checks were not successful" section.

There is a section in ElasticsearchMappingsTests.java that looks like this:

    // These are not reserved because they're Elasticsearch keywords, not
    // field names
    private static List<String> KEYWORDS = Arrays.asList(

I think the correct fix is to revert your second commit on this PR and then add the 3 extra internal field names into that KEYWORDS list. See if that fixes the test that was failing originally without breaking any other tests.

@yana2301 yana2301 force-pushed the 39406-added-index-type-to-reserved-fieldnames branch from 100ec6a to e1a610d Compare April 16, 2019 08:50
@yana2301
Copy link
Contributor Author

@droberts195 the test checks that
ReservedFieldNames.RESERVED_CONFIG_FIELD_NAMES(where I added 3 new fields) =
collectConfigDocFieldNames() - KEYWORDS set.
Now it fails because collectConfigDocFieldNames() does not contain 3 added fields.
Adding new fields to KEYWORDS won't help - since KEYWORDS are excluded from collectConfigDocFieldNames() before comparison(line 97 of ElasticsearchMappingTests).
I tried to fix the test by adding new fields inside collectConfigDocFieldNames() method - as a result multiple tests failed.
Could you tell me please, maybe it makes sense to add these 3 fields just inside the testConfigMapppingReservedFields()?

@droberts195
Copy link
Contributor

Could you tell me please, maybe it makes sense to add these 3 fields just inside the testConfigMapppingReservedFields()?

I think it makes sense to add the 3 extra fields to ReservedFieldNames.RESERVED_CONFIG_FIELD_NAME_ARRAY, because they shouldn't be added as dynamic fields in the config index as well as the results index. If you do that then the test should be OK with the 3 extra fields in KEYWORDS.

@yana2301
Copy link
Contributor Author

@droberts195 pushed the change you suggested, however now testConfigMapppingReservedFields test fails - because collectConfigDocFieldNames() doesn't return _id, _index, _type. Also adding fields to KEYWORDS doesn't help - because fields in KEYWORDS are removed from the result of collectConfigDocFieldNames() before comparison

@droberts195
Copy link
Contributor

@yana2301 please try pushing a third commit to add the 3 extra fields to RESERVED_CONFIG_FIELD_NAME_ARRAY in ReservedFieldNames.java.

(In the first commit you added them to RESERVED_RESULT_FIELD_NAME_ARRAY, but I think the test is showing that they should be in RESERVED_CONFIG_FIELD_NAME_ARRAY too.)

@yana2301
Copy link
Contributor Author

@droberts195 added fields to RESERVED_CONFIG_FIELD_NAME_ARRAY

@droberts195
Copy link
Contributor

Jenkins test this please

@droberts195 droberts195 force-pushed the 39406-added-index-type-to-reserved-fieldnames branch from 7e3a309 to cb0c7e6 Compare April 16, 2019 15:14
@droberts195
Copy link
Contributor

Jenkins test this please

@droberts195
Copy link
Contributor

@yana2301 sorry, my suggestion of putting the extra fields in KEYWORDS was wrong. It needs to be a different list that is added to the expected results, not one that is taken away. I pushed another commit that does this and the tests are running for longer without failing than they did before. Let's leave them to run and see if anything fails in the more complex integration tests later on.

@droberts195
Copy link
Contributor

Jenkins run elasticsearch-ci/default-distro

@droberts195
Copy link
Contributor

Jenkins run elasticsearch-ci/bwc

@droberts195
Copy link
Contributor

Jenkins run elasticsearch-ci/1

@droberts195
Copy link
Contributor

Jenkins test this please

Copy link
Contributor

@droberts195 droberts195 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

The tests have passed after merging in the latest changes on the master branch.

Thanks for the contribution @yana2301 - I'll merge this now.

@droberts195 droberts195 merged commit 05c1476 into elastic:master Apr 17, 2019
droberts195 pushed a commit that referenced this pull request Apr 17, 2019
…ts (#41093)

Added "_index", "_type", "_id" to list of reserved fields.

Closes #39406
gurkankaymak pushed a commit to gurkankaymak/elasticsearch that referenced this pull request May 27, 2019
…ts (elastic#41093)

Added "_index", "_type", "_id" to list of reserved fields.

Closes elastic#39406
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Can't create anomaly detector that uses "_index" as "partition_field_name"
4 participants