Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't parse unmapped array field when dynamic is set to false #85082

Merged
merged 6 commits into from
Mar 28, 2022

Conversation

javanna
Copy link
Member

@javanna javanna commented Mar 17, 2022

When parsing array fields from incoming documents and dynamic for the current context is set to false, we should not go ahead and parse all the children, but rather skip parsing. We had a TODO for a very long time in the code around this, which this commit addresses.

@javanna javanna added :Search Foundations/Mapping Index mappings, including merging and defining field types v8.2.0 >refactoring labels Mar 18, 2022
@javanna javanna marked this pull request as ready for review March 18, 2022 17:04
@elasticmachine elasticmachine added the Team:Search Meta label for search team label Mar 18, 2022
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search (Team:Search)

@javanna
Copy link
Member Author

javanna commented Mar 18, 2022

@romseygeek this is a fun one for you...

@romseygeek
Copy link
Contributor

Hah, that TODO has been bugging me for ages as well. Do we have any tests that explicitly check for this case?

@javanna
Copy link
Member Author

javanna commented Mar 23, 2022

Good question. I can't come up with a reason that explains why we'd need to parse arrays when they are not explicitly mapped and they inherit dynamic false from some parent object. We already skip entire objects in the same situation, hence the same behaviour should be applicable to arrays too. Intuitively, for any subfield of the array that needs to be parsed, the whole object structure would need to be explicitly mapped including the array field, or at least one of its parents would need to override the dynamic behaviour to true, which again would require the array field to be mapped.

I pushed a new commit with some more tests around arrays parsing, but while they increase coverage, this proposed change does not affect them in any way. Yet working on those tests gave me more confidence about making this change.

I ran tests against 7.17 with this same change (see #85275) and they were fine, meaning that even before we started reinterpreting dots in field names while parsing documents (#79922), there would be no good reason to parse arrays for unmapped fields when dynamic is false (or at least we have no tests that cover this case).

I did some archeology digging, and I traced the introduction of parsing arrays when dynamic is false to #7175 , which was a bugfix around geo_point fields and had nothing to do with parsing arrays with non dynamic mappings. There was no specific test introduced for the dynamic false scenario but I believe it maintained the previous behaviour: continue to parse although nothing will be dynamically mapped nor indexed. That line survived through the years and many refactorings, and became part of what we today know as DocumentParser, although that catchy TODO was added to indicate that it should not be needed to parse arrays in that scenario.

The test that I introduced with #85081 would fail on 7.x or 8.0, and get fixed replacing parsing of arrays with skipping them. In 8.1 though such change is no longer necessary as we reinterpret dots in field names which fixed the same issue (#65333), effectively zero-ing any effect that this line has today, whereas the only known effect it had before was a buggy behaviour that made us parse array values with dots in their names within the wrong field.

I am positive it's time to address this TODO, what do you think?

Copy link
Contributor

@romseygeek romseygeek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Go for it.

@javanna javanna merged commit c3b2175 into elastic:master Mar 28, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>refactoring :Search Foundations/Mapping Index mappings, including merging and defining field types Team:Search Meta label for search team >tech debt v8.2.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants