Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow multiple field names/patterns for (path_)(un)match (#66364) #95558

Merged
merged 11 commits into from
Apr 27, 2023

Conversation

quux00
Copy link
Contributor

@quux00 quux00 commented Apr 25, 2023

Arrays of patterns are now allowed for dynamic_templates in the match, unmatch, path_match and path_unmatch fields. DynamicTemplate has been modified to support List for these fields. The patterns can be either simple wildcards or regex. As with previous functionality, mixing of wildcards and regex will not throw an error, but will not work as expected at mapping time.

One new error pathway was added: if a user specifies a list of non-strings for one of these pattern fields (e.g., "match": [10, false]) a MapperParserException will be thrown.

Closes #66364.

@quux00 quux00 added >enhancement WIP :Search Foundations/Mapping Index mappings, including merging and defining field types v8.8.0 labels Apr 25, 2023
@github-actions
Copy link
Contributor

Documentation preview:

@quux00 quux00 requested a review from romseygeek April 25, 2023 20:04
@elasticsearchmachine elasticsearchmachine added the external-contributor Pull request authored by a developer outside the Elasticsearch team label Apr 25, 2023
@jdconrad jdconrad removed the external-contributor Pull request authored by a developer outside the Elasticsearch team label Apr 25, 2023
Copy link
Contributor

@romseygeek romseygeek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This a great start, thanks @quux00 . I left a few suggestions.

@gmarouli gmarouli added v8.9.0 and removed v8.8.0 labels Apr 26, 2023
@quux00 quux00 force-pushed the mapping/dyntemplates-with-arrays branch from 5708590 to 1445094 Compare April 26, 2023 21:13
Copy link
Contributor

@romseygeek romseygeek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One small comment, LGTM otherwise!

assertEquals("ip", fieldMapper.typeName());

// this one will not match and be an IP field because it was specified with a regex but match_pattern is implicit "simple"
assertNotEquals(InetAddressPoint.class, doc.getField("iptwo").getClass());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, let's do this in a follow-up, no need to complicate this PR.

@quux00 quux00 force-pushed the mapping/dyntemplates-with-arrays branch from 1445094 to e203c28 Compare April 27, 2023 13:12
@quux00 quux00 removed the WIP label Apr 27, 2023
@elasticsearchmachine elasticsearchmachine added the Team:Search Meta label for search team label Apr 27, 2023
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-search (Team:Search)

quux00 added 10 commits April 27, 2023 10:41
Arrays of patterns are now allowed for dynamic_templates in the match,
unmatch, path_match and path_unmatch fields. DynamicTemplate has been modified to
support List<String> for these fields. The patterns can be either simple wildcards
or regex. As with previous functionality, mixing of wildcards and regex will not
throw an error, but will not work as expected at mapping time.

One new error pathway was added: if a user specifies a list of non-strings for
one of these pattern fields (e.g., "match": [10, false]) a MapperParserException
will be thrown.

Closes elastic#66364.
@quux00 quux00 force-pushed the mapping/dyntemplates-with-arrays branch from e203c28 to a0c7c39 Compare April 27, 2023 14:42
matchList.add(s);
} else if (entry.getValue() instanceof List<?> ls) {
for (Object o : ls) {
if (o instanceof String s) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe we were previously calling toString here, which makes me suspect we may have accepted numbers and other things. Probably not a feature, but if that is the case changing it for single values may break existing dynamic templates? Could you verify this?

Copy link
Contributor Author

@quux00 quux00 May 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you are correct - it used to do a toString() on the JSON field (which is why an entry like ["foo", "bar"] was changed to the string "[\"foo"\, \"bar\"]" as noted by the original reporter of the request.

In this PR, I noted in the commit message that I changed that I added a new error case:

One new error pathway was added: if a user specifies a list of non-strings for one of these pattern fields (e.g., "match": [10, false]) a MapperParserException will be thrown.

Which is in this code block you highlighted:

       if (entry.getValue() instanceof String s) {
            matchList.add(s);
        } else if (entry.getValue() instanceof List<?> ls) {
            for (Object o : ls) {
                if (o instanceof String s) {
                    matchList.add(s);
                } else {
                    throw new MapperParsingException(
                        Strings.format("[%s] values must either be a string or list of strings, but was [%s]", propName, entry.getValue())
                    );
                }
            }
        } else {
            throw new MapperParsingException(
                Strings.format("[%s] values must either be a string or list of strings, but was [%s]", propName, entry.getValue())
            );
        }

Good point that this may be a breaking change. You are correct that numbers are accepted in 8.8 and before:

PUT my-index-000001
{
  "mappings": {
    "dynamic_templates": [
      {
        "mytemplate": {
          "match_pattern": "simple",
          "match": 12,
          "mapping": {
            "type": "keyword"
          }
        }
      }
    ]
  }
}

// the mapping gets converted to string
// here is the _mapping result
    "mappings": {
      "dynamic_templates": [
        {
          "mytemplate": {
            "match": "12",
            "mapping": {
              "type": "keyword"
            }
          }
        }
      ]
    }

So to not be a breaking change, the code logic here would need to be:

        if (entry.getValue() instanceof List<?> ls) {
            for (Object o : ls) {
                matchList.add(o.toString());
            }
        } else {
            matchList.add(entry.getValue().toString());
        }

Is that the recommended code change here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we could consider being more strict when parsing from the array, and it would make sense to enforce a single type there, rather than calling toString blindly on all kinds of different types? I guess we are stuck with toString for the single element case? What do you think?

Copy link
Contributor Author

@quux00 quux00 May 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's right. Since parsing ["foo", "bar"] as an array is new functionality, how we treat the elements in the array can be changed since that's new, but with the single element change, we might break things if we change. I'll make a new issue and change it. Good catch.

quux00 added a commit to quux00/elasticsearch that referenced this pull request May 2, 2023
Match and unmatch fields in dynamic_templates (match, unmatch, path_match, path_unmatch)
can either be single-valued or a list of values. If a list of values is provided, the
values must all be strings. Numbers, boolean, other lists, etc. are not allowed and an
error will be returned to the user in that case.

However, previously (before elastic#95558) we allowed any JSON type for the match/unmatch
fields (not just string), so changing that would be a breaking change. Thus, this commit
no longer enforces string-only types for single-valued match/unmatch fields in dynamic_templates.
quux00 added a commit that referenced this pull request May 2, 2023
* Fixes breaking change introduced in #95558.

Match and unmatch fields in dynamic_templates (match, unmatch, path_match, path_unmatch)
can either be single-valued or a list of values. If a list of values is provided, the
values must all be strings. Numbers, boolean, other lists, etc. are not allowed and an
error will be returned to the user in that case.

However, previously (before #95558) we allowed any JSON type for the match/unmatch
fields (not just string), so changing that would be a breaking change. Thus, this commit
no longer enforces string-only types for single-valued match/unmatch fields in dynamic_templates.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>enhancement :Search Foundations/Mapping Index mappings, including merging and defining field types Team:Search Meta label for search team v8.9.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allow multiple field names/patterns for (path_)(un)match
6 participants