Warn user if regular expression is used when match_pattern defaults to simple in dynamic_template mappings #95634
Comments
Pinging @elastic/es-search (Team:Search)
Under what conditions should we put a warning header about the match_pattern="simple" case? I can think of two cases:
The latter is the goal of this ticket. Should we also warn on case 1?
Additional problem. ES seems to allow most regex "meta-characters" in field names. Based on my testing, these characters are allowed in ES field names and can be queried.
And So the only thing we can test for that is definitely not allowed is a match pattern that starts with Example:
…o simple in dynamic_template mappings
In a dynamic_templates setting, if match_pattern=simple (explicitly or by default) and the pattern passed into match, unmatch, path_match or path_unmatch appears to be a regular expression instead, a warning is added to the HTTP response header.
Closes elastic#95634
…o simple in dynamic_template mappings (#95750)
* Warn user if regular expression is used when match_pattern defaults to simple in dynamic_template mappings
If a user does not explicitly set match_pattern in a dynamic template mapping, it defaults to 'simple' wildcards, but the user might supply a regex rather than a simple glob-style wildcard for one of the "match" fields. In that case, the mapping will be saved without error or warning, but at indexing time the mappings will not work as the user intended and it may not be clear why.
This change now sets a warning in that context: when match_pattern was not defined and the user appears to be specifying a regex, we provide an HTTP header warning for all the regex patterns we detect.
Because ES fields allow most of the standard regex meta-characters (such as |, ^, $ and paired braces and brackets), our warning may be unwarranted, so the user can make the warning go away by setting match_pattern=simple in the dynamic template.
Closes #95634
Description
This was flagged by @romseygeek in #95558 (comment) to be targeted for a new issue.
In dynamic_template mappings, the 'simple' match type (which is the default setting) will happily accept regexes (rather than simple wildcards) without error or warning, but they will then not work as the user likely intended. We should emit a warning if the user provides a pattern that 'looks like' a regex but hasn't explicitly set match_pattern.
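To illustrate why a regex silently misbehaves under 'simple' matching, here is a minimal sketch. The `simpleMatch` helper below is a hypothetical stand-in written for this example (it is not the Elasticsearch implementation); it assumes that `*` is the only wildcard and that every other character is taken literally. The field name `profit_1` and the patterns are made up for illustration.

```java
import java.util.regex.Pattern;

final class SimpleVsRegexDemo {

    // Hypothetical stand-in for 'simple' matching: '*' matches any run of
    // characters, everything else (including ^, $, \d, +) is a literal.
    static boolean simpleMatch(String pattern, String fieldName) {
        String[] parts = pattern.split("\\*", -1); // keep empty leading/trailing parts
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                sb.append(".*");                    // each '*' becomes "match anything"
            }
            sb.append(Pattern.quote(parts[i]));     // literal text is escaped
        }
        return Pattern.matches(sb.toString(), fieldName);
    }

    // What the user probably intended when writing a regex.
    static boolean regexMatch(String pattern, String fieldName) {
        return Pattern.matches(pattern, fieldName);
    }
}
```

Under these assumptions, `simpleMatch("profit_*", "profit_1")` matches, while `simpleMatch("^profit_\\d+$", "profit_1")` does not, even though `regexMatch` with the same pattern does. The mapping is accepted either way, which is why the mismatch only shows up at indexing time.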
Deeper context:
The validate method in DynamicTemplate.parse tries to catch these, but it only catches the invalid pattern when match_pattern=regex. For example, see the test: https://github.com/elastic/elasticsearch/blob/main/server/src/test/java/org/elasticsearch/index/mapper/DynamicTemplateParseTests.java#L132
Current behavior is:
fails with:
but:
(where "match_pattern": "simple" is implied) works with no problems (until you try to actually put documents in with the mapping, of course).
We could probably add a method to Regex.java. There is already this function
but that's not enough for this scenario.
This code indicates that the only allowed patterns for "simple" are "xxx*", "*xxx", "*xxx*" and "xxx*yyy". So we could check for things like .* and ^xxx and yyy$, and chars like [, ] and | and maybe parens, and then run the pattern through REGEX.matches(). If that does NOT fail, the pattern appears to be a valid actual regex rather than a simple pattern, and we should give a warning to users.
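The check proposed above could be sketched roughly as follows. This is only an illustration under stated assumptions: the class and method names are hypothetical, and `java.util.regex.Pattern.compile` stands in for the regex validation step mentioned above, so the actual change in Elasticsearch may well look different.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical helper, not part of Elasticsearch.
final class RegexLookalikeCheck {

    // Returns true if the pattern contains regex-style syntax AND compiles as
    // a regex, i.e. the user may have meant match_pattern=regex.
    static boolean looksLikeRegex(String pattern) {
        if (pattern == null || pattern.isEmpty()) {
            return false;
        }
        // Step 1: cheap syntactic hints listed in the issue text.
        boolean suspicious = pattern.contains(".*")
            || pattern.startsWith("^")
            || pattern.endsWith("$")
            || pattern.chars().anyMatch(c -> "[]|()".indexOf(c) >= 0);
        if (!suspicious) {
            return false; // plain globs such as "xxx*" or "*xxx*" end up here
        }
        // Step 2: only warn if the pattern is also a valid regex.
        try {
            Pattern.compile(pattern);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }
}
```

With this sketch, `looksLikeRegex("^foo.*$")` and `looksLikeRegex("a|b")` would trigger a warning, while `looksLikeRegex("foo*")` and `looksLikeRegex("foo[")` would not. Because field names may legitimately contain some of these characters, such a check can only justify a warning, not a rejection.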