Skip to content

Commit

Permalink
ESQL: Fix for overzealous validation in case of invalid mapped fields (
Browse files Browse the repository at this point in the history
…#111475)

Fix validation of fields mapped to different types in different indices and align with validation of fields of unsupported type.

* Allow using multi-typed fields in KEEP and DROP, just like unsupported fields.
* Explicitly invalidate using both these field kinds in RENAME.
* Map both kinds of fields to UnsupportedAttribute to enforce consistency.
* Consider convert functions containing valid multi-typed fields as resolved to avoid weird workarounds when resolving STATS.
* Add a bunch of tests.
  • Loading branch information
alex-spies authored Aug 9, 2024
1 parent e235880 commit 585480f
Show file tree
Hide file tree
Showing 19 changed files with 738 additions and 118 deletions.
6 changes: 6 additions & 0 deletions docs/changelog/111475.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
pr: 111475
summary: "ESQL: Fix for overzealous validation in case of invalid mapped fields"
area: ES|QL
type: bug
issues:
- 111452
8 changes: 3 additions & 5 deletions docs/reference/esql/esql-multi-index.asciidoc
Original file line number Diff line number Diff line change
Expand Up @@ -97,8 +97,7 @@ In addition, if the query refers to this unsupported field directly, the query f
[source.merge.styled,esql]
----
FROM events_*
| KEEP @timestamp, client_ip, event_duration, message
| SORT @timestamp DESC
| SORT client_ip DESC
----

[source,bash]
Expand All @@ -118,9 +117,8 @@ experimental::[]
{esql} has a way to handle <<esql-multi-index-invalid-mapping, field type mismatches>>. When the same field is mapped to multiple types in multiple indices,
the type of the field is understood to be a _union_ of the various types in the index mappings.
As seen in the preceding examples, this _union type_ cannot be used in the results,
and cannot be referred to by the query
-- except when it's passed to a type conversion function that accepts all the types in the _union_ and converts the field
to a single type. {esql} offers a suite of <<esql-type-conversion-functions,type conversion functions>> to achieve this.
and cannot be referred to by the query -- except in `KEEP`, `DROP` or when it's passed to a type conversion function that accepts all the types in
the _union_ and converts the field to a single type. {esql} offers a suite of <<esql-type-conversion-functions,type conversion functions>> to achieve this.

In the above examples, the query can use a command like `EVAL client_ip = TO_IP(client_ip)` to resolve
the union of `ip` and `keyword` to just `ip`.
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -109,6 +109,11 @@ public Nullability nullable() {
return child.nullable();
}

@Override
protected TypeResolution resolveType() {
return child.resolveType();
}

@Override
public DataType dataType() {
return child.dataType();
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -55,6 +55,10 @@ public boolean synthetic() {
return synthetic;
}

/**
* Try to return either {@code this} if it is an {@link Attribute}, or a {@link ReferenceAttribute} to it otherwise.
* Return an {@link UnresolvedAttribute} if this is unresolved.
*/
public abstract Attribute toAttribute();

@Override
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,7 @@
import org.elasticsearch.xpack.esql.core.expression.Expression.TypeResolution;
import org.elasticsearch.xpack.esql.core.type.DataType;
import org.elasticsearch.xpack.esql.core.type.EsField;
import org.elasticsearch.xpack.esql.core.type.InvalidMappedField;

import java.util.Locale;
import java.util.StringJoiner;
Expand Down Expand Up @@ -176,19 +177,60 @@ public static TypeResolution isType(
ParamOrdinal paramOrd,
String... acceptedTypes
) {
return predicate.test(e.dataType()) || e.dataType() == NULL
? TypeResolution.TYPE_RESOLVED
: new TypeResolution(
format(
null,
"{}argument of [{}] must be [{}], found value [{}] type [{}]",
paramOrd == null || paramOrd == DEFAULT ? "" : paramOrd.name().toLowerCase(Locale.ROOT) + " ",
operationName,
acceptedTypesForErrorMsg(acceptedTypes),
name(e),
e.dataType().typeName()
)
);
return isType(e, predicate, operationName, paramOrd, false, acceptedTypes);
}

public static TypeResolution isTypeOrUnionType(
Expression e,
Predicate<DataType> predicate,
String operationName,
ParamOrdinal paramOrd,
String... acceptedTypes
) {
return isType(e, predicate, operationName, paramOrd, true, acceptedTypes);
}

public static TypeResolution isType(
Expression e,
Predicate<DataType> predicate,
String operationName,
ParamOrdinal paramOrd,
boolean allowUnionTypes,
String... acceptedTypes
) {
if (predicate.test(e.dataType()) || e.dataType() == NULL) {
return TypeResolution.TYPE_RESOLVED;
}

// TODO: Shouldn't we perform widening of small numerical types here?
if (allowUnionTypes
&& e instanceof FieldAttribute fa
&& fa.field() instanceof InvalidMappedField imf
&& imf.types().stream().allMatch(predicate)) {
return TypeResolution.TYPE_RESOLVED;
}

return new TypeResolution(
errorStringIncompatibleTypes(operationName, paramOrd, name(e), e.dataType(), acceptedTypesForErrorMsg(acceptedTypes))
);
}

private static String errorStringIncompatibleTypes(
String operationName,
ParamOrdinal paramOrd,
String argumentName,
DataType foundType,
String... acceptedTypes
) {
return format(
null,
"{}argument of [{}] must be [{}], found value [{}] type [{}]",
paramOrd == null || paramOrd == DEFAULT ? "" : paramOrd.name().toLowerCase(Locale.ROOT) + " ",
operationName,
acceptedTypesForErrorMsg(acceptedTypes),
argumentName,
foundType.typeName()
);
}

private static String acceptedTypesForErrorMsg(String... acceptedTypes) {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -77,6 +77,11 @@ public UnresolvedAttribute withUnresolvedMessage(String unresolvedMessage) {
return new UnresolvedAttribute(source(), name(), id(), unresolvedMessage, resolutionMetadata());
}

@Override
protected TypeResolution resolveType() {
return new TypeResolution("unresolved attribute [" + name() + "]");
}

@Override
public DataType dataType() {
throw new UnresolvedException("dataType", this);
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,7 @@
import java.util.Objects;
import java.util.Set;
import java.util.TreeMap;
import java.util.stream.Collectors;

/**
* Representation of field mapped differently across indices.
Expand Down Expand Up @@ -64,6 +65,10 @@ private InvalidMappedField(StreamInput in) throws IOException {
this(in.readString(), in.readString(), in.readImmutableMap(StreamInput::readString, i -> i.readNamedWriteable(EsField.class)));
}

public Set<DataType> types() {
return typesToIndices.keySet().stream().map(DataType::fromTypeName).collect(Collectors.toSet());
}

@Override
public void writeTo(StreamOutput out) throws IOException {
out.writeString(getName());
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -2018,6 +2018,19 @@ M
null
;

shadowingInternalWithGroup2#[skip:-8.14.1,reason:implemented in 8.14]
FROM employees
| STATS x = MAX(emp_no), y = count(x) BY x = emp_no, x = gender
| SORT x ASC
;

y:long | x:keyword
33 | F
57 | M
0 | null
;


shadowingTheGroup
FROM employees
| STATS gender = MAX(emp_no), gender = MIN(emp_no) BY gender
Expand Down
Loading

0 comments on commit 585480f

Please sign in to comment.