Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ESQL: Handle ADD near the end of time #107968

Merged
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
26 changes: 7 additions & 19 deletions docs/reference/esql/functions/examples/bucket.asciidoc
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,7 @@
*Examples*

`BUCKET` can work in two modes: one in which the size of the bucket is computed
based on a buckets count recommendation (four parameters) and a range, and
based on a buckets count recommendation (four parameters) and a range and
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This just needed to be regenerated. It's not part of my change.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, looks like some changes were made in this asciidoc directly, rather than in org/elasticsearch/xpack/esql/expression/function/grouping/Bucket.java.

@bpintea, could you please double check this? This shouldn't block this PR, but maybe we need a tiny follow-up PR to fix this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Alex, raised #108005.

another in which the bucket size is provided directly (two parameters).

Using a target number of buckets, a start of a range, and an end of a range,
Expand Down Expand Up @@ -60,8 +60,8 @@ include::{esql-specs}/bucket.csv-spec[tag=docsBucketWeeklyHistogramWithSpan]
include::{esql-specs}/bucket.csv-spec[tag=docsBucketWeeklyHistogramWithSpan-result]
|===

NOTE: When providing the bucket size as the second parameter, it must be a time
duration or date period.
NOTE: When providing the bucket size as the second parameter, its type must be
of a time duration or date period type.

`BUCKET` can also operate on numeric fields. For example, to create a salary histogram:
[source.merge.styled,esql]
Expand All @@ -76,8 +76,8 @@ include::{esql-specs}/bucket.csv-spec[tag=docsBucketNumeric-result]
Unlike the earlier example that intentionally filters on a date range, you rarely want to filter on a numeric range.
You have to find the `min` and `max` separately. {esql} doesn't yet have an easy way to do that automatically.

The range can be omitted if the desired bucket size is known in advance. Simply
provide it as the second argument:
If the desired bucket size is known in advance, simply provide it as the second
argument, leaving the range out:
[source.merge.styled,esql]
----
include::{esql-specs}/bucket.csv-spec[tag=docsBucketNumericWithSpan]
Expand All @@ -87,8 +87,8 @@ include::{esql-specs}/bucket.csv-spec[tag=docsBucketNumericWithSpan]
include::{esql-specs}/bucket.csv-spec[tag=docsBucketNumericWithSpan-result]
|===

NOTE: When providing the bucket size as the second parameter, it must be
of a floating point type.
NOTE: When providing the bucket size as the second parameter, its type must be
of a floating type.

Create hourly buckets for the last 24 hours, and calculate the number of events per hour:
[source.merge.styled,esql]
Expand All @@ -109,15 +109,3 @@ include::{esql-specs}/bucket.csv-spec[tag=bucket_in_agg]
include::{esql-specs}/bucket.csv-spec[tag=bucket_in_agg-result]
|===

`BUCKET` may be used in both the aggregating and grouping part of the
<<esql-stats-by, STATS ... BY ...>> command provided that in the aggregating
part the function is referenced by an alias defined in the
grouping part, or that it is invoked with the exact same expression:
[source.merge.styled,esql]
----
include::{esql-specs}/bucket.csv-spec[tag=reuseGroupingFunctionWithExpression]
----
[%header.monospaced.styled,format=dsv,separator=|]
|===
include::{esql-specs}/bucket.csv-spec[tag=reuseGroupingFunctionWithExpression-result]
|===
Original file line number Diff line number Diff line change
Expand Up @@ -427,14 +427,38 @@ public static List<TestCaseSupplier> forBinaryNotCasting(
List<TypedDataSupplier> rhsSuppliers,
List<String> warnings,
boolean symmetric
) {
return forBinaryNotCasting(
name,
lhsName,
rhsName,
expected,
expectedType,
lhsSuppliers,
rhsSuppliers,
(lhs, rhs) -> warnings,
symmetric
);
}

public static List<TestCaseSupplier> forBinaryNotCasting(
String name,
String lhsName,
String rhsName,
BinaryOperator<Object> expected,
DataType expectedType,
List<TypedDataSupplier> lhsSuppliers,
List<TypedDataSupplier> rhsSuppliers,
BiFunction<TypedData, TypedData, List<String>> warnings,
boolean symmetric
) {
List<TestCaseSupplier> suppliers = new ArrayList<>();
casesCrossProduct(
expected,
lhsSuppliers,
rhsSuppliers,
(lhsType, rhsType) -> name + "[" + lhsName + "=Attribute[channel=0], " + rhsName + "=Attribute[channel=1]]",
(lhs, rhs) -> warnings,
warnings,
suppliers,
expectedType,
symmetric
Expand Down Expand Up @@ -970,6 +994,12 @@ public static List<TypedDataSupplier> dateCases() {
// 2286-11-20T17:46:40Z - +292278994-08-17T07:12:55.807Z
() -> ESTestCase.randomLongBetween(10 * (long) 10e11, Long.MAX_VALUE),
DataTypes.DATETIME
),
new TypedDataSupplier(
"<near the end of time>",
// very close to +292278994-08-17T07:12:55.807Z, the maximum supported millis since epoch
() -> ESTestCase.randomLongBetween(Long.MAX_VALUE / 100 * 99, Long.MAX_VALUE),
DataTypes.DATETIME
)
);
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -21,9 +21,12 @@
import java.math.BigInteger;
import java.time.Duration;
import java.time.Period;
import java.time.temporal.TemporalAmount;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.function.BiFunction;
import java.util.function.BinaryOperator;
import java.util.function.Supplier;

import static org.elasticsearch.xpack.esql.expression.predicate.operator.arithmetic.AbstractArithmeticTestCase.arithmeticExceptionOverflowCase;
Expand Down Expand Up @@ -116,29 +119,35 @@ public static Iterable<Object[]> parameters() {
)
);

BinaryOperator<Object> result = (lhs, rhs) -> {
try {
return addDatesAndTemporalAmount(lhs, rhs);
} catch (ArithmeticException e) {
return null;
}
};
BiFunction<TestCaseSupplier.TypedData, TestCaseSupplier.TypedData, List<String>> warnings = (lhs, rhs) -> {
try {
addDatesAndTemporalAmount(lhs.data(), rhs.data());
return List.of();
} catch (ArithmeticException e) {
return List.of(
"Line -1:-1: evaluation of [] failed, treating result as null. Only first 20 failures recorded.",
"Line -1:-1: java.lang.ArithmeticException: long overflow"
);
}
};
suppliers.addAll(
TestCaseSupplier.forBinaryNotCasting(
// TODO: There is an evaluator for Datetime + Period, so it should be tested. Similarly below.
"No evaluator, the tests only trigger the folding code since Period is not representable",
"lhs",
"rhs",
(lhs, rhs) -> {
// this weird casting dance makes the expected value lambda symmetric
Long date;
Period period;
if (lhs instanceof Long) {
date = (Long) lhs;
period = (Period) rhs;
} else {
date = (Long) rhs;
period = (Period) lhs;
}
return asMillis(asDateTime(date).plus(period));
},
result,
DataTypes.DATETIME,
TestCaseSupplier.dateCases(),
TestCaseSupplier.datePeriodCases(),
List.of(),
warnings,
true
)
);
Expand All @@ -148,23 +157,11 @@ public static Iterable<Object[]> parameters() {
"No evaluator, the tests only trigger the folding code since Duration is not representable",
"lhs",
"rhs",
(lhs, rhs) -> {
// this weird casting dance makes the expected value lambda symmetric
Long date;
Duration duration;
if (lhs instanceof Long) {
date = (Long) lhs;
duration = (Duration) rhs;
} else {
date = (Long) rhs;
duration = (Duration) lhs;
}
return asMillis(asDateTime(date).plus(duration));
},
result,
DataTypes.DATETIME,
TestCaseSupplier.dateCases(),
TestCaseSupplier.timeDurationCases(),
List.of(),
warnings,
true
)
);
Expand Down Expand Up @@ -269,6 +266,20 @@ private static String addErrorMessageString(boolean includeOrdinal, List<Set<Dat
}
}

private static Object addDatesAndTemporalAmount(Object lhs, Object rhs) {
// this weird casting dance makes the expected value lambda symmetric
Long date;
TemporalAmount period;
if (lhs instanceof Long) {
date = (Long) lhs;
period = (TemporalAmount) rhs;
} else {
date = (Long) rhs;
period = (TemporalAmount) lhs;
}
return asMillis(asDateTime(date).plus(period));
}

@Override
protected Expression build(Source source, List<Expression> args) {
return new Add(source, args.get(0), args.get(1));
Expand Down
Loading