Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SQL: Implement DATE_PART function #47206

Merged
merged 8 commits into from
Oct 1, 2019
Merged

SQL: Implement DATE_PART function #47206

merged 8 commits into from
Oct 1, 2019

Conversation

matriv
Copy link
Contributor

@matriv matriv commented Sep 27, 2019

DATE_PART(, <date/datetime>) is a function that allows
the user to extract the specified unit from a date/datetime field
similar to the EXTRACT ( FROM <date/datetime>) but
with different names and aliases for the units and it also provides more
options like DATE_PART('tzoffset', datetimeField).

Implemented following the SQL server's spec: https://docs.microsoft.com/en-us/sql/t-sql/functions/datepart-transact-sql?view=sql-server-2017
with the difference that the argument is either a
literal single quoted string or gets a value from a table field, whereas
in SQL server keywords are used (unquoted identifiers) and it's not
possible to use a value coming for a table column.

Closes: #46372

DATE_PART(<datetime unit>, <date/datetime>) is a function that allows
the user to extract the specified unit from a date/datetime field
similar to the EXTRACT (<datetime unit> FROM <date/datetime>) but
with different names and aliases for the units and it also provides more
options like `DATE_PART('tzoffset', datetimeField)`.

Implemented following the SQL server's spec: https://docs.microsoft.com/en-us/sql/t-sql/functions/datepart-transact-sql?view=sql-server-2017
with the difference that the <datetime unit> argument is either a
literal single quoted string or gets a value from a table field, whereas
in SQL server keywords are used (unquoted identifiers) and it's not
possible to use a value coming for a table column.

Closes: elastic#46372
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search

@matriv
Copy link
Contributor Author

matriv commented Sep 27, 2019

This implementation is missing the 3rd argument where the Sunday or Monday can be specified.
It will be added with a followup PR.

@matriv
Copy link
Contributor Author

matriv commented Sep 27, 2019

@elasticmachine run elasticsearch-ci/2
@elasticmachine run elasticsearch-ci/docs-check

Copy link
Member

@costin costin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

return zoneId.equals(that.zoneId);
BinaryDateTimePipe that = (BinaryDateTimePipe) o;
return Objects.equals(zoneId, that.zoneId) &&
operation == that.operation;
}

@Override
public int hashCode() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please move hashCode before equals.

HOUR(DateTimeExtractor.HOUR_OF_DAY::extract, "hours", "hh"),
MINUTE(DateTimeExtractor.MINUTE_OF_HOUR::extract, "minutes", "mi", "n"),
SECOND(DateTimeExtractor.SECOND_OF_MINUTE::extract, "seconds", "ss", "s"),
MILLISECOND(dt -> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dt -> dt.get(...)

}
}

if (!(source2 instanceof ZonedDateTime)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

== false

@@ -45,4 +45,8 @@ public String visitQualifiedName(QualifiedNameContext ctx) {
private static String unquoteIdentifier(String identifier) {
return identifier.replace("\"\"", "\"");
}

static String quoteIdentifier(String identifier) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is this used?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nowhere, must be removed, thx!

@@ -110,9 +103,11 @@
NANOSECOND(dt -> dt, "nanoseconds", "ns");

private static final Map<String, Part> NAME_TO_PART;
private static final List<String> VALID_VALUES;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potentially the findSimilar could be extracted into an interface with default methods (no need for static maps considering there's only one enum instance per ClassLoader).

Copy link
Contributor

@astefan astefan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work. Left some comments.

}

if (left().foldable()) {
String truncateToValue = (String) left().fold();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this variable called like this? truncateToValue

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because of some refactoring, and extraction of an abstract class, fixing.

public Object fold() {
if (operation == TRUNC) {
return DateTruncProcessor.process(left().fold(), right().fold(), zoneId);
} else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would add a comment here or for the entire class pointing out that, for the moment, this class has only two subclasses - datepart and datetrunc - thus, the else branch points directly to DatePartProcessor.

@Override
protected ScriptTemplate asScriptFrom(ScriptTemplate leftScript, ScriptTemplate rightScript) {
return new ScriptTemplate(
formatTemplate("{sql}." + (operation == TRUNC ? "dateTrunc" : "datePart") +
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe refactor operation == TRUNC or, preferable, the entire TRUNC/PART logic in the subclasses themselves so that the abstract class only uses abstract methods that have definitions in the subclasses.

For example, in the asScriptFrom method don't use the operation == TRUNC ? "dateTrunc" : "datePart" logic, but delegate the decision to use either dateTrunc or datePart string to an abstract method belonging to both DatePart and DateTrunc classes where they return datePart and dateTrunc strings.

And I would say the same should be valid for fold() method where the same comparison is being made.

protected abstract Object doProcess(Object left, Object right);

public static BinaryDateTimeProcessor asProcessor(BinaryDateOperation operation, Processor left, Processor right, ZoneId zoneId) {
if (operation == TRUNC) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment here. Why not delegating the choice of which processor to return to the subclasses themselves, as they are the decision makers. What if this abstract BinaryDateTimeProcessor will have another subclass in the future? It's weird to come back to the base class and change its implementation based on new subclasses that were added. Rather the subclasses should provide its specific implementation of a certain aspect of code and the base class should remain unchanged.

VALID_VALUES = DateTimeField.initializeValidValues(values());
}

private Set<String> aliases;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would change the definition order to be extractFunction, aliases to match the constructor parameters order and initialization.

if (source1 == null || source2 == null) {
return null;
}
if (!(source1 instanceof String)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer source1 instanceof String == false.

throw new SqlIllegalArgumentException("A value of {} or their aliases is required; received [{}]",
Part.values(), source1);
} else {
throw new SqlIllegalArgumentException("Received value [{}] is not valid date part for extraction; " + "" +
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why an emtpy string concatenation here? + "" +

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left over from splitting to new line.

}

if (source2 instanceof ZonedDateTime == false) {
throw new SqlIllegalArgumentException("A datetime/date is required; received [{}]", source2);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The documentation mentions date/datetime while the error message is about datetime/date. Small, picky comment, but I would try to be consistent.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries for "picky" comment, we should be consistent, thx for the catch.

public interface DateTimeField {

static <D extends DateTimeField> Map<String, D> initializeResolutionMap(D[] values) {
Map<String, D> nameToPart;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not a single line Map<String, D> nameToPart = new HashMap<>(); instead of two lines?

newLoc,
b2.expression(),
b2.left(),
b2.right(),
b2.zoneId());
b2.zoneId(),
b1.operation());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't here be b2.operation()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch! thx!

@matriv
Copy link
Contributor Author

matriv commented Oct 1, 2019

@astefan Thank you for the thorough review. Please check again.

@matriv matriv requested a review from astefan October 1, 2019 11:36
Copy link
Contributor

@astefan astefan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@matriv matriv merged commit ead743d into elastic:master Oct 1, 2019
@matriv matriv deleted the impl-46372 branch October 1, 2019 12:59
matriv added a commit that referenced this pull request Oct 1, 2019
DATE_PART(<datetime unit>, <date/datetime>) is a function that allows
the user to extract the specified unit from a date/datetime field
similar to the EXTRACT (<datetime unit> FROM <date/datetime>) but
with different names and aliases for the units and it also provides more
options like `DATE_PART('tzoffset', datetimeField)`.

Implemented following the SQL server's spec: https://docs.microsoft.com/en-us/sql/t-sql/functions/datepart-transact-sql?view=sql-server-2017
with the difference that the <datetime unit> argument is either a
literal single quoted string or gets a value from a table field, whereas
in SQL server keywords are used (unquoted identifiers) and it's not
possible to use a value coming for a table column.

Closes: #46372
(cherry picked from commit ead743d)
matriv added a commit to matriv/elasticsearch that referenced this pull request Oct 4, 2019
Add a 3rd argument to the DATE_PART function which is optional
and can take one of the `['Monday', 'Sunday']` values. If missing then
`Sunday` is used by default. This value is used for the `week` and
`weekday` extraction to allow for ISO-style calculation of the two
fields if the value `Monday` is used.

Follows: elastic#47206
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

SQL: add DATEPART function
5 participants