Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add Aggregate Expressions #5961

Merged
merged 6 commits into from
Oct 25, 2022
Merged

Conversation

huaxingao
Copy link
Contributor

add Aggregate Expressions. These will be used for push down Max/Min/Count to Iceberg
Here is the original PR

@huaxingao
Copy link
Contributor Author

cc @rdblue @aokolnychyi @flyrain
This PR is ready for review. Could you please take a look when you have a moment?

case MIN:
return "min(" + term() + ")";
default:
return "Aggregate is not supported: operation = " + op();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Throw UnsupportedOperationException instead. It isn't correct to return a string like this in an expression.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Thanks!


Max max = new Max(namedReference);
Expression expectedMax = Expressions.max(unquoted);
Expression acturalMax = SparkAggregates.convert(max);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo: "actural" -> "actual"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Thanks!

@rdblue
Copy link
Contributor

rdblue commented Oct 23, 2022

Thanks, @huaxingao! This is very close. Just a few minor things to fix.

@huaxingao
Copy link
Contributor Author

@rdblue Thank you very much for your review! I have addressed the comments. Could you please take one more look when you have time? Thanks!


@Test
public void testAggregateBinding() {
for (Expression.Operation op : AGGREGATES) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since AGGREGATES is only used here, I'd probably recommend refactoring so that you have a list of expressions for this loop:

  private static final List<Expression> = ImmutableList.of(Expressions.count("x"), Expressions.max("x"), Expressions.min("x"));
  ...
  for (Expression unbound : AGGREGATES) {
    ...
  }

@rdblue rdblue merged commit 8271791 into apache:master Oct 25, 2022
@rdblue
Copy link
Contributor

rdblue commented Oct 25, 2022

Thanks, @huaxingao!

@huaxingao
Copy link
Contributor Author

Thanks a lot! @rdblue I will refactor AGGREGATES to a list of expressions in my next PR.

Copy link
Contributor

@nastra nastra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need to update ExpressionParser + TestExpressionParser with this new functionality?

try {
unbound.bind(struct, false);
Assert.fail("Binding a missing field should fail");
} catch (ValidationException e) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we use Assertions.assertThatThrownBy here rather than a try-catch with Assert.fail?

}

private static <T, C> BoundAggregate<T, C> assertAndUnwrapAggregate(Expression expr) {
Assert.assertTrue(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it might be worth to also update this to something like Assertions.assertThat(expr).isInstanceOf(BoundAggregate.class)

@huaxingao huaxingao deleted the aggregate_expression branch October 27, 2022 00:18
@huaxingao
Copy link
Contributor Author

@nastra Thanks for your review! I have a follow-up PR to address your comments.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants