Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add MultiValuesSource support to Aggregation Framework #18285

Closed

Conversation

nknize
Copy link
Contributor

@nknize nknize commented May 11, 2016

This PR refactors core ValuesSourceAggregatorBuilder, Factory, and Parser as base classes to new {Single | Multi} ValuesSourceAggregatorBuilder, Factory, and Parser classes. This provides support for developing new aggregations that operate on multiple fields as is needed by the MultiFieldStats aggregation discussed in #16817.

Existing single field aggregations are refactored to extend SingleValuesSourceAggregator and a follow on PR will add a new matrix aggregation module which includes the MultiFieldStats aggregation reviewed in PR #16826.

@nknize
Copy link
Contributor Author

nknize commented May 11, 2016

@colings86, would you mind taking a look at this if you get a chance? It takes the initial work you did with MultiValuesSource but removes the code duplication.

@nknize nknize force-pushed the feature/MultiValuesSource_Agg_Support branch from 9fe180a to 0079dcc Compare May 11, 2016 21:31
return false;
}
}
} else if ("field".equals(currentFieldName)) {
Copy link
Contributor

@colings86 colings86 May 12, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use ParseField here? In fact, could we take this opportunity to update the SingleValuesSourceParser and ValuesSourceParser to use ParseField too, as well as their builder toXContent methods?

@colings86
Copy link
Contributor

@nknize I left some comments but I like what you have so far

nknize added 2 commits May 12, 2016 14:14
Refactor ValuesSourceAggregatorBuilder, Factory, and Parser as base classes to new {Single | Multi} ValuesSourceAggregatorBuilder, Factory, and Parser classes. Refactor existing single value aggregations to extend SingleValuesSourceAggregator support classes.
…ommon code from read and createBuilder methods to base class.
@nknize nknize force-pushed the feature/MultiValuesSource_Agg_Support branch from 0079dcc to 49760ab Compare May 12, 2016 22:30
@nknize
Copy link
Contributor Author

nknize commented May 12, 2016

Thanks @colings86. First round review changes have been made.

It looks like there are a lot of duplicate ParseField static instances like: ParseField("field) I went ahead and put some of the common ones as static variables in the ParseField class, but haven't removed the duplicates throughout the code. I figured this could be done in a separate PR if this is the preferred approach?

public static final ParseField FORMAT_FIELD = new ParseField("format");
public static final ParseField MISSING_FIELD = new ParseField("missing");
public static final ParseField TIME_ZONE_FIELD = new ParseField("time_zone");
public static final ParseField VALUE_TYPE_FIELD = new ParseField("value_type", "valueType");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty sure camelCase shouldn't be supported any more.

Copy link
Contributor

@colings86 colings86 May 13, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 but I would check with @clintongormley whether we are happy making this breaking change in 5.0. However, we haven't actually formally removed camelCase in ParseField IIRC so adding this extra name here might be unnecessary anyway

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was removed from ParseField in #17933. It also shouldn't be added here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok great, thanks for the correction @rjernst

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I set camelCase valueType here because it wasn't removed from the original parser: https://github.com/elastic/elasticsearch/blob/master/core/src/main/java/org/elasticsearch/search/aggregations/support/AbstractValuesSourceParser.java#L121

I thought it best to set as deprecated here but based on the feedback I presume its OK to simply remove it?

@colings86
Copy link
Contributor

@nknize thanks for making the changes. This looks pretty good now. I left a comment about where the ParseField constants are put because I don't think that they are in the right place.

Also, I know we are going to have an implementation of a MultiValuesSourceAgg in a module but since the infrastructure is in core I think we should at least do basic tests on the infrastructure to make sure it works in core rather than relying on the module. Maybe we should have a Unit test (that extends BaseAggregationTestCase) and an integration test that register a mock or very simple MultiValueSource aggregation and then tests to make sure all the bits of the multi source agg infrastructure works? So in the unit test it checks that the MultiValuesSourceAggregatorBuilder parses and renders to the same thing, hashCode and equals works and the wire format works correctly, and for the integration test it makes sure the options set in the builder are propogated properly through to the aggregator and it returns something sensible? Alternatively to the integration test maybe we just need unit tests for the Parser and Factory?

@nknize
Copy link
Contributor Author

nknize commented May 13, 2016

After talking with @colings86 and @jpountz we've decided to close this PR in favor of the following:

  • Leave ValuesSource classes in core alone (regardless of code duplication)
  • Put MultiValuesSource support classes in the aggs-matrix-stats module (PR New Matrix Stats Aggregation module #18300) until another module or plugin requires them
  • Remove script support from MultiValuesSource aggregations (until the design can be thought through)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants