Refactored geogrid to support multiple hash types #30320

Closed · wants to merge 18 commits
@@ -63,3 +63,19 @@
field3: value
- match: { hits.total: 1 }
- match: { hits.hits.0._id: q3 }


---
"Verify geo aggregations work during upgrade":
- do:
search:
index: geo_agg_index
body:
aggregations:
mygrid:
geohash_grid:
field : location
precision : 1
- match: { hits.total: 6 }
- match: { aggregations.mygrid.buckets.0.key: u }
- match: { aggregations.mygrid.buckets.0.doc_count: 6 }
@@ -198,3 +198,46 @@
tasks.get:
wait_for_completion: true
task_id: $task

---
"Create geo data records in the old cluster":
- do:
indices.create:
index: geo_agg_index
body:
settings:
index:
number_of_replicas: 0
mappings:
doc:
properties:
location:
type: geo_point
- do:
bulk:
refresh: true
body:
- '{"index": {"_index": "geo_agg_index", "_type": "doc"}}'
- '{"location": "52.374081,4.912350", "name": "NEMO Science Museum"}'
- '{"index": {"_index": "geo_agg_index", "_type": "doc"}}'
- '{"location": "52.369219,4.901618", "name": "Museum Het Rembrandthuis"}'
- '{"index": {"_index": "geo_agg_index", "_type": "doc"}}'
- '{"location": "52.371667,4.914722", "name": "Nederlands Scheepvaartmuseum"}'
- '{"index": {"_index": "geo_agg_index", "_type": "doc"}}'
- '{"location": "51.222900,4.405200", "name": "Letterenhuis"}'
- '{"index": {"_index": "geo_agg_index", "_type": "doc"}}'
- '{"location": "48.861111,2.336389", "name": "Musée du Louvre"}'
- '{"index": {"_index": "geo_agg_index", "_type": "doc"}}'
- '{"location": "48.860000,2.327000", "name": "Musée Orsay"}'
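The bulk body above is newline-delimited JSON: each document line is preceded by an action line naming the target index and type. A small sketch of how such a payload is assembled (the helper and its hard-coded index/type are illustrative only, mirroring the test fixture):

```java
import java.util.List;

// Sketch of the NDJSON bulk layout used in the test fixture above:
// one action line per document, every line terminated by '\n'.
public class BulkBodySketch {
    public static String bulkBody(List<String> docs) {
        StringBuilder sb = new StringBuilder();
        for (String doc : docs) {
            sb.append("{\"index\": {\"_index\": \"geo_agg_index\", \"_type\": \"doc\"}}\n");
            sb.append(doc).append('\n'); // the final line must also end with a newline
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String body = bulkBody(List.of(
            "{\"location\": \"52.374081,4.912350\", \"name\": \"NEMO Science Museum\"}",
            "{\"location\": \"48.861111,2.336389\", \"name\": \"Musée du Louvre\"}"));
        System.out.println(body.lines().count()); // two documents -> four NDJSON lines
    }
}
```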
Contributor:
Can we check here that a search returns the expected response so if it fails on the mixed cluster test we know the problem is to do with the upgrade and not because something went wrong on the old cluster before the upgrade?

Contributor Author:
done

- do:
search:
index: geo_agg_index
body:
aggregations:
mygrid:
geohash_grid:
field : location
precision : 1
- match: { hits.total: 6 }
- match: { aggregations.mygrid.buckets.0.key: u }
- match: { aggregations.mygrid.buckets.0.doc_count: 6 }
@@ -117,3 +117,19 @@
wait_for_completion: true
task_id: $task_id
- match: { task.headers.X-Opaque-Id: "Reindexing Again" }

---
"Verify geo aggregations work after upgrade with new types":
- do:
search:
index: geo_agg_index
body:
aggregations:
mygrid:
geohash_grid:
hash_type: geohash
field : location
precision : 1
- match: { hits.total: 6 }
- match: { aggregations.mygrid.buckets.0.key: u }
- match: { aggregations.mygrid.buckets.0.doc_count: 6 }
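The post-upgrade test passes the new `hash_type: geohash` parameter explicitly, which should behave identically to omitting it, since geohash is the default. A hedged sketch of what a hash-type enum along the lines of the PR's `GeoHashType` might look like (the real class lives in `org.elasticsearch.search.aggregations.bucket.geogrid`; the `forString` parser and field names here are assumptions):

```java
import java.util.Locale;

// Sketch of a hash-type enum like the GeoHashType this PR introduces.
// Only the geohash variant is shown; DEFAULT and the string parsing are
// assumptions based on how the aggregation accepts "hash_type: geohash".
public class GeoHashTypeSketch {
    public enum GeoHashType {
        GEOHASH;

        public static final GeoHashType DEFAULT = GEOHASH;

        public static GeoHashType forString(String name) {
            return valueOf(name.toUpperCase(Locale.ROOT));
        }
    }

    public static void main(String[] args) {
        // an explicit "geohash" resolves to the same constant as the default
        System.out.println(GeoHashType.forString("geohash") == GeoHashType.DEFAULT);
    }
}
```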
36 changes: 19 additions & 17 deletions server/src/main/java/org/elasticsearch/common/geo/GeoUtils.java
@@ -32,7 +32,6 @@
import org.elasticsearch.common.xcontent.XContentParser;
import org.elasticsearch.common.xcontent.XContentParser.Token;
import org.elasticsearch.common.xcontent.json.JsonXContent;
import org.elasticsearch.common.xcontent.support.XContentMapValues;
import org.elasticsearch.index.fielddata.FieldData;
import org.elasticsearch.index.fielddata.GeoPointValues;
import org.elasticsearch.index.fielddata.MultiGeoPointValues;
@@ -548,23 +547,26 @@ private static GeoPoint parseGeoHash(GeoPoint point, String geohash, EffectivePoint
* @return int representing precision
*/
    public static int parsePrecision(XContentParser parser) throws IOException, ElasticsearchParseException {
        return parser.currentToken() == Token.VALUE_NUMBER ? parser.intValue() : parsePrecisionString(parser.text());
    }

    /**
     * Attempt to parse geohash precision string into an integer value
     */
    public static int parsePrecisionString(String precision) {
        try {
            // we want to treat simple integer strings as precision levels, not distances
            return checkPrecisionRange(Integer.parseInt(precision));
Contributor:
This is a breaking change here. We can only do this in 7.0.0 but I suspect you are going to want to backport this PR to the 6.x branch. I think we should change this here to warn if the precision is out of range using the deprecation logger and then we can have a follow up PR for 7.0 only that makes this strictly within the range.

Contributor Author:
are you sure there are changes from the previous algorithm? I simply broke up the parsePrecision(XContentParser) into two functions, and inlined it to remove impossible code paths. In the previous version, checkPrecisionRange() was called at https://github.com/elastic/elasticsearch/pull/30320/files?utf8=%E2%9C%93&diff=split&w=1#diff-e98438cd3baeeca821694343df88218dL69 -- PARSER.declareField((parser, builder, context) -> builder.precision(parsePrecision(parser)), ... -- which means that if you supply "99999" value, it would fail in the builder.precision() call.

Contributor:
It's a breaking change because if the string in the JSON was a simple integer value, before we would parse it to an int using parser.intValue() and not perform a range check. Now we do perform a range check, so JSON that used to be parsed without error will now throw an exception, hence it's a breaking change. I do agree that throwing an error is better, but we need to be careful because this can make upgrades tricky for the user if requests that worked fine in their app before suddenly start failing.

The question is, what used to happen if you gave a precision value outside of the range? Did we fail later in execution (in which case the breaking change here might be ok)? Did we accept the precision value and process it at the precision specified, even though it's outside the range? Or maybe we accepted the precision value and, if it was outside the range, used it as if it were at one of the bounds (i.e. if the value is above MAX_PRECISION we just evaluated it as if it were MAX_PRECISION)?

Contributor Author:
@colings86 I think you misunderstood my comment - the existing code already throws an error if precision is out of range --

Contributor:
Ok, in which case I agree that this is not a breaking change

            // checkPrecisionRange could also throw IllegalArgumentException, but let it through
            // to keep errors somewhat consistent with how they were shown before this change
        } catch (NumberFormatException e) {
            // try to parse as a distance value
            final int parsedPrecision = GeoUtils.geoHashLevelsForPrecision(precision);
            try {
                return checkPrecisionRange(parsedPrecision);
            } catch (IllegalArgumentException e2) {
                // this happens when distance too small, so precision > 12. We'd like to see the original string
                throw new IllegalArgumentException("precision too high [" + precision + "]", e2);
            }
        }
    }
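The control flow above tries the string as a plain integer level first (range-checked to 1..12), and only on NumberFormatException falls back to interpreting it as a distance, converting that distance to the finest geohash level whose cells are no larger. A self-contained sketch of that two-path logic (the cell-size table and the unit parsing are my own simplified assumptions; the real conversion is GeoUtils.geoHashLevelsForPrecision):

```java
// Simplified stand-in for the parsePrecisionString control flow shown above.
// Cell sizes are approximate illustrative values, not Elasticsearch's exact math.
public class PrecisionSketch {
    private static final double[] CELL_SIZE_METERS = {
        5_009_400, 1_252_300, 156_500, 39_100, 4_900, 1_200,
        153, 38.2, 4.77, 1.19, 0.149, 0.0372
    };

    static int checkPrecisionRange(int precision) {
        if (precision < 1 || precision > 12) {
            throw new IllegalArgumentException("Invalid geohash aggregation precision of " + precision);
        }
        return precision;
    }

    // First level whose cell is no larger than the requested distance.
    static int levelsForDistance(double meters) {
        for (int level = 1; level <= 12; level++) {
            if (CELL_SIZE_METERS[level - 1] <= meters) {
                return level;
            }
        }
        return 13; // finer than level 12; rejected by checkPrecisionRange
    }

    public static int parsePrecisionString(String precision) {
        try {
            // integer strings are precision levels, not distances
            return checkPrecisionRange(Integer.parseInt(precision));
        } catch (NumberFormatException e) {
            // naive unit handling for the sketch: accept "Nkm" or "Nm" only
            double meters = precision.endsWith("km")
                ? Double.parseDouble(precision.substring(0, precision.length() - 2)) * 1000
                : Double.parseDouble(precision.substring(0, precision.length() - 1));
            try {
                return checkPrecisionRange(levelsForDistance(meters));
            } catch (IllegalArgumentException e2) {
                throw new IllegalArgumentException("precision too high [" + precision + "]", e2);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(parsePrecisionString("5"));   // a plain level
        System.out.println(parsePrecisionString("1km")); // a distance
    }
}
```

Note that an out-of-range integer such as "13" throws directly from checkPrecisionRange without the "precision too high" wrapping, matching the "let it through" comment in the diff.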
@@ -30,6 +30,7 @@
import org.elasticsearch.search.aggregations.bucket.filter.FiltersAggregator.KeyedFilter;
import org.elasticsearch.search.aggregations.bucket.geogrid.GeoGridAggregationBuilder;
import org.elasticsearch.search.aggregations.bucket.geogrid.GeoHashGrid;
import org.elasticsearch.search.aggregations.bucket.geogrid.GeoHashType;
import org.elasticsearch.search.aggregations.bucket.global.Global;
import org.elasticsearch.search.aggregations.bucket.global.GlobalAggregationBuilder;
import org.elasticsearch.search.aggregations.bucket.histogram.DateHistogramAggregationBuilder;
@@ -237,7 +238,7 @@ public static HistogramAggregationBuilder histogram(String name) {
* Create a new {@link GeoHashGrid} aggregation with the given name.
*/
public static GeoGridAggregationBuilder geohashGrid(String name) {
return new GeoGridAggregationBuilder(name);
return new GeoGridAggregationBuilder(name, GeoHashType.DEFAULT);
}

/**