Add aggregation support for geo_shape fields #50834
Closed
This commit introduces a new data structure, the EdgeTree, together with a reader and writer for its serialized form. This tree is the basis of Polygon trees, which will also contain a representation of any holes in more complex polygons.
The GeometryTree represents an Elasticsearch Geometry object. This includes collections like MultiPoint and GeometryCollection. For the initial implementation, only polygons without holes are supported. In a follow-up PR, the GeometryTree will be the object that interacts with doc-value reading and writing.
- min and max values of coordinates were difficult to track; this fixes that by introducing a new Extent object
- Instead of re-wrapping ByteRef into a StreamInput, a stream input is made once
- a new getExtent() method is introduced for use by aggregations like geo_bounds
- re-use bounding-box containment checks
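As a rough illustration of the min/max tracking described above, here is a minimal sketch of an extent accumulator. The class name `ExtentSketch`, its fields, and its methods are hypothetical and are not the PR's actual `Extent` class; the real implementation may well work on encoded integer coordinates and a different serialization.

```java
/** Hypothetical sketch of a bounding-extent accumulator; not the PR's Extent class. */
final class ExtentSketch {
    // Start "empty": the first added coordinate replaces these sentinels.
    double minX = Double.POSITIVE_INFINITY;
    double minY = Double.POSITIVE_INFINITY;
    double maxX = Double.NEGATIVE_INFINITY;
    double maxY = Double.NEGATIVE_INFINITY;

    /** Widens the extent so that it covers the given coordinate. */
    void addPoint(double x, double y) {
        minX = Math.min(minX, x);
        minY = Math.min(minY, y);
        maxX = Math.max(maxX, x);
        maxY = Math.max(maxY, y);
    }

    /** Widens the extent so that it covers another extent (e.g. a child node). */
    void addExtent(ExtentSketch other) {
        minX = Math.min(minX, other.minX);
        minY = Math.min(minY, other.minY);
        maxX = Math.max(maxX, other.maxX);
        maxY = Math.max(maxY, other.maxY);
    }

    /** True when the two extents share at least one point (edges count). */
    boolean intersects(ExtentSketch other) {
        return minX <= other.maxX && maxX >= other.minX
            && minY <= other.maxY && maxY >= other.minY;
    }

    /** True when the other extent lies fully inside this one. */
    boolean contains(ExtentSketch other) {
        return minX <= other.minX && maxX >= other.maxX
            && minY <= other.minY && maxY >= other.maxY;
    }
}
```

Keeping all four bounds in one object means an aggregation such as geo_bounds only needs to merge extents rather than revisit every coordinate.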
* Add GeometryTree support for point/multipoint

  This commit adds support for MultiPoint and Point shapes to be stored in GeometryTree. To represent the collection of points, a KDbush is used, which is an array sorted recursively by alternating dimensions x/y. This work is inspired by https://github.com/mourner/kdbush

  The purpose of this reader is to check whether any subset of the points in the kd-tree is contained within the bounding-box query.

* unify reader interface and cleanup multipoint usage
* respond to review
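The kd-sorted array idea can be sketched roughly as follows. This is an illustration under assumptions, not the PR's reader or writer: the class and method names are hypothetical, points are plain `double[]{x, y}` pairs, and each range is fully sorted for simplicity, whereas kdbush itself uses a selection step to place only the median.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Hypothetical sketch of a kd-sorted point array with a bounding-box check. */
final class KdPointsSketch {

    /** Reorders points[lo..hi] so the median on `axis` sits in the middle, then recurses on the other axis. */
    static void kdSort(double[][] points, int lo, int hi, int axis) {
        if (hi - lo < 1) {
            return; // zero or one point: nothing to order
        }
        // Simplification: a full sort of the range; a selection algorithm would also work.
        Arrays.sort(points, lo, hi + 1, Comparator.comparingDouble((double[] p) -> p[axis]));
        int mid = (lo + hi) >>> 1;
        kdSort(points, lo, mid - 1, 1 - axis);
        kdSort(points, mid + 1, hi, 1 - axis);
    }

    /** Returns true if any point in points[lo..hi] lies inside the box (edges inclusive). */
    static boolean anyWithin(double[][] points, int lo, int hi, int axis,
                             double minX, double minY, double maxX, double maxY) {
        if (lo > hi) {
            return false;
        }
        int mid = (lo + hi) >>> 1;
        double x = points[mid][0];
        double y = points[mid][1];
        if (x >= minX && x <= maxX && y >= minY && y <= maxY) {
            return true;
        }
        double split = points[mid][axis];
        double queryMin = axis == 0 ? minX : minY;
        double queryMax = axis == 0 ? maxX : maxY;
        // Only descend into a half that the query box can overlap on this axis.
        if (queryMin <= split && anyWithin(points, lo, mid - 1, 1 - axis, minX, minY, maxX, maxY)) {
            return true;
        }
        return queryMax >= split && anyWithin(points, mid + 1, hi, 1 - axis, minX, minY, maxX, maxY);
    }
}
```

A caller would run `kdSort(points, 0, points.length - 1, 0)` once at write time and `anyWithin(points, 0, points.length - 1, 0, ...)` per query, starting on the same axis in both.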
The main change here is that edge-trees originally checked whether the queried extent could be contained within its shape. Since line-strings have no inner boundaries, this check is not useful; the line-crosses check plus the extent bounds check is sufficient.
To aid in keeping aggregation logic as simple as possible, the MultiGeoPointValues object that returns GeoPoint values for fields from doc-values is updated to return implementations of a geo-value object that can represent either points or shapes.
Lucene removed GeoRelationUtils, and so this commit inlines ES's usage of this utility class.
…2020)

* Fix and document tiling semantics for shapes

  This commit resolves an issue in the geogrid shape tiler:

  1. fixes geohash brute-force-tiling to be equivalent to recursive geohash tiling
  2. Resolves geotile tiling so that shapes outside of the geotile bounds are discarded
  3. TriangleTree#relate is changed to be a specific relation against tiles such that intersections of tiles on the southern and western bounds of the shape are counted

* more cleanup
* in silico
* fix a few more edge cases and mute tests for more debugging
  - Extent -> BoundingBox had a bug where 180/-180 and 90/-90 were treated as infinities.
  - awaitfixed a few edge-case tests
  - added muted test for checking that tile hashes of points along a tile reflect the same tiles returned by the tiler's setValues
* fix checkstyle
Due to how geometries are encoded, it is important to compare the bounds of a shape to that of the encoded latitude bounds for geo-tiles.
This commit reflects comments made by Adrien in #50834 surrounding the Extent serialization. It re-orders and negates a few values in order to save more space.
This commit modifies the centroid-calculator/dimensional-shape-type to properly support polygons that have no area and lines that have no length. Previously, N/A was returned for the centroid values, but it is better to downcast the shape type to the appropriate type. Closes #52303
This PR adds support for the `doc_values` field mapping parameter. Both `true` and `false` are supported by the GeoShapeFieldMapper; only `false` is supported by the LegacyGeoShapeFieldMapper. Relates #37206
talevy added a commit that referenced this pull request Feb 24, 2020
This commit reflects comments made by Adrien in #50834 surrounding the Extent serialization. It re-orders and negates a few values in order to save more space.
There are times when small triangles within a polygon have really small areas (1e-11) while the whole polygon's area is zero. This results in an infinite value for the centroid point representing that triangle. This commit ignores the addition of such values. Addresses #52774
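One way to picture "ignoring the addition of such values" is a weighted-centroid accumulator that skips any contribution that is not finite. This is only a sketch under assumptions: the class `CentroidSketch` and its methods are hypothetical, and the PR's centroid calculator may detect and handle the degenerate case differently.

```java
/** Hypothetical sketch of a weighted centroid accumulator that drops non-finite contributions. */
final class CentroidSketch {
    private double sumX;
    private double sumY;
    private double sumWeight;

    /** Adds a component centroid (x, y) with the given weight, e.g. a triangle's area. */
    void add(double x, double y, double weight) {
        double weightedX = x * weight;
        double weightedY = y * weight;
        // Assumption: a contribution that is infinite or NaN carries no usable information, so skip it.
        if (!Double.isFinite(weightedX) || !Double.isFinite(weightedY) || !Double.isFinite(weight)) {
            return;
        }
        sumX += weightedX;
        sumY += weightedY;
        sumWeight += weight;
    }

    /** Weighted mean x; not meaningful if no weight has been accumulated. */
    double x() {
        return sumX / sumWeight;
    }

    /** Weighted mean y; not meaningful if no weight has been accumulated. */
    double y() {
        return sumY / sumWeight;
    }
}
```

With this guard, a single degenerate triangle cannot turn the whole shape's centroid into an infinite or NaN value.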
This PR cleans up some aspects of GeoShapeCellValues to support the specialization of bounded geo_shape geo-grid aggregations. This refactor reverts some of the BoundedCellValues constructs. Instead, BoundedGeoTileGridTiler and BoundedGeoHashGridTiler are introduced. As part of this change, the definition/semantics of geo_grid aggs with bounds on geo_point are modified to match the behavior of geo_shapes, where it is the tile of the point that must intersect the bounds in order for the point to be accounted for.
Closing in favor of individual PRs
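The "tile of the point must intersect the bounds" rule can be sketched for geotiles as follows. This is an illustration, not the PR's BoundedGeoTileGridTiler: the class and method names are hypothetical, the tile math is the standard Web Mercator (slippy map) formula, bounds that cross the dateline are not handled, and treating touching edges as intersecting is an assumption.

```java
/** Hypothetical sketch: a point counts when the tile containing it intersects the bounding box. */
final class BoundedTileSketch {

    /** Column index of the tile containing the longitude at the given zoom. */
    static int xTile(double lon, int zoom) {
        int tiles = 1 << zoom;
        int x = (int) Math.floor((lon + 180.0) / 360.0 * tiles);
        return Math.max(0, Math.min(tiles - 1, x));
    }

    /** Row index (0 at the north) of the tile containing the latitude at the given zoom. */
    static int yTile(double lat, int zoom) {
        int tiles = 1 << zoom;
        double latRad = Math.toRadians(lat);
        double y = (1.0 - Math.log(Math.tan(latRad) + 1.0 / Math.cos(latRad)) / Math.PI) / 2.0 * tiles;
        return Math.max(0, Math.min(tiles - 1, (int) Math.floor(y)));
    }

    /** Longitude of the western edge of tile column x. */
    static double tileLon(int x, int zoom) {
        return x / (double) (1 << zoom) * 360.0 - 180.0;
    }

    /** Latitude of the northern edge of tile row y. */
    static double tileLat(int y, int zoom) {
        double n = Math.PI - 2.0 * Math.PI * y / (double) (1 << zoom);
        return Math.toDegrees(Math.atan(Math.sinh(n)));
    }

    /** True when the tile containing (lat, lon) overlaps the box given by top/left/bottom/right. */
    static boolean pointCounts(double lat, double lon, int zoom,
                               double top, double left, double bottom, double right) {
        int x = xTile(lon, zoom);
        int y = yTile(lat, zoom);
        double west = tileLon(x, zoom);
        double east = tileLon(x + 1, zoom);
        double north = tileLat(y, zoom);
        double south = tileLat(y + 1, zoom);
        return west <= right && east >= left && south <= top && north >= bottom;
    }
}
```

Under this rule a point outside the bounds can still be counted, as long as the tile it falls into overlaps the bounds, which mirrors how a shape's tiles are matched against the bounds.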
This PR introduces doc-values support for `geo_shape` fields. This includes aggregation support for the following existing `geo_point` aggregations:

The `geo_distance` aggregation is the only one not supported in this PR. Scripting support is also not implemented.
Closes #37206.