-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bounds inflation unit tests #92901
Bounds inflation unit tests #92901
Conversation
Pinging @elastic/es-analytics-geo (Team:Analytics) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See my comment, you need to be very careful with bounding boxes crossing the dateline.
final double maxY = Math.min(wAvg(bbox.top(), boundsOfCells.getMaxY(), factor), 90d); | ||
final double left = GeoUtils.normalizeLon(wAvg(bbox.left(), boundsOfCells.getMinX(), factor)); | ||
final double right = GeoUtils.normalizeLon(wAvg(bbox.right(), boundsOfCells.getMaxX(), factor)); | ||
if (right - left >= 360d) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't look right as we are normalizing the latitudes, it will always be false?
This is one of the tricky parts here. In the current approach we compute the width taking into account that the bounding box might cross the dateline.
Then we can compute if inflating the bounding box will actually exceed 360d:
2 * factor * width + width(bbox) >= 360d
In your approach, detecting this case does not seem trivial.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I've since written some unit tests, and they do indeed fail. I'll see if the fix is easy enough.
781ab72
to
30a4319
Compare
The BoundedGeoHexGridTiler inflates the bounding box for parent cell searches since the parent cells bounds do not match the descendents. This algorithm does a slightly more refined calculation of the inflation factor by considering all four corners and inflating in the four directions independently.
30a4319
to
4d98839
Compare
The new algorithm was faster than the old for the same inflation factor but had more failures, and a longer tail of failures (which needed higher inflation factors). So it would take a lot more work (probably) to find and fix these edge cases. An easier option is to keep the original algorithm.
This increases performance by about 10% (a factor of 0.22 increased performance by about 13%). We found the error rate for `testGeoShapeGeoHex` was about 0.5% for inflation factor 0.19 and 0.2% for inflation factor 0.22. These last 2/1000 failed tests required inflation factors of 0.27 0.30 respectively, so we'll investigate them individually.
Since there were three failures discovered for 0.25, all of which passed at 0.30, we decided to set the inflation factor to 0.35 initially for the sake of caution, and consider revising it down again later with more careful testing.
* This is done by taking the H3 cells at two corners, and expanding the filter width | ||
* by 50% of the max width of those cells, and filter height by 50% of the max height of those cells. | ||
* | ||
* The inflation factor of 50% has been verified using test GeoHexTilerTests#testLargeShapeWithBounds |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment is not correct any longer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you are talking about the last line, I changed that from tuned
to verified
, as I thought that was correct. But would you prefer we remove the line entirely?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking more that the inflation factor is not 50% any more but 35%?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, of course. I fixed that now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Originally this PR added an alternative algorithm to BoundedGeoHexGridTiler, but performance tests showed that it was not actually better than the original, so now the PR only adds a number of unit tests.