Commit
Fix an issue with tdigest merge aggregations. (#10506)
A case was found where merging tdigests that contain very small numbers of centroids with high weight could generate an invalid tdigest. The issue was in the "gap fixup" code during the cluster generation step.

The diff here is unfortunate, but the change is fundamentally: make sure we run through the gap-fixing code during the first pass (where we're just counting the number of buckets), and update the nearest weight (nearest_w) variable representing the centroid we just bucketed.

<s>Leaving a Do Not Merge tag on here for now to get confirmation of fix from Spark team.</s>

Authors:
- https://github.com/nvdbaranec

Approvers:
- Yunsong Wang (https://github.com/PointKernel)
- MithunR (https://github.com/mythrocks)

URL: #10506
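To make the failure mode concrete, here is a minimal Python sketch of a two-pass "count, then fill" cluster-limit computation. It is not the cuDF kernel: the function name `cluster_limits`, the uniform scale function, and the variable `last_w` are all assumptions made for illustration. It only shows why the counting pass has to run the same gap fixup as the filling pass when a few heavy centroids leave most candidate limits with no new centroid to bucket.

```python
import bisect

def cluster_limits(cum_weights, max_clusters, out=None):
    """Count cluster limits; if `out` is a list, also append each limit to it.

    cum_weights: sorted cumulative centroid weights (hypothetical input layout).
    The same loop serves both passes, so the count always matches the fill.
    """
    total = cum_weights[-1]
    count = 0
    last_w = 0.0  # cumulative weight of the centroid we most recently bucketed
    for i in range(1, max_clusters + 1):
        # Hypothetical uniform scale function; a real t-digest uses a
        # non-uniform one, which does not change the point being made.
        limit = total * i / max_clusters
        # Cumulative weight of the first centroid that reaches this limit.
        idx = min(bisect.bisect_left(cum_weights, limit), len(cum_weights) - 1)
        nearest_w = cum_weights[idx]
        # Gap fixup: this limit would produce a cluster with no new centroid,
        # so skip it. This must happen in the counting pass as well.
        if nearest_w <= last_w:
            continue
        if out is not None:
            out.append(nearest_w)
        count += 1
        last_w = nearest_w  # remember the centroid we just bucketed
    return count

# Three heavy centroids, ten candidate clusters.
weights = [100.0, 200.0, 300.0]
n_counted = cluster_limits(weights, 10)       # first pass: count only
limits = []
n_written = cluster_limits(weights, 10, limits)  # second pass: fill
```

In this sketch both passes agree on three limits. If the counting pass skipped the fixup branch or never updated `last_w`, it would report ten buckets while the fill pass wrote three, which is the kind of mismatch the commit message describes.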
1 parent 54918d8, commit a4c450b
Showing 2 changed files with 120 additions and 40 deletions.