Uses MergingDigest instead of AVLDigest in percentiles agg #28702
In addition to the benchmark I performed above, @danielmitterdorfer wrote a JMH benchmark using the same methodology and ran it on our microbenchmark infrastructure. These benchmarks show a smaller difference between the two algorithms, but MergingDigest is still 2 to 10 times faster than AVLTreeDigest.
It should be noted that
@elastic/es-search-aggs
Things still to do on this PR:
Since this PR is superseded by #35182, I'm going to close it.
This change modifies TDigestState to use MergingDigest instead of AVLTreeDigest. Benchmarks comparing the insertion performance of the two implementations show that MergingDigest is roughly 2 to 270 times faster than AVLTreeDigest in the tested scenarios. Details of the benchmark are below.
The change is straightforward, but one thing is still outstanding: a test that verifies the assumption that serialisation is compatible between older versions using AVLTreeDigest and newer versions using MergingDigest after this change. In theory this should be covered by the bwc tests, but I would like to add an explicit unit test to check it as well.
The benchmark was performed on the raw classes outside of Elasticsearch. Three scenarios were tested, but in each case the test was warmed up by inserting 10,000 values into each implementation over 10 warm up runs, and then measurements were taken inserting 1,000,000 values over 100 test runs for each implementation. For each test run the same values were used on both implementations. The compression was 100.0 in all tests. Measurements were taken by calling `System.nanoTime()` before and after inserting the 1,000,000 values. Measurements were combined into a mean average for the final results.

Scenarios:

1. Random Value Doubles - Test values were doubles taken from `java.util.Random.nextDouble()`. For each test run the Random instance was seeded with the same seed for both implementations but the seed was changed for each test run.
2. Sequential Integer Values As Doubles - Test values were taken from an incrementing long variable and then converted to a double using `Double.valueOf(long)`. The long variable was reset for every test run and between testing each implementation.
3. Repeated Same Value Doubles - Test values were taken from `java.util.Random.nextDouble()` at the beginning of a test run and this same value was repeatedly inserted into the implementation under test. The value was kept the same between the two implementations but a new value was obtained between test runs.

Results
Random Value Doubles
Average AVLTreeDigest (ns): 2.4393659506E8
Average AVLTreeDigest (ms): 243.93659506
Average MergingDigest (ns): 1.1975786487E8
Average MergingDigest (ms): 119.75786487
Average MergingDigest / Average AVLTreeDigest (%): 49.09384950648495
Average AVLTreeDigest / Average MergingDigest (raw value): 2.036915031215686
Sequential Integer Values As Doubles
Average AVLTreeDigest (ns): 9.4348391258E8
Average AVLTreeDigest (ms): 943.48391258
Average MergingDigest (ns): 1.0778998414E8
Average MergingDigest (ms): 107.78998414
Average MergingDigest / Average AVLTreeDigest (%): 11.424676425615287
Average AVLTreeDigest / Average MergingDigest (raw value): 8.752983128326491
Repeated Same Value Doubles
Average AVLTreeDigest (ns): 7.421614153E9
Average AVLTreeDigest (ms): 7421.614153
Average MergingDigest (ns): 2.755736346E7
Average MergingDigest (ms): 27.55736346
Average MergingDigest / Average AVLTreeDigest (%): 0.3713122629645282
Average AVLTreeDigest / Average MergingDigest (raw value): 269.315102069637521
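
For anyone wanting to reproduce numbers like these, the methodology described above can be sketched roughly as follows. This is not the code that was actually run: the class name `DigestInsertBenchmark`, its method names, and the choice to create a fresh digest for every run are assumptions made for illustration. The digest under test is represented by a plain `DoubleConsumer`, so the sketch compiles with the JDK alone, and `main` uses trivial stand-in sinks instead of the real implementations.

```java
import java.util.Arrays;
import java.util.Random;
import java.util.function.DoubleConsumer;
import java.util.function.Supplier;

// Hypothetical harness; names and structure are illustrative, not taken from the PR.
final class DigestInsertBenchmark {

    enum Scenario { RANDOM, SEQUENTIAL, REPEATED }

    /** Builds the input for one run. The same seed always yields the same values. */
    static double[] values(Scenario scenario, int count, long seed) {
        Random random = new Random(seed);
        double[] out = new double[count];
        switch (scenario) {
            case RANDOM:
                for (int i = 0; i < count; i++) out[i] = random.nextDouble();
                break;
            case SEQUENTIAL:
                long next = 0; // incrementing long, restarted for every run
                for (int i = 0; i < count; i++) out[i] = (double) next++;
                break;
            case REPEATED:
                Arrays.fill(out, random.nextDouble()); // one value per run, inserted repeatedly
                break;
        }
        return out;
    }

    /** Times one run by reading System.nanoTime() before and after inserting every value. */
    static long timeRun(DoubleConsumer digest, double[] values) {
        long start = System.nanoTime();
        for (double v : values) digest.accept(v);
        return System.nanoTime() - start;
    }

    /** Returns {mean ns for baseline, mean ns for candidate} over the measured runs. */
    static double[] compare(Supplier<DoubleConsumer> baseline, Supplier<DoubleConsumer> candidate,
                            Scenario scenario, int warmupRuns, int warmupValues,
                            int testRuns, int testValues) {
        for (int run = 0; run < warmupRuns; run++) { // warm-up runs are not measured
            double[] v = values(scenario, warmupValues, run);
            timeRun(baseline.get(), v);
            timeRun(candidate.get(), v);
        }
        long baselineTotal = 0;
        long candidateTotal = 0;
        for (int run = 0; run < testRuns; run++) {
            double[] v = values(scenario, testValues, run); // same values for both, new seed per run
            baselineTotal += timeRun(baseline.get(), v);
            candidateTotal += timeRun(candidate.get(), v);
        }
        return new double[] { (double) baselineTotal / testRuns, (double) candidateTotal / testRuns };
    }

    /** Candidate time as a percentage of baseline time. */
    static double percent(double candidateMean, double baselineMean) {
        return 100.0 * candidateMean / baselineMean;
    }

    /** How many times faster the candidate is than the baseline. */
    static double speedup(double baselineMean, double candidateMean) {
        return baselineMean / candidateMean;
    }

    public static void main(String[] args) {
        // Stand-in sinks (assumption) so the sketch runs without the t-digest library.
        Supplier<DoubleConsumer> baseline = () -> { double[] sum = {0}; return v -> sum[0] += v; };
        Supplier<DoubleConsumer> candidate = () -> { double[] max = {0}; return v -> max[0] = Math.max(max[0], v); };
        for (Scenario scenario : Scenario.values()) {
            double[] means = compare(baseline, candidate, scenario, 10, 10_000, 100, 1_000_000);
            System.out.println(scenario);
            System.out.println("Average baseline (ms): " + means[0] / 1e6);
            System.out.println("Average candidate (ms): " + means[1] / 1e6);
            System.out.println("Average candidate / Average baseline (%): " + percent(means[1], means[0]));
            System.out.println("Average baseline / Average candidate (raw value): " + speedup(means[0], means[1]));
        }
    }
}
```

To measure the real classes, the stand-in suppliers would presumably be replaced with something like `() -> new AVLTreeDigest(100.0)::add` and `() -> new MergingDigest(100.0)::add`, assuming the t-digest library is on the classpath. The sketch passes 10,000 as the number of values per warm-up run, which is only one reading of the description above.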
Closes #19528