-
Notifications
You must be signed in to change notification settings - Fork 8.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HDFS-16949 Introduce inverse quantiles for metrics where higher numer… #5486
Conversation
…ic value is better
@@ -52,6 +52,7 @@ public class MutableQuantiles extends MutableMetric { | |||
new Quantile(0.75, 0.025), new Quantile(0.90, 0.010), | |||
new Quantile(0.95, 0.005), new Quantile(0.99, 0.001) }; | |||
|
|||
protected boolean inverseQuantiles = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
private final boolean
You will also need to set it in the constructor.
For inverse quantiles higher numeric value is better and hence we want | ||
to query from the opposite end of the sorted sample | ||
*/ | ||
double effectiveQuantile = inverseQuantiles ? 1 - quantiles[i].quantile : quantiles[i].quantile; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this come after the next line?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is to pass the effectiveQuantile to the query function hence calculated before.
@@ -243,7 +245,12 @@ synchronized public Map<Quantile, Long> snapshot() { | |||
|
|||
Map<Quantile, Long> values = new TreeMap<Quantile, Long>(); | |||
for (int i = 0; i < quantiles.length; i++) { | |||
values.put(quantiles[i], query(quantiles[i].quantile)); | |||
/* eg : effectiveQuantile for 0.99 with inverseQuantiles will be 0.01. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Leaning into OOO: I might make an inversequantile class that overrides this function to keep it simple rather than overloading quantile with multiple definitions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated
to query from the opposite end of the sorted sample | ||
*/ | ||
double effectiveQuantile = inverseQuantiles ? 1 - quantiles[i].quantile : quantiles[i].quantile; | ||
values.put(quantiles[i], query(effectiveQuantile)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't reversing list order traversal give you the same thing wihtout altering the math?
c69b9ca
to
c37a41f
Compare
c37a41f
to
a66f26a
Compare
* @param quantile Queried quantile, e.g. 0.50 or 0.99. | ||
* @return Estimated value at the inverse position of that quantile. | ||
*/ | ||
long query(double quantile) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
im assuming this logic looks similar to the SampleQuantile, is there any small refactor we could do to DRY, then maybe a couple lines below looks like
int desired = getQuantile(quantile)
with a default and overridden implementation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed, that will be much cleaner. Let me make the change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Minimal change here is to inverse list-order traversal.
- The for loop is performing a Map operation.
The DRY method would be to encapsulate the for loop to a protected function "getQuantilesFromSamples".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated to reverse the traversal order.
Overriding query()
method instead of creating a new method and moved out the common validation step outside.
@@ -118,4 +118,36 @@ public void testQuantileError() throws IOException { | |||
} | |||
} | |||
} | |||
|
|||
@Test | |||
public void testInverseQuantiles() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: can you just add a description of what the test is doing
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are trailing whitespace errors.
@@ -104,8 +105,7 @@ public MutableQuantiles(String name, String description, String sampleName, | |||
String.format(descTemplate, percentile)); | |||
} | |||
|
|||
estimator = new SampleQuantiles(quantiles); | |||
|
|||
estimator = inverseQuantiles ? new InverseQuantiles(quantiles) : new SampleQuantiles(quantiles); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is estimator variable used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, estimator is the one which will sort the sample and populate quantiles.
* @param quantile Queried quantile, e.g. 0.50 or 0.99. | ||
* @return Estimated value at the inverse position of that quantile. | ||
*/ | ||
long query(double quantile) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Minimal change here is to inverse list-order traversal.
- The for loop is performing a Map operation.
The DRY method would be to encapsulate the for loop to a protected function "getQuantilesFromSamples".
54f9851
to
4c3d079
Compare
💔 -1 overall
This message was automatically generated. |
4c3d079
to
a30980e
Compare
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
Given that we merged #5495 should we close this one? |
Closing as a different approach was merged in PR #5495 |
…ic value is better
Description of PR
Currently quantiles are used for latencies, where lower numeric value is better.
Hence p90 gives us a value val(p90) such that 90% of our sample set has a value better (lower) than val(p90)
However for metrics such as calculating transfer rates (eg : HDFS-16917 ) higher numeric value is better. Thus for such metrics the current quantiles dont work.
For these metrics in order for p90 to give a value val(p90) where 90% of the sample set is better (higher) than val(p90) we need to inverse the selection by choosing a value at the (100 - 90)th location instead of the usual 90th position.
Note: There will be error guarantees for percentiles as our quantile implementation following Cormode, Korn, Muthukrishnan, and Srivastava algorithm
How was this patch tested?
Results from UT testInverseQuantiles()
Starting run 0
For quantile 0.500000 Expected 50000 with error 5000, estimated 50548
For quantile 0.750000 Expected 25000 with error 7500, estimated 27257
For quantile 0.900000 Expected 9999 with error 9000, estimated 12412
For quantile 0.950000 Expected 5000 with error 9500, estimated 8169
For quantile 0.990000 Expected 1000 with error 9900, estimated 6272
Starting run 1
For quantile 0.500000 Expected 50000 with error 5000, estimated 50896
For quantile 0.750000 Expected 25000 with error 7500, estimated 26421
For quantile 0.900000 Expected 9999 with error 9000, estimated 12908
For quantile 0.950000 Expected 5000 with error 9500, estimated 8101
For quantile 0.990000 Expected 1000 with error 9900, estimated 6099
Starting run 2
For quantile 0.500000 Expected 50000 with error 5000, estimated 50945
For quantile 0.750000 Expected 25000 with error 7500, estimated 26571
For quantile 0.900000 Expected 9999 with error 9000, estimated 12159
For quantile 0.950000 Expected 5000 with error 9500, estimated 7951
For quantile 0.990000 Expected 1000 with error 9900, estimated 5624
Starting run 3
For quantile 0.500000 Expected 50000 with error 5000, estimated 51442
For quantile 0.750000 Expected 25000 with error 7500, estimated 27493
For quantile 0.900000 Expected 9999 with error 9000, estimated 12775
For quantile 0.950000 Expected 5000 with error 9500, estimated 8242
For quantile 0.990000 Expected 1000 with error 9900, estimated 5967
Starting run 4
For quantile 0.500000 Expected 50000 with error 5000, estimated 50473
For quantile 0.750000 Expected 25000 with error 7500, estimated 25832
For quantile 0.900000 Expected 9999 with error 9000, estimated 11593
For quantile 0.950000 Expected 5000 with error 9500, estimated 7461
For quantile 0.990000 Expected 1000 with error 9900, estimated 5454
Starting run 5
For quantile 0.500000 Expected 50000 with error 5000, estimated 50826
For quantile 0.750000 Expected 25000 with error 7500, estimated 26946
For quantile 0.900000 Expected 9999 with error 9000, estimated 12894
For quantile 0.950000 Expected 5000 with error 9500, estimated 8087
For quantile 0.990000 Expected 1000 with error 9900, estimated 6481
Starting run 6
For quantile 0.500000 Expected 50000 with error 5000, estimated 51023
For quantile 0.750000 Expected 25000 with error 7500, estimated 26468
For quantile 0.900000 Expected 9999 with error 9000, estimated 12364
For quantile 0.950000 Expected 5000 with error 9500, estimated 7939
For quantile 0.990000 Expected 1000 with error 9900, estimated 6009
Starting run 7
For quantile 0.500000 Expected 50000 with error 5000, estimated 50793
For quantile 0.750000 Expected 25000 with error 7500, estimated 26849
For quantile 0.900000 Expected 9999 with error 9000, estimated 13093
For quantile 0.950000 Expected 5000 with error 9500, estimated 8006
For quantile 0.990000 Expected 1000 with error 9900, estimated 5701
Starting run 8
For quantile 0.500000 Expected 50000 with error 5000, estimated 51106
For quantile 0.750000 Expected 25000 with error 7500, estimated 26671
For quantile 0.900000 Expected 9999 with error 9000, estimated 12757
For quantile 0.950000 Expected 5000 with error 9500, estimated 8537
For quantile 0.990000 Expected 1000 with error 9900, estimated 6224
Starting run 9
For quantile 0.500000 Expected 50000 with error 5000, estimated 50825
For quantile 0.750000 Expected 25000 with error 7500, estimated 27698
For quantile 0.900000 Expected 9999 with error 9000, estimated 11594
For quantile 0.950000 Expected 5000 with error 9500, estimated 8113
For quantile 0.990000 Expected 1000 with error 9900, estimated 5475
For code changes:
LICENSE
,LICENSE-binary
,NOTICE-binary
files?