-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Tests] Relax allowed delta in extended_stats aggregation #27171
[Tests] Relax allowed delta in extended_stats aggregation #27171
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue appears only when we add an incremental reduce in InternalAggregationTestCase#testReduceRandom
.
In this case the internal aggregations are shuffled and we do the reduce in two steps. This breaks the global ordering of the reduce and causes all sorts of issues when using doubles. Could we instead fix the base class to preserve the order like this:
diff --git a/test/framework/src/main/java/org/elasticsearch/test/InternalAggregationTestCase.java b/test/framework/src/main/java/org/elasticsearch/test/InternalAggregationTestCase.java
index fa342ee468..ee258a8c06 100644
--- a/test/framework/src/main/java/org/elasticsearch/test/InternalAggregationTestCase.java
+++ b/test/framework/src/main/java/org/elasticsearch/test/InternalAggregationTestCase.java
@@ -231,15 +231,14 @@ public abstract class InternalAggregationTestCase<T extends InternalAggregation>
MockBigArrays bigArrays = new MockBigArrays(Settings.EMPTY, new NoneCircuitBreakerService());
if (randomBoolean() && toReduce.size() > 1) {
// sometimes do an incremental reduce
- Collections.shuffle(toReduce, random());
int r = randomIntBetween(1, toReduceSize);
List<InternalAggregation> internalAggregations = toReduce.subList(0, r);
InternalAggregation.ReduceContext context =
new InternalAggregation.ReduceContext(bigArrays, mockScriptService, false);
@SuppressWarnings("unchecked")
T reduced = (T) inputs.get(0).reduce(internalAggregations, context);
- toReduce = new ArrayList<>(toReduce.subList(r, toReduceSize));
- toReduce.add(reduced);
+ toReduce = toReduce.subList(r, toReduceSize);
+ toReduce.add(0, reduced);
}
InternalAggregation.ReduceContext context =
new InternalAggregation.ReduceContext(bigArrays, mockScriptService, true);
This way we can change all the deltas to 0
and remove the true negatives that could hide a bug ?
6da39f2
to
c7887cf
Compare
@jimczi thanks for the review, I changed the PR according to your suggestions by removing the random shuffling. However, looking at the commit that first introduced the randomization (f933f80#diff-8e77ff5eaf3bae1ff401adfc0507766cR60), I'm wondering if we are still testing the right thing here then. When I read that commit right, this branch of the reduce test is there to simulate the behaviour of the aggregtion reduce phase in a real cluster where some shard aggregation results are reduced before others, which also breaks ordering. So getting the small deltas when doing this for many double values is something we also see in the real system behaviour, shouldn't we try to account for that in tests also? To me e.g. it was important to see in tests that changing the ordering here affects the result slightly, something that we also need to explain (and detect if it changes significantly). I'm happy to merge, but I'm rather for allowing small deltas here and be aware of the behaviour that I think also happens in non-unit test scenarios. |
c7887cf
to
2559164
Compare
@jimczi could you take another look at this? I changed the tests according to your suggestions, but the more I think about it I'd prefer going back to the previous option of having the randomization and allowing (and testing) small deviations). Maybe we should get another opinion on this? That said, I'm fine either way, just let me know what you think about my reasons stated above. |
I don't have a strong opinion on this but I don't like that we need a delta to test a simple sum. |
I completely agree, but IMHO the purpose of this test is not just testing a sum but to test the behaviour we get on a real multi-shard aggregation where we get those fluctuations with incremental reduction.
I don't think this works because internally we use double values in the aggregations. I will try though. |
Unfortunately this doesn't seem to work, we store the |
Yes sorry I meant small integers but this is also broken. We already bound the doubles that we randomly create to -1M, 1M so I am ok with the small delta change to fix this issue for the moment. |
The order in which double values are added in java can give different results for the sum, so we need to allow a certain delta in the test assertions. The current value was still a bit too low, which manifested itself in occasional test failures.
2559164
to
8e4a2d2
Compare
The order in which double values are added in java can give different results for the sum, so we need to allow a certain delta in the test assertions. The current value was still a bit too low, which manifested itself in occasional test failures.
The order in which double values are added in java can give different results for the sum, so we need to allow a certain delta in the test assertions. The current value was still a bit too low, which manifested itself in occasional test failures.
* es/master: (24 commits) Reduce synchronization on field data cache add json-processor support for non-map json types (#27335) Properly format IndexGraveyard deletion date as date (#27362) Upgrade AWS SDK Jackson Databind to 2.6.7.1 Stop responding to ping requests before master abdication (#27329) [Test] Fix POI version in packaging tests Allow affix settings to specify dependencies (#27161) Tests: Improve size regex in documentation test (#26879) reword comment Remove unnecessary logger creation for doc values field data [Geo] Decouple geojson parse logic from ShapeBuilders [DOCS] Fixed link to docker content Plugins: Add versionless alias to all security policy codebase properties (#26756) [Test] #27342 Fix SearchRequests#testValidate [DOCS] Move X-Pack-specific Docker content (#27333) Fail queries with scroll that explicitely set request_cache (#27342) [Test] Fix S3BlobStoreContainerTests.testNumberOfMultiparts() Set minimum_master_nodes to all nodes for REST tests (#27344) [Tests] Relax allowed delta in extended_stats aggregation (#27171) Remove S3 output stream (#27280) ...
The order in which double values are added in java can give different results
for the sum, so we need to allow a certain delta in the test assertions. The
current value was still a bit too low, which manifested itself in occasional
test failures.
e.g. the CI failure that triggered this PR:
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+periodic/4835/testReport/junit/org.elasticsearch.search.aggregations.metrics/InternalExtendedStatsTests/testReduceRandom/