[CI] Failure in org.elasticsearch.search.query.QueryPhaseTests.testIndexHasDuplicateData #49703

Closed
dliappis opened this issue Nov 29, 2019 · 3 comments · Fixed by #49786
Labels
:Search/Search Search-related issues that do not fall into other categories >test-failure Triaged test failures from CI

Comments

@dliappis
Contributor

#testIndexHasDuplicateData and specifically this assertion has been failing since fa8b48d, as seen for example in this scan: https://gradle-enterprise.elastic.co/s/k3vldmv6cz4ac

    public void testIndexHasDuplicateData() throws IOException {
        int docsCount = 7000;
        int duplIndex = docsCount * 7 / 10;
        int duplIndex2 = docsCount * 3 / 10;
        long duplicateValue = randomLongBetween(-10000000L, 10000000L);
        Directory dir = newDirectory();
        IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig(null));
        for (int docId = 0; docId < docsCount; docId++) {
            Document doc = new Document();
            long rndValue = randomLongBetween(-10000000L, 10000000L);
            long value = (docId < duplIndex) ? duplicateValue : rndValue;
            long value2 = (docId < duplIndex2) ? duplicateValue : rndValue;
            doc.add(new LongPoint("duplicateField", value));
            doc.add(new LongPoint("notDuplicateField", value2));
            writer.addDocument(doc);
        }
        writer.close();
        final IndexReader reader = DirectoryReader.open(dir);
        boolean hasDuplicateData = indexFieldHasDuplicateData(reader, "duplicateField");
        boolean hasDuplicateData2 = indexFieldHasDuplicateData(reader, "notDuplicateField");
        reader.close();
        dir.close();
        assertTrue(hasDuplicateData); // <---- this trips
        assertFalse(hasDuplicateData2);
    }
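
For readers skimming the test, the arithmetic behind the two fields can be sketched as follows. This is only an illustration: the class and method names are hypothetical, and the "more than half of the documents" threshold is an assumption about what `indexFieldHasDuplicateData` is meant to detect, inferred from the test expecting `true` for a 70% share and `false` for a 30% share.

```java
public class DuplicateLayout {
    // Number of leading documents that receive the shared value, using the
    // same integer arithmetic as the test (docsCount * numerator / denominator).
    static int duplicatedDocs(int docsCount, int numerator, int denominator) {
        return docsCount * numerator / denominator;
    }

    // ASSUMPTION (not taken from the issue): a field counts as having
    // duplicate data when more than half of its documents share one value.
    static boolean majoritySharesOneValue(int docsCount, int duplicated) {
        return duplicated > docsCount / 2;
    }

    public static void main(String[] args) {
        int docsCount = 7000;
        int duplIndex = duplicatedDocs(docsCount, 7, 10);   // 4900
        int duplIndex2 = duplicatedDocs(docsCount, 3, 10);  // 2100

        System.out.println("duplicateField: " + duplIndex + " of " + docsCount
            + " docs share one value, majority=" + majoritySharesOneValue(docsCount, duplIndex));
        System.out.println("notDuplicateField: " + duplIndex2 + " of " + docsCount
            + " docs share one value, majority=" + majoritySharesOneValue(docsCount, duplIndex2));
    }
}
```

Under that assumption the expected outcome of the two assertions follows directly from the 4900 / 2100 split, so a failure of `assertTrue(hasDuplicateData)` points at the detection step rather than at the generated data.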

I am able to reproduce this locally using:

    ./gradlew ':server:test' --tests "org.elasticsearch.search.query.QueryPhaseTests.testIndexHasDuplicateData" -Dtests.seed=EA6BD7DFEF354E5A -Dtests.security.manager=true -Dtests.locale=en-ZA -Dtests.timezone=Etc/GMT+0 -Dcompiler.java=12

@dliappis dliappis added :Search/Search Search-related issues that do not fall into other categories >test-failure Triaged test failures from CI labels Nov 29, 2019
@elasticmachine
Collaborator

Pinging @elastic/es-search (:Search/Search)

@spinscale
Contributor

Another failing seed (under OS X for me):

    ./gradlew ':server:test' --tests "org.elasticsearch.search.query.QueryPhaseTests.testIndexHasDuplicateData" -Dtests.seed=7D0D1C76527A8E17 -Dtests.security.manager=true -Dtests.locale=zh-HK -Dtests.timezone=America/Porto_Velho -Dcompiler.java=12

@andreidan
Contributor

Another failing seed:

    ./gradlew ':server:test' --tests "org.elasticsearch.search.query.QueryPhaseTests.testIndexHasDuplicateData" \
      -Dtests.seed=A6A1AFA6B34C16A4 \
      -Dtests.security.manager=true \
      -Dtests.locale=en-US \
      -Dtests.timezone=Singapore \
      -Dcompiler.java=12

mayya-sharipova added a commit to mayya-sharipova/elasticsearch that referenced this issue Dec 2, 2019
mayya-sharipova added a commit that referenced this issue Dec 2, 2019
mayya-sharipova added a commit that referenced this issue Dec 2, 2019
mayya-sharipova added a commit to mayya-sharipova/elasticsearch that referenced this issue Dec 2, 2019
testIndexHasDuplicateData tests were failing occasionally
due to the approximate calculation in BKDReader.estimatePointCount,
where, if the node is a leaf, the number of points in it
is estimated as (maxPointsInLeafNode + 1) / 2.
As DEFAULT_MAX_POINTS_IN_LEAF_NODE = 1024, for the small indexes
used in tests the estimate could be far off.

This rewrites the tests to set the max points in a leaf node to
a small value, so that the tests are under control.

Closes elastic#49703
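
The per-leaf arithmetic described in the commit message can be sketched as below. This is a rough illustration, not the Lucene implementation: the class and method names are hypothetical, the small leaf size of 40 is an arbitrary example value rather than the one used in the fix, and the worst-case bound assumes a leaf may contain anywhere between 0 and `maxPointsInLeafNode` matching points.

```java
public class LeafEstimate {
    // Per-leaf estimate as stated in the commit message:
    // (maxPointsInLeafNode + 1) / 2, with integer division.
    static long estimatedPointsInLeaf(int maxPointsInLeafNode) {
        return (maxPointsInLeafNode + 1) / 2;
    }

    // ASSUMPTION: the true number of matching points in a leaf lies anywhere
    // in [0, maxPointsInLeafNode], so the estimate can be off by this much.
    static long worstCaseErrorPerLeaf(int maxPointsInLeafNode) {
        long estimate = estimatedPointsInLeaf(maxPointsInLeafNode);
        return Math.max(estimate, maxPointsInLeafNode - estimate);
    }

    public static void main(String[] args) {
        int docsCount = 7000;        // index size used by the test
        int defaultLeafSize = 1024;  // DEFAULT_MAX_POINTS_IN_LEAF_NODE from the commit message
        int smallLeafSize = 40;      // hypothetical small value, for comparison only

        for (int leafSize : new int[] {defaultLeafSize, smallLeafSize}) {
            long error = worstCaseErrorPerLeaf(leafSize);
            System.out.println("maxPointsInLeafNode=" + leafSize
                + " estimate=" + estimatedPointsInLeaf(leafSize)
                + " worstCaseErrorPerLeaf=" + error
                + " shareOfIndex=" + (100.0 * error / docsCount) + "%");
        }
    }
}
```

With the default leaf size, a single leaf can be misjudged by up to 512 points, a noticeable share of a 7000-document index; with a small leaf size the per-leaf error shrinks accordingly, which is the motivation the commit message gives for controlling the value in tests.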
mayya-sharipova added a commit to mayya-sharipova/elasticsearch that referenced this issue Dec 30, 2019
SivagurunathanV pushed a commit to SivagurunathanV/elasticsearch that referenced this issue Jan 23, 2020
mayya-sharipova added a commit that referenced this issue Mar 19, 2020
mayya-sharipova added a commit that referenced this issue Mar 19, 2020
5 participants