[BUG] Fix org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=search/110_field_collapsing/field collapsing and search_after} flaky test #7873

Closed
reta opened this issue Jun 1, 2023 · 7 comments · Fixed by #7988
Assignees
Labels
bug (Something isn't working), flaky-test (Random test failure that succeeds on second run), Search (Search query, autocomplete ...etc)

Comments

@reta
Collaborator

reta commented Jun 1, 2023

Describe the bug
The org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=search/110_field_collapsing/field collapsing and search_after} test case is flaky.

java.lang.AssertionError: Failure at [search/110_field_collapsing:254]: expected [4xx|5xx] status code but api [search] returned [200 OK] [{"took":3,"timed_out":false,"_shards":{"total":2,"successful":1,"skipped":1,"failed":1,"failures":[{"shard":0,"index":"test","node":"Y3WKi7AWToWkPSGJCPpURA","reason":{"type":"search_exception","reason":"cannot use `collapse` in conjunction with `search_after`"}}]},"hits":{"total":0,"max_score":0.0,"hits":[]}}]
	at __randomizedtesting.SeedInfo.seed([9A807F988B26D31A:12D4404225DABEE2]:0)
	at org.opensearch.test.rest.yaml.OpenSearchClientYamlSuiteTestCase.executeSection(OpenSearchClientYamlSuiteTestCase.java:460)
	at org.opensearch.test.rest.yaml.OpenSearchClientYamlSuiteTestCase.test(OpenSearchClientYamlSuiteTestCase.java:433)
	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104)
	at java.base/java.lang.reflect.Method.invoke(Method.java:578)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1750)

To Reproduce

./gradlew ':qa:mixed-cluster:v2.9.0#mixedClusterTest' --tests "org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=search/110_field_collapsing/field collapsing and search_after}" -Dtests.seed=9A807F988B26D31A

Expected behavior
Test should always pass

Plugins
Standard


Host/Environment (please complete the following information):

  • CI

Additional context

@reta reta added bug Something isn't working flaky-test Random test failure that succeeds on second run labels Jun 1, 2023
@reta reta self-assigned this Jun 1, 2023
@andrross andrross added the Search Search query, autocomplete ...etc label Jun 1, 2023
@sejli sejli removed the untriaged label Jun 7, 2023
@sejli sejli moved this from 🆕 New to 🏗 In progress in Search Project Board Jun 8, 2023
@mingshl
Contributor

mingshl commented Jun 8, 2023

I tried to reproduce this on the main branch, but the test passes:

./gradlew ':qa:mixed-cluster:v2.9.0#mixedClusterTest' --tests "org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=search/110_field_collapsing/field collapsing and search_after}" -Dtests.seed=9A807F988B26D31A -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=es-EC -Dtests.timezone=Asia/Dubai -Druntime.java=11

 [2.9.0] BUILD SUCCESSFUL in 3m 46s
 [2.9.0] 168 actionable tasks: 165 executed, 3 from cache

BUILD SUCCESSFUL in 4m 13s
186 actionable tasks: 22 executed, 164 up-to-date

@reta
Collaborator Author

reta commented Jun 8, 2023

Interesting, I am able to consistently reproduce it:

./gradlew ':qa:mixed-cluster:v2.9.0#mixedClusterTest' --tests "org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=search/110_field_collapsing/field collapsing and search_after}" -Dtests.seed=9A807F988B26D31A
                                                                                                                                                                                                                                                                                                                            
Tests with failures:
 - org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=search/110_field_collapsing/field collapsing and search_after}

I will look into the cause.

@mingshl
Contributor

mingshl commented Jun 8, 2023

The only parameter I changed is the Java version, from 19 to 11. I think that might be related to the failure.

@mingshl
Contributor

mingshl commented Jun 8, 2023

I also tried the same index, documents, and query in my local setup, and it works as expected: the search returns a 500 error. You might want to try it as well.

create mapping:

curl -XPUT localhost:9200/test --data '{
  "mappings": {
    "properties": {
      "numeric_group": {
        "type": "integer"
      },
      "tag": {
        "type": "keyword"
      }
    }
  }
}' -H "Content-Type:Application/json"

upload documents:

curl -XPUT localhost:9200/test/_doc/1 -d '{ "numeric_group": 1, "tag": "A", "sort": 10 }' -H "Content-Type:Application/json"

curl -XPUT localhost:9200/test/_doc/2 -d ' { "numeric_group": 1, "tag": "B", "sort": 6 }' -H "Content-Type:Application/json"

curl -XPUT localhost:9200/test/_doc/3 -d ' { "numeric_group": 1, "tag": "A", "sort": 24 }' -H "Content-Type:Application/json"

curl -XPUT localhost:9200/test/_doc/4 -d ' { "numeric_group": 25, "tag": "B", "sort": 10 }' -H "Content-Type:Application/json"

curl -XPUT localhost:9200/test/_doc/5 -d ' { "numeric_group": 25, "tag": "A", "sort": 5 }' -H "Content-Type:Application/json"

curl -XPUT localhost:9200/test/_doc/6 -d ' { "numeric_group": 3, "tag": "B", "sort": 36 }' -H "Content-Type:Application/json"

field collapsing and search_after

curl 'localhost:9200/test/_search?rest_total_hits_as_int=true&pretty' --data '
{
  "size": 10,
  "collapse": {
    "field": "numeric_group"
  },
  "search_after": [6],
  "sort": [
    {
      "sort": "desc"
    }
  ]
}' -H "Content-Type:Application/json"

returning:

{
  "error" : {
    "root_cause" : [
      {
        "type" : "search_exception",
        "reason" : "cannot use `collapse` in conjunction with `search_after`"
      }
    ],
    "type" : "search_phase_execution_exception",
    "reason" : "all shards failed",
    "phase" : "query",
    "grouped" : true,
    "failed_shards" : [
      {
        "shard" : 0,
        "index" : "test",
        "node" : "a5JPq8g_TT6oNKaIcg-3ng",
        "reason" : {
          "type" : "search_exception",
          "reason" : "cannot use `collapse` in conjunction with `search_after`"
        }
      }
    ]
  },
  "status" : 500
}

@reta
Collaborator Author

reta commented Jun 8, 2023

Oh yeah, a local run has no issues, but that is not what the test does: it runs against a mixed cluster. Changing -Druntime.java=XXX also makes no difference for me; the test fails consistently.

@reta
Collaborator Author

reta commented Jun 8, 2023

I was able to reproduce it; the server (2.9.0) does indeed return 200 OK:

{
    "took": 3,
    "timed_out": false,
    "_shards": {
        "total": 2,
        "successful": 1,
        "skipped": 1,
        "failed": 1,
        "failures": [
            {
                "shard": 0,
                "index": "test",
                "node": "XLL2vfQ_Q0Ocrc0-ii5oQg",
                "reason": {
                    "type": "search_exception",
                    "reason": "cannot use `collapse` in conjunction with `search_after`"
                }
            }
        ]
    },
    "hits": {
        "total": {
            "value": 0,
            "relation": "eq"
        },
        "max_score": 0.0,
        "hits": []
    }
}
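A 200 OK status alone does not show that every shard succeeded, so the shard-level failure above is only visible in the response body. As a rough sketch (the helper names `is_partial_result` and `shard_failure_reasons` are made up for illustration and are not part of OpenSearch or its test framework), a client could inspect the `_shards` section like this:

```python
import json

# Response body copied from the 200 OK reply quoted above.
RESPONSE_BODY = """
{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 2, "successful": 1, "skipped": 1, "failed": 1,
    "failures": [
      {
        "shard": 0,
        "index": "test",
        "node": "XLL2vfQ_Q0Ocrc0-ii5oQg",
        "reason": {
          "type": "search_exception",
          "reason": "cannot use `collapse` in conjunction with `search_after`"
        }
      }
    ]
  },
  "hits": {"total": {"value": 0, "relation": "eq"}, "max_score": 0.0, "hits": []}
}
"""


def is_partial_result(response):
    """Hypothetical helper: True if any shard reported a failure."""
    return response.get("_shards", {}).get("failed", 0) > 0


def shard_failure_reasons(response):
    """Hypothetical helper: collect the reason text of each shard failure."""
    failures = response.get("_shards", {}).get("failures", [])
    return [failure["reason"]["reason"] for failure in failures]


response = json.loads(RESPONSE_BODY)
print(is_partial_result(response))
print(shard_failure_reasons(response))
```

With the body above, the check flags the response as partial even though the HTTP status is 200, which is the situation the YAML test's `4xx|5xx` expectation does not account for.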

@reta
Collaborator Author

reta commented Jun 9, 2023

Here is the cause: flipping the allow_partial_search_results default caused the search request to return partial results instead of reporting an error.
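One way to check this explanation is to send the same query with `allow_partial_search_results` set explicitly as a query-string parameter, so the result no longer depends on the default. The sketch below only builds the request and does not send it. The function name `build_search_request` and the `http://localhost:9200` base URL are assumptions for illustration, and this thread does not say whether the fix in #7988 took this approach.

```python
import json
from urllib.parse import urlencode


def build_search_request(base_url, index, allow_partial):
    """Hypothetical helper: build the URL and JSON body for the collapse +
    search_after query from this issue, pinning allow_partial_search_results."""
    params = {
        "rest_total_hits_as_int": "true",
        # Set the flag explicitly instead of relying on the cluster default.
        "allow_partial_search_results": "true" if allow_partial else "false",
    }
    url = f"{base_url}/{index}/_search?{urlencode(params)}"
    body = {
        "size": 10,
        "collapse": {"field": "numeric_group"},
        "search_after": [6],
        "sort": [{"sort": "desc"}],
    }
    return url, json.dumps(body)


url, body = build_search_request("http://localhost:9200", "test", allow_partial=False)
print(url)
print(body)
```

With the flag set to false, a shard failure should make the whole request fail rather than come back as 200 OK with partial hits, which is what the test expects.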

Projects
Archived in project
4 participants