[CI] TransformSurvivesUpgradeIT failure #68646

Closed
davidkyle opened this issue Feb 8, 2021 · 3 comments
Labels: :ml/Transform (Transform), Team:ML (Meta label for the ML team), >test-failure (Triaged test failures from CI)

Comments

@davidkyle
Member

Build scan:
https://gradle-enterprise.elastic.co/s/evsustaeeg5gs

Repro line:

 ./gradlew ':x-pack:qa:rolling-upgrade:v7.5.2#bwcTest' \
  -Dtests.class="org.elasticsearch.upgrades.TransformSurvivesUpgradeIT" \
  -Dtests.method="testTransformRollingUpgrade" \
  -Dtests.seed=6EE3BC23BF91038B \
  -Dtests.security.manager=true \
  -Dtests.bwc=true \
  -Dtests.locale=da \
  -Dtests.timezone=America/Matamoros \
  -Druntime.java=8

Reproduces locally?:
No

Applicable branches:
7.x

Failure history:
This is the only instance of this particular failure I could find.

Failure excerpt:

java.lang.AssertionError: expected:<1> but was:<2>
	at __randomizedtesting.SeedInfo.seed([6EE3BC23BF91038B:C19E899A14D7D04F]:0)
	at org.junit.Assert.fail(Assert.java:88)
	at org.junit.Assert.failNotEquals(Assert.java:834)
	at org.junit.Assert.assertEquals(Assert.java:118)
	at org.junit.Assert.assertEquals(Assert.java:144)
	at org.elasticsearch.upgrades.TransformSurvivesUpgradeIT.lambda$awaitWrittenIndexerState$9(TransformSurvivesUpgradeIT.java:293)
	at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:1005)
	at org.elasticsearch.upgrades.TransformSurvivesUpgradeIT.awaitWrittenIndexerState(TransformSurvivesUpgradeIT.java:286)
	at org.elasticsearch.upgrades.TransformSurvivesUpgradeIT.verifyContinuousTransformHandlesData(TransformSurvivesUpgradeIT.java:251)
	at org.elasticsearch.upgrades.TransformSurvivesUpgradeIT.testTransformRollingUpgrade(TransformSurvivesUpgradeIT.java:157)
@davidkyle davidkyle added >test-failure Triaged test failures from CI :ml/Transform Transform labels Feb 8, 2021
@elasticmachine elasticmachine added the Team:ML Meta label for the ML team label Feb 8, 2021
@elasticmachine
Collaborator

Pinging @elastic/ml-core (Team:ML)

@hendrikmuhs

I checked the logs and could not find anything fishy. This test basically tests the migration to a new transform version, which means the state document should be deleted from the old index. In the logs we have this line (after the upgrade):

[2021-02-07T09:58:46,545][TRACE][o.e.x.t.t.ClientTransformIndexer] [v7.5.2-0] [continuous-transform-upgrade-job] deleted old transform stats and state document

That means cleaning should have been executed.

The test uses an assertBusy with a 60s timeout, which seems like plenty. However, the logs contain tons of errors, so maybe it was a bad build?

Unfortunately, the logs and test feedback are insufficient. It would be good to:

  • improve the assert with a proper message containing the search response
  • log the number of deleted documents

As an action, I would improve the above and investigate further if it fails again. I checked the build stats, and this test rarely fails with this error.
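
For illustration, the first suggestion (an assert message that carries the search response) could look roughly like the sketch below. This is not the code in TransformSurvivesUpgradeIT: `FakeSearchResponse`, `assertSingleStateDoc`, `awaitSingleStateDoc` and the document ids are hypothetical stand-ins, and the retry loop is only a simplified imitation of assertBusy.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

public class StateDocAssertSketch {

    /** Hypothetical stand-in for a search response; it only carries the ids of the hits. */
    static final class FakeSearchResponse {
        final List<String> hitIds;

        FakeSearchResponse(List<String> hitIds) {
            this.hitIds = hitIds;
        }

        int totalHits() {
            return hitIds.size();
        }

        @Override
        public String toString() {
            return "FakeSearchResponse{hits=" + hitIds + "}";
        }
    }

    /** Fails with a message that includes the whole response, not just "expected:<1> but was:<2>". */
    static void assertSingleStateDoc(FakeSearchResponse response) {
        if (response.totalHits() != 1) {
            throw new AssertionError("expected exactly 1 state document but found "
                + response.totalHits() + "; search response: " + response);
        }
    }

    /** Simplified retry loop in the spirit of assertBusy (an assumption, not the ESTestCase implementation). */
    static void awaitSingleStateDoc(Supplier<FakeSearchResponse> search, long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (true) {
            try {
                assertSingleStateDoc(search.get());
                return;
            } catch (AssertionError e) {
                if (System.currentTimeMillis() >= deadline) {
                    throw e; // surface the last, descriptive failure
                }
            }
            try {
                Thread.sleep(10);
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
                throw new AssertionError("interrupted while waiting", ie);
            }
        }
    }

    public static void main(String[] args) {
        // Simulates the situation in this issue: the old state document is still there.
        Supplier<FakeSearchResponse> staleSearch =
            () -> new FakeSearchResponse(Arrays.asList("old-index/state-doc", "new-index/state-doc"));
        try {
            awaitSingleStateDoc(staleSearch, 100);
        } catch (AssertionError e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With a message like this, a failed run would show which documents were still present, so it would be possible to tell whether the leftover hit came from the old index.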

@hendrikmuhs hendrikmuhs self-assigned this Mar 3, 2021
hendrikmuhs pushed a commit to hendrikmuhs/elasticsearch that referenced this issue Mar 3, 2021
hendrikmuhs pushed a commit that referenced this issue Mar 4, 2021
report number of deleted state docs in order to help tracing issues

relates #68646
hendrikmuhs pushed a commit that referenced this issue Mar 4, 2021
report number of deleted state docs in order to help tracing issues

relates #68646
@hendrikmuhs

I checked the build stats and could not find this failure again. All other failures that reported TransformSurvivesUpgradeIT were due to cluster instability.

Closing for now. If it happens again, the added logging will hopefully help.
