
Failure in CloseWhileRelocatingShardsIT #44855

Closed

DaveCTurner opened this issue Jul 25, 2019 · 3 comments
Labels
:Distributed Indexing/Recovery Anything around constructing a new shard, either from a local or a remote source. >test-failure Triaged test failures from CI

Comments

@DaveCTurner
Contributor

I am seeing occasional failures of CloseWhileRelocatingShardsIT of the following form:

Suite: Test class org.elasticsearch.indices.state.CloseWhileRelocatingShardsIT
  2> liep. 25, 2019 9:37:59 PRIEŠPIET com.carrotsearch.randomizedtesting.RandomizedRunner$QueueUncaughtExceptionsHandler uncaughtException
  2> WARNING: Uncaught exception in thread: Thread[elasticsearch[node_sd4][generic][T#5],5,TGRP-CloseWhileRelocatingShardsIT]
  2> java.lang.AssertionError: max seq. no. [-1] does not match [207]
  2>    at __randomizedtesting.SeedInfo.seed([AE1B2D049A25A3A3]:0)
  2>    at org.elasticsearch.index.engine.ReadOnlyEngine.assertMaxSeqNoEqualsToGlobalCheckpoint(ReadOnlyEngine.java:153)
  2>    at org.elasticsearch.index.engine.ReadOnlyEngine.ensureMaxSeqNoEqualsToGlobalCheckpoint(ReadOnlyEngine.java:144)
  2>    at org.elasticsearch.index.engine.ReadOnlyEngine.<init>(ReadOnlyEngine.java:113)
  2>    at org.elasticsearch.index.engine.NoOpEngine.<init>(NoOpEngine.java:54)
  2>    at org.elasticsearch.index.shard.IndexShard.innerOpenEngineAndTranslog(IndexShard.java:1595)
  2>    at org.elasticsearch.index.shard.IndexShard.recoverLocallyUpToGlobalCheckpoint(IndexShard.java:1411)
  2>    at org.elasticsearch.indices.recovery.PeerRecoveryTargetService.doRecovery(PeerRecoveryTargetService.java:176)
  2>    at org.elasticsearch.indices.recovery.PeerRecoveryTargetService$RecoveryRunner.doRun(PeerRecoveryTargetService.java:552)
  2>    at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:769)
  2>    at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
  2>    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
  2>    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
  2>    at java.base/java.lang.Thread.run(Thread.java:835)

  2> REPRODUCE WITH: ./gradlew :server:integTest --tests "org.elasticsearch.indices.state.CloseWhileRelocatingShardsIT.testCloseWhileRelocatingShards" -Dtests.seed=AE1B2D049A25A3A3 -Dtests.security.manager=true -Dtests.jvms=4 -Dtests.locale=lt-LT -Dtests.timezone=America/Buenos_Aires -Dcompiler.java=12 -Druntime.java=12

This doesn't reproduce every time at 6275cd7, but it normally fails after only a few iterations. Backing up to 69c94f4, before the merge of #43463, the test passes reliably (tens of successful iterations and counting).

Relates #41536, as this failure occurs only on the peer-recovery-retention-leases feature branch(es).
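
For context, the assertion that trips is ReadOnlyEngine#assertMaxSeqNoEqualsToGlobalCheckpoint (see the stack trace above): when the engine is opened, the maximum sequence number in the shard's last commit must equal the global checkpoint. In this failure the relocating copy has no operations in its commit (max seq. no. -1) while the recovered global checkpoint is 207. A minimal, self-contained sketch of that invariant (hypothetical and simplified, not the actual Elasticsearch source; the class and method names below are illustrative):

final class ReadOnlyEngineInvariant {

    // Enforce the invariant from the failure above: a read-only engine may only be
    // opened when the commit's max sequence number has caught up to the global checkpoint.
    static void ensureMaxSeqNoEqualsGlobalCheckpoint(long maxSeqNo, long globalCheckpoint) {
        if (maxSeqNo != globalCheckpoint) {
            throw new AssertionError("max seq. no. [" + maxSeqNo + "] does not match [" + globalCheckpoint + "]");
        }
    }

    public static void main(String[] args) {
        // Mirrors the reported failure: an empty shard copy (maxSeqNo = -1) vs. a global checkpoint of 207.
        ensureMaxSeqNoEqualsGlobalCheckpoint(-1L, 207L);
    }
}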

@DaveCTurner DaveCTurner added >test-failure Triaged test failures from CI :Distributed Indexing/Recovery Anything around constructing a new shard, either from a local or a remote source. labels Jul 25, 2019
@elasticmachine
Collaborator

Pinging @elastic/es-distributed

DaveCTurner added a commit that referenced this issue Jul 25, 2019
@DaveCTurner
Contributor Author

Muted in 96dd543.

dnhatn added a commit that referenced this issue Jul 30, 2019
For closed and frozen indices, we should not recover the shard locally up to
the global checkpoint before performing peer recovery, since that copy might
have been offline when the index was closed/frozen.

Relates #43463
Closes #44855
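
A hypothetical, simplified sketch of the idea described in that commit message (identifiers are illustrative and not taken from the actual change in #44887): for a closed or frozen index, skip the "recover locally up to the global checkpoint" step and let peer recovery start from scratch.

final class LocalRecoveryDecision {

    static final long UNASSIGNED_SEQ_NO = -2L; // sentinel meaning "no local starting point"

    // Decide where peer recovery may resume from after the optional local recovery step.
    static long startingSeqNo(boolean indexClosedOrFrozen, long localCheckpointOfSafeCommit) {
        if (indexClosedOrFrozen) {
            // The copy might have been offline when the index was closed/frozen,
            // so do not trust its local history; fall back to a full peer recovery.
            return UNASSIGNED_SEQ_NO;
        }
        return localCheckpointOfSafeCommit + 1; // resume after the last safe commit
    }

    public static void main(String[] args) {
        System.out.println(startingSeqNo(true, 207L));  // -2: closed/frozen index, no local recovery
        System.out.println(startingSeqNo(false, 207L)); // 208: open index resumes locally
    }
}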
@dnhatn
Member

dnhatn commented Jul 30, 2019

Fixed in #44887

@dnhatn dnhatn closed this as completed Jul 30, 2019