Ensure metadata folder is not resurrected when loading latest state file #19338

ywelsch · 2016-07-08T15:10:38Z

If MetaDataStateFormat.loadLatestState() is called while the folder on which it operates is being deleted, the method can resurrect the folder. This possibly affects the folders for shard state metadata, index metadata, and global metadata.

Test failure showing the issue:
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+multijob-os-compatibility/os=sles/729/console

Log snippets showing issue:

java.lang.AssertionError
  2>    at __randomizedtesting.SeedInfo.seed([63923DF9F1B5687D]:0)
  2>    at org.elasticsearch.env.NodeEnvironment.deleteShardDirectoryUnderLock(NodeEnvironment.java:491)
  2>    at org.elasticsearch.indices.IndicesService.deleteShardStore(IndicesService.java:664)
  2>    at org.elasticsearch.index.IndexService.onShardClose(IndexService.java:418)
  2>    at org.elasticsearch.index.IndexService.access$100(IndexService.java:97)
  2>    at org.elasticsearch.index.IndexService$StoreCloseListener.handle(IndexService.java:496)
  2>    at org.elasticsearch.index.IndexService$StoreCloseListener.handle(IndexService.java:481)
  2>    at org.elasticsearch.index.store.Store.closeInternal(Store.java:391)
  2>    at org.elasticsearch.index.store.Store.access$000(Store.java:119)
  2>    at org.elasticsearch.index.store.Store$1.closeInternal(Store.java:140)
  2>    at org.elasticsearch.common.util.concurrent.AbstractRefCounted.decRef(AbstractRefCounted.java:64)
  2>    at org.elasticsearch.index.store.Store.decRef(Store.java:373)
  2>    at org.elasticsearch.index.store.Store.close(Store.java:381)
  2>    at org.elasticsearch.index.IndexService.closeShard(IndexService.java:403)
  2>    at org.elasticsearch.index.IndexService.removeShard(IndexService.java:375)
  2>    at org.elasticsearch.index.IndexService.close(IndexService.java:236)
  2>    at org.elasticsearch.indices.IndicesService.removeIndex(IndicesService.java:504)
  2>    at org.elasticsearch.indices.IndicesService.deleteIndex(IndicesService.java:566)
  2>    at org.elasticsearch.indices.cluster.IndicesClusterStateService.deleteIndices(IndicesClusterStateService.java:244)
  2>    at org.elasticsearch.indices.cluster.IndicesClusterStateService.clusterChanged(IndicesClusterStateService.java:178)
  2>    at org.elasticsearch.cluster.service.ClusterService.runTasksForExecutor(ClusterService.java:691)
  2>    at org.elasticsearch.cluster.service.ClusterService$UpdateTask.run(ClusterService.java:855)
  2>    at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:450)
  2>    at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:237)
  2>    at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:200)
  2>    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
  2>    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
  2>    at java.lang.Thread.run(Thread.java:745)
  2> REPRODUCE WITH: gradle :core:integTest -Dtests.seed=63923DF9F1B5687D -Dtests.class=org.elasticsearch.indices.state.RareClusterStateIT -Dtests.method="testUnassignedShardAndEmptyNodesInRoutingTable" -Dtests.security.manager=true -Dtests.locale=mt-MT -Dtests.timezone=Asia/Brunei

and

 1> [2016-07-08 07:33:46,159][DEBUG][org.elasticsearch.gateway] [node_t0] /var/lib/jenkins/workspace/elastic+elasticsearch+master+multijob-os-compatibility/os/sles/core/build/testrun/integTest/J0/temp/org.elasticsearch.indices.state.RareClusterStateIT_63923DF9F1B5687D-001/tempDir-001/d0/nodes/0/indices/CZHA3myYTsCy1yArn911ow/0/_state/state-0.st: failed to read [state-], ignoring...
  1> java.nio.file.NoSuchFileException: /var/lib/jenkins/workspace/elastic+elasticsearch+master+multijob-os-compatibility/os/sles/core/build/testrun/integTest/J0/temp/org.elasticsearch.indices.state.RareClusterStateIT_63923DF9F1B5687D-001/tempDir-001/d0/nodes/0/indices/CZHA3myYTsCy1yArn911ow/0/_state/state-0.st
  1>    at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
  1>    at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
  1>    at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
  1>    at sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214)
  1>    at org.apache.lucene.mockfile.FilterFileSystemProvider.newByteChannel(FilterFileSystemProvider.java:212)
  1>    at org.apache.lucene.mockfile.FilterFileSystemProvider.newByteChannel(FilterFileSystemProvider.java:212)
  1>    at org.apache.lucene.mockfile.FilterFileSystemProvider.newByteChannel(FilterFileSystemProvider.java:212)
  1>    at org.apache.lucene.mockfile.HandleTrackingFS.newByteChannel(HandleTrackingFS.java:240)
  1>    at org.apache.lucene.mockfile.FilterFileSystemProvider.newByteChannel(FilterFileSystemProvider.java:212)
  1>    at org.apache.lucene.mockfile.HandleTrackingFS.newByteChannel(HandleTrackingFS.java:240)
  1>    at java.nio.file.Files.newByteChannel(Files.java:361)
  1>    at java.nio.file.Files.newByteChannel(Files.java:407)
  1>    at org.apache.lucene.store.SimpleFSDirectory.openInput(SimpleFSDirectory.java:77)
  1>    at org.elasticsearch.gateway.MetaDataStateFormat.read(MetaDataStateFormat.java:184)
  1>    at org.elasticsearch.gateway.MetaDataStateFormat.loadLatestState(MetaDataStateFormat.java:319)
  1>    at org.elasticsearch.index.shard.ShardPath.loadShardPath(ShardPath.java:115)
  1>    at org.elasticsearch.gateway.TransportNodesListGatewayStartedShards.nodeOperation(TransportNodesListGatewayStartedShards.java:133)
  1>    at org.elasticsearch.gateway.TransportNodesListGatewayStartedShards.nodeOperation(TransportNodesListGatewayStartedShards.java:57)
  1>    at org.elasticsearch.action.support.nodes.TransportNodesAction.nodeOperation(TransportNodesAction.java:143)
  1>    at org.elasticsearch.action.support.nodes.TransportNodesAction$NodeTransportHandler.messageReceived(TransportNodesAction.java:268)
  1>    at org.elasticsearch.action.support.nodes.TransportNodesAction$NodeTransportHandler.messageReceived(TransportNodesAction.java:264)
  1>    at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:69)
  1>    at org.elasticsearch.transport.TransportService$5.doRun(TransportService.java:517)
  1>    at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:510)
  1>    at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
  1>    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
  1>    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
  1>    at java.lang.Thread.run(Thread.java:745)

After a successful shard deletion, the failing assertion checks that the shard folder is gone. It fails because the folder reappears: a concurrent operation on the node (triggered by async fetching) is still loading shard state metadata.
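The interleaving can be replayed deterministically outside Elasticsearch. The following is only a minimal sketch, not the actual `MetaDataStateFormat` code: `ResurrectionDemo`, `openDirectoryWithMkdir` and `replayRace` are hypothetical names, and the sketch assumes that opening the directory wrapper creates the folder when it is missing.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class ResurrectionDemo {

    // Hypothetical stand-in for a directory wrapper whose constructor
    // creates the folder when it does not exist (assumed behavior).
    static Path openDirectoryWithMkdir(Path dir) throws IOException {
        if (!Files.isDirectory(dir)) {
            Files.createDirectories(dir);
        }
        return dir;
    }

    // Replays the interleaving: reader checks, deleter deletes, reader opens.
    // Returns true if the deleted folder exists again afterwards.
    static boolean replayRace(Path stateDir) throws IOException {
        boolean seenByReader = Files.isDirectory(stateDir); // reader: existence check
        Files.delete(stateDir);                              // deleter: folder removed
        if (seenByReader) {
            openDirectoryWithMkdir(stateDir);                // reader: open re-creates it
        }
        return Files.exists(stateDir);
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("shard");
        Path stateDir = Files.createDirectory(root.resolve("_state"));
        System.out.println("resurrected: " + replayRace(stateDir)); // prints "resurrected: true"
        Files.deleteIfExists(stateDir);
        Files.delete(root);
    }
}
```

In the real failure the two steps run on different threads, so the window between the existence check and the open is small, which would explain why the test fails only occasionally.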

ywelsch · 2016-07-08T15:13:56Z

core/src/main/java/org/apache/lucene/store/SimpleReadOnlyFSDirectory.java

+ *
+ * Only supports the {@link Directory#openInput(String,IOContext)} method.
+ */
+public class SimpleReadOnlyFSDirectory extends BaseDirectory {


@s1monw I found this to be the simplest solution, especially as MetaDataStateFormat exposes the Directory for the tests (to wrap it in MockDirectoryWrapper).

mikemccand · 2016-07-11T09:52:20Z

It really is quite scary to have to fork Lucene's entire SimpleFSDirectory here! @s1monw suggested allowing a subclass to override the mkdir behavior on init in Lucene ... I'll explore this.

mikemccand · 2016-07-11T13:25:25Z

> @s1monw suggested allowing a subclass to override the mkdir behavior on init in Lucene

I opened https://issues.apache.org/jira/browse/LUCENE-7375 for this

dakrone · 2016-09-12T22:09:36Z

@ywelsch is this stalled pending the linked Lucene change, or is it waiting on other review?

ywelsch · 2016-12-08T09:07:15Z

No progress on the Lucene end. Should we proceed with the current solution here?

bleskes · 2016-12-08T09:16:07Z

Maybe another route here is to use SimpleFSIndexInput directly, rather than going through a directory? We lose some testing assertions, but maybe it's worth it for this simple case (read only, closing files in the same methods, etc.)?
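As a rough illustration of the idea of reading the file without a directory wrapper, the sketch below opens the state file through `Files.newByteChannel`, the call that also appears at the bottom of the stack trace above. It does not use `SimpleFSIndexInput`; `DirectStateRead` and `readOrNull` are hypothetical names. Opening a channel on a missing file throws `NoSuchFileException` and never creates the parent folder.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SeekableByteChannel;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

public class DirectStateRead {

    // Hypothetical helper: returns the file's bytes, or null if the file
    // (or its parent folder) no longer exists.
    static byte[] readOrNull(Path file) throws IOException {
        try (SeekableByteChannel ch = Files.newByteChannel(file, StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocate((int) ch.size());
            while (buf.hasRemaining() && ch.read(buf) >= 0) {
                // keep reading until the buffer is full or end of file is reached
            }
            return Arrays.copyOf(buf.array(), buf.position());
        } catch (NoSuchFileException e) {
            return null; // deleted concurrently; nothing was created on disk
        }
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("shard");
        Path stateDir = root.resolve("_state");            // never created
        byte[] bytes = readOrNull(stateDir.resolve("state-0.st"));
        System.out.println("read result is null: " + (bytes == null));
        System.out.println("folder exists: " + Files.exists(stateDir));
        Files.delete(root);
    }
}
```

The trade-off is the one mentioned above: bypassing the directory abstraction means the tests can no longer wrap it to check for unclosed files and similar mistakes.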

elasticmachine · 2017-02-23T18:14:41Z

Since this is a community submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually?

imotov · 2017-03-24T18:36:52Z

Just want to mention that this test failed 5 times so far in March. The latest failure is here https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+5.x+multijob-unix-compatibility/os=ubuntu/619/console

ywelsch · 2017-04-04T14:14:46Z

> Maybe another route here is to use SimpleFSIndexInput directly, rather than going through a directory?

@bleskes The issue is that SimpleFSIndexInput is only package-visible (although its constructor is public), see also #19338 (comment)

ywelsch · 2017-05-23T09:02:22Z

I think the easiest solution here would be to have Lucene expose SimpleFSIndexInput as public. We have been having test failures for more than a year now and it seems impossible to get this PR done, so I will just close it.

bleskes · 2018-01-27T18:00:11Z

@s1monw can you please take another look at this and see whether there's a good way to proceed?

If `ShardStateMetaData.FORMAT.loadLatestState` is called while a shard is closing, the shard metadata directory may be deleted after its existence has been checked but before the Lucene `Directory` has been created. When the `Directory` is created, the just-deleted directory is brought back into existence. There are three places where `loadLatestState` is called in a manner that leaves it open to this race. This change ensures that these calls occur either under a `ShardLock` or else while holding a reference to the existing `Store`. In either case, this protects the shard metadata directory from concurrent deletion. Cf elastic#19338, elastic#21463, elastic#25335 and https://issues.apache.org/jira/browse/LUCENE-7375
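The locking approach described above can be sketched as follows. This is an illustration under stated assumptions, not the code of the actual change: `GuardedStateAccess` is a hypothetical class, the `ReentrantLock` merely stands in for a per-shard lock and is not the real `ShardLock` API, and `Files.createDirectories` models the assumed mkdir-on-open behavior of the Lucene `Directory`.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.locks.ReentrantLock;

public class GuardedStateAccess {

    // Hypothetical stand-in for a per-shard lock; not the real ShardLock API.
    private final ReentrantLock shardLock = new ReentrantLock();
    private final Path stateDir;

    GuardedStateAccess(Path stateDir) {
        this.stateDir = stateDir;
    }

    // The existence check and the open happen under the lock,
    // so a deletion cannot slip in between them.
    boolean loadIfPresent() throws IOException {
        shardLock.lock();
        try {
            if (!Files.isDirectory(stateDir)) {
                return false; // already deleted: do not touch the path
            }
            Files.createDirectories(stateDir); // models mkdir-on-open; a no-op here
            return true;
        } finally {
            shardLock.unlock();
        }
    }

    void delete() throws IOException {
        shardLock.lock();
        try {
            Files.deleteIfExists(stateDir);
        } finally {
            shardLock.unlock();
        }
    }

    public static void main(String[] args) throws Exception {
        Path root = Files.createTempDirectory("shard");
        Path stateDir = Files.createDirectory(root.resolve("_state"));
        GuardedStateAccess access = new GuardedStateAccess(stateDir);
        Thread loader = new Thread(() -> {
            try { access.loadIfPresent(); } catch (IOException e) { throw new UncheckedIOException(e); }
        });
        Thread deleter = new Thread(() -> {
            try { access.delete(); } catch (IOException e) { throw new UncheckedIOException(e); }
        });
        loader.start();
        deleter.start();
        loader.join();
        deleter.join();
        // Whichever thread wins the lock first, the folder stays deleted.
        System.out.println("folder exists after delete: " + Files.exists(stateDir));
        Files.delete(root);
    }
}
```

Either the load runs entirely before the deletion, in which case the folder exists throughout and is then removed, or entirely after it, in which case the load returns without touching the path.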

Ensure metadata folder is not resurrected when loading latest state file

c926ef8

ywelsch added >enhancement :Internal labels Jul 8, 2016

ywelsch reviewed Jul 8, 2016

Add copy of SimpleFSIndexInput

08428b4

ywelsch mentioned this pull request Apr 10, 2017

Tests: RareClusterStateIT.testUnassignedShardAndEmptyNodesInRoutingTable failed #21463

Closed

ywelsch closed this May 23, 2017

ywelsch mentioned this pull request Jan 26, 2018

[CI] UpdateNumberOfReplicasIT.testAutoExpandNumberOfReplicas0ToData fails #25335

Closed

DaveCTurner mentioned this pull request Mar 19, 2018

Avoid loading shard metadata while closing #29140

Closed

DaveCTurner mentioned this pull request Sep 27, 2018

[CI] IndexRecoveryIT.testRerouteRecovery : Paths exist that should have been deleted #32686

Closed

asfimport mentioned this pull request Jul 11, 2016

Can we allow FSDirectory subclasses to customize whether the ctor does a mkdir? [LUCENE-7375] apache/lucene#8428

Open
