Snapshot Finalization Throws NPE in 7.5 #50200

Closed
original-brownbear opened this issue Dec 14, 2019 · 1 comment · Fixed by #50234
Assignees
Labels
>bug :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs

Comments

@original-brownbear
Member

We're seeing the following NPE in Cloud logs:

Caused by: java.lang.NullPointerException
	at org.elasticsearch.repositories.blobstore.ChecksumBlobStoreFormat.write(ChecksumBlobStoreFormat.java:226) ~[elasticsearch-7.5.0.jar:7.5.0]
	at org.elasticsearch.repositories.blobstore.ChecksumBlobStoreFormat.writeTo(ChecksumBlobStoreFormat.java:197) ~[elasticsearch-7.5.0.jar:7.5.0]
	at org.elasticsearch.repositories.blobstore.ChecksumBlobStoreFormat.write(ChecksumBlobStoreFormat.java:185) ~[elasticsearch-7.5.0.jar:7.5.0]
	at org.elasticsearch.repositories.blobstore.BlobStoreRepository.lambda$finalizeSnapshot$26(BlobStoreRepository.java:738) ~[elasticsearch-7.5.0.jar:7.5.0]
	at org.elasticsearch.action.ActionRunnable$1.doRun(ActionRunnable.java:45) ~[elasticsearch-7.5.0.jar:7.5.0]
	... 5 more

during snapshot finalization. This looks like a bug where the cluster metadata no longer contains an index that was just snapshotted (the deeper cause is that `clusterMetaData.index(index.getName())` in `BlobStoreRepository` returns null). I'm investigating this.
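
To illustrate the failure mode described above, here is a minimal sketch, assuming a plain map in place of the cluster metadata. Every name in it (`MissingIndexLookupSketch`, `IndexMeta`, `write`, `missingIndices`) is hypothetical and is not Elasticsearch code; it only shows how a lookup by index name can return null after a concurrent delete and then fail inside a writer that does not check for null.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of the failure mode; none of these names come from Elasticsearch. */
final class MissingIndexLookupSketch {

    /** Hypothetical stand-in for per-index metadata. */
    static final class IndexMeta {
        final String name;

        IndexMeta(String name) {
            this.name = name;
        }
    }

    /** Mimics a writer that dereferences its argument without a null check. */
    static int write(IndexMeta meta) {
        return meta.name.length(); // throws NullPointerException when meta is null
    }

    /** Returns the snapshotted index names that no longer resolve in the metadata map. */
    static List<String> missingIndices(Map<String, IndexMeta> clusterMetaData,
                                       List<String> snapshotIndices) {
        List<String> missing = new ArrayList<>();
        for (String name : snapshotIndices) {
            if (clusterMetaData.get(name) == null) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        Map<String, IndexMeta> clusterMetaData = new HashMap<>();
        clusterMetaData.put("logs-1", new IndexMeta("logs-1"));
        // Assumption for the demo: "logs-2" was snapshotted, then deleted before finalization.
        List<String> snapshotIndices = Arrays.asList("logs-1", "logs-2");
        System.out.println("missing at finalization: "
            + missingIndices(clusterMetaData, snapshotIndices));
        try {
            write(clusterMetaData.get("logs-2"));
        } catch (NullPointerException e) {
            System.out.println("writer failed with a NullPointerException");
        }
    }
}
```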

@original-brownbear original-brownbear added >bug :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs labels Dec 14, 2019
@elasticmachine
Collaborator

Pinging @elastic/es-distributed (:Distributed/Snapshot/Restore)

@original-brownbear original-brownbear self-assigned this Dec 14, 2019
original-brownbear added a commit to original-brownbear/elasticsearch that referenced this issue Dec 14, 2019
With elastic#45689 making it so that index metadata is written
after all shards have been snapshotted we can't delete indices
that are part of the upcoming snapshot finalization any longer
and it is not sufficient to check if all shards of an index have been
snapshotted before deciding that it is safe to delete it.
This change forbids deleting any index that is in the process of being
snapshotted to avoid issues during snapshot finalization.

Closes elastic#50200
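
The rule in this commit message can be sketched roughly as follows. This is not the code from the actual change; `SnapshotDeleteGuardSketch`, `RunningSnapshot`, `blockedIndices` and `ensureDeletable` are hypothetical names, and the sketch assumes a snapshot is described simply by the set of index names it covers. The point is that the check looks only at membership in a running snapshot, not at whether the index's shards have finished.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Toy delete guard; all names here are hypothetical, not Elasticsearch APIs. */
final class SnapshotDeleteGuardSketch {

    /** Hypothetical record of a snapshot that has started but not yet been finalized. */
    static final class RunningSnapshot {
        final String name;
        final Set<String> indices;

        RunningSnapshot(String name, String... indices) {
            this.name = name;
            this.indices = new HashSet<>(Arrays.asList(indices));
        }
    }

    /** Returns the requested indices that belong to any running snapshot, sorted by name. */
    static Set<String> blockedIndices(List<RunningSnapshot> running, Set<String> toDelete) {
        Set<String> blocked = new TreeSet<>();
        for (RunningSnapshot snapshot : running) {
            for (String index : snapshot.indices) {
                if (toDelete.contains(index)) {
                    blocked.add(index);
                }
            }
        }
        return blocked;
    }

    /** Rejects the delete if any index is still referenced, even if all its shards are done. */
    static void ensureDeletable(List<RunningSnapshot> running, Set<String> toDelete) {
        Set<String> blocked = blockedIndices(running, toDelete);
        if (!blocked.isEmpty()) {
            throw new IllegalStateException(
                "cannot delete indices that are being snapshotted: " + blocked);
        }
    }
}
```

Calling `ensureDeletable` with an index that appears in a running snapshot throws, while indices outside every running snapshot pass through unchanged.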
original-brownbear added a commit that referenced this issue Dec 16, 2019
With #45689 making it so that index metadata is written
after all shards have been snapshotted we can't delete indices
that are part of the upcoming snapshot finalization any longer
and it is not sufficient to check if all shards of an index have been
snapshotted before deciding that it is safe to delete it.
This change forbids deleting any index that is in the process of being
snapshotted to avoid issues during snapshot finalization.

Relates #50200 (doesn't fully fix it yet because we're not fixing the `partial=true`
snapshot case here)
original-brownbear added a commit to original-brownbear/elasticsearch that referenced this issue Dec 16, 2019
With elastic#45689 making it so that index metadata is written
after all shards have been snapshotted we can't delete indices
that are part of the upcoming snapshot finalization any longer
and it is not sufficient to check if all shards of an index have been
snapshotted before deciding that it is safe to delete it.
This change forbids deleting any index that is in the process of being
snapshotted to avoid issues during snapshot finalization.

Relates elastic#50200 (doesn't fully fix it yet because we're not fixing the `partial=true`
snapshot case here)
original-brownbear added a commit that referenced this issue Dec 16, 2019
* Fix Index Deletion during Snapshot Finalization (#50202)

With #45689 making it so that index metadata is written
after all shards have been snapshotted we can't delete indices
that are part of the upcoming snapshot finalization any longer
and it is not sufficient to check if all shards of an index have been
snapshotted before deciding that it is safe to delete it.
This change forbids deleting any index that is in the process of being
snapshotted to avoid issues during snapshot finalization.

Relates #50200 (doesn't fully fix it yet because we're not fixing the `partial=true`
snapshot case here)
original-brownbear added a commit to original-brownbear/elasticsearch that referenced this issue Dec 16, 2019
We can simply filter out shard generation updates for indices
that were removed from the cluster state concurrently to fix
index deletes during partial snapshots as that completely removes
any reference to those shards from the snapshot.

Follow up to elastic#50202
Closes elastic#50200
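
The filtering step described in this commit message could look roughly like the sketch below. It is an illustration under assumptions, not the merged change: `ShardGenerationFilterSketch` and `retainExisting` are hypothetical names, and shard generation updates are modelled as a map from index name to a map of shard id to generation string.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Toy filter for shard generation updates; names and data shapes are assumptions. */
final class ShardGenerationFilterSketch {

    /**
     * Keeps only the updates whose index still exists in the cluster state.
     * Updates for concurrently deleted indices are dropped entirely, so the
     * snapshot keeps no reference to their shards.
     */
    static Map<String, Map<Integer, String>> retainExisting(
            Map<String, Map<Integer, String>> updates,
            Set<String> indicesInClusterState) {
        Map<String, Map<Integer, String>> kept = new LinkedHashMap<>();
        for (Map.Entry<String, Map<Integer, String>> entry : updates.entrySet()) {
            if (indicesInClusterState.contains(entry.getKey())) {
                kept.put(entry.getKey(), entry.getValue());
            }
        }
        return kept;
    }
}
```

The input map is left untouched and a new map is returned, which keeps the sketch free of side effects on the caller's data.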
original-brownbear added a commit that referenced this issue Dec 17, 2019
* Fix Index Deletion During Partial Snapshot Create

We can simply filter out shard generation updates for indices
that were removed from the cluster state concurrently to fix
index deletes during partial snapshots as that completely removes
any reference to those shards from the snapshot.

Follow up to #50202
Closes #50200
original-brownbear added a commit to original-brownbear/elasticsearch that referenced this issue Dec 17, 2019
* Fix Index Deletion During Partial Snapshot Create

We can simply filter out shard generation updates for indices
that were removed from the cluster state concurrently to fix
index deletes during partial snapshots as that completely removes
any reference to those shards from the snapshot.

Follow up to elastic#50202
Closes elastic#50200
original-brownbear added a commit that referenced this issue Dec 17, 2019
We can simply filter out shard generation updates for indices
that were removed from the cluster state concurrently to fix
index deletes during partial snapshots as that completely removes
any reference to those shards from the snapshot.

Follow up to #50202
Closes #50200
SivagurunathanV pushed a commit to SivagurunathanV/elasticsearch that referenced this issue Jan 23, 2020
With elastic#45689 making it so that index metadata is written
after all shards have been snapshotted we can't delete indices
that are part of the upcoming snapshot finalization any longer
and it is not sufficient to check if all shards of an index have been
snapshotted before deciding that it is safe to delete it.
This change forbids deleting any index that is in the process of being
snapshotted to avoid issues during snapshot finalization.

Relates elastic#50200 (doesn't fully fix it yet because we're not fixing the `partial=true`
snapshot case here)
SivagurunathanV pushed a commit to SivagurunathanV/elasticsearch that referenced this issue Jan 23, 2020
* Fix Index Deletion During Partial Snapshot Create

We can simply filter out shard generation updates for indices
that were removed from the cluster state concurrently to fix
index deletes during partial snapshots as that completely removes
any reference to those shards from the snapshot.

Follow up to elastic#50202
Closes elastic#50200