Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cleanup Bulk Delete Exception Logging #41693

Original file line number Diff line number Diff line change
Expand Up @@ -343,6 +343,7 @@ public void error(StorageException exception) {
if (e != null) {
throw new IOException("Exception when deleting blobs [" + failedBlobs + "]", e);
}
assert failedBlobs.isEmpty();
}

private static String buildKey(String keyPath, String s) {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -24,7 +24,9 @@
import com.amazonaws.services.s3.model.AmazonS3Exception;
import com.amazonaws.services.s3.model.CompleteMultipartUploadRequest;
import com.amazonaws.services.s3.model.DeleteObjectsRequest;
import com.amazonaws.services.s3.model.DeleteObjectsResult;
import com.amazonaws.services.s3.model.InitiateMultipartUploadRequest;
import com.amazonaws.services.s3.model.MultiObjectDeleteException;
import com.amazonaws.services.s3.model.ObjectListing;
import com.amazonaws.services.s3.model.ObjectMetadata;
import com.amazonaws.services.s3.model.PartETag;
Expand All @@ -34,6 +36,7 @@
import com.amazonaws.services.s3.model.UploadPartRequest;
import com.amazonaws.services.s3.model.UploadPartResult;
import org.apache.lucene.util.SetOnce;
import org.elasticsearch.ExceptionsHelper;
import org.elasticsearch.common.Nullable;
import org.elasticsearch.common.Strings;
import org.elasticsearch.common.blobstore.BlobMetaData;
Expand All @@ -50,6 +53,8 @@
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

import static org.elasticsearch.repositories.s3.S3Repository.MAX_FILE_SIZE;
import static org.elasticsearch.repositories.s3.S3Repository.MAX_FILE_SIZE_USING_MULTIPART;
Expand Down Expand Up @@ -127,12 +132,13 @@ public void deleteBlobsIgnoringIfNotExists(List<String> blobNames) throws IOExce
if (blobNames.isEmpty()) {
return;
}
final Set<String> outstanding = blobNames.stream().map(this::buildKey).collect(Collectors.toSet());
try (AmazonS3Reference clientReference = blobStore.clientReference()) {
// S3 API only allows 1k blobs per delete so we split up the given blobs into requests of max. 1k deletes
final List<DeleteObjectsRequest> deleteRequests = new ArrayList<>();
final List<String> partition = new ArrayList<>();
for (String blob : blobNames) {
partition.add(buildKey(blob));
for (String key : outstanding) {
partition.add(key);
if (partition.size() == MAX_BULK_DELETES ) {
deleteRequests.add(bulkDelete(blobStore.bucket(), partition));
partition.clear();
Expand All @@ -146,21 +152,23 @@ public void deleteBlobsIgnoringIfNotExists(List<String> blobNames) throws IOExce
for (DeleteObjectsRequest deleteRequest : deleteRequests) {
try {
clientReference.client().deleteObjects(deleteRequest);
outstanding.removeAll(partition);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're removing "partition" here which is always the last partition, not the partition that is deleted.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤦‍♂ fixed :) sorry about that. Now using the keys from the delete request here (can't use the ones from the response, since it won't contain already missing blob names).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@original-brownbear Are you sure that missing blobs won't be reported in the response?
AWS S3 docs (https://docs.aws.amazon.com/en_us/AmazonS3/latest/API/multiobjectdeleteapi.html) say:

Note that, if the object specified in the request is not found, Amazon S3 returns the result as deleted.

If they're not reported, they won't be reported in MultiObjectDeleteException exception as well, and it would mean that exception handling logic is incorrect.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See here https://github.com/elastic/elasticsearch/pull/41693/files#diff-e915a7a019920d21471e0f39c435aa16R176. We have the quiet (non-verbose) option activated, so only keys that encounter an error are included. So for exceptions we're good imo.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@original-brownbear sorry, I don't get it. If we're using quiet mode and only errors are reported it means we can not fetch the list of successfully deleted objects from MultiObjectDeleteException as well?
From MultiObjectDeleteException.getDeletedObjects JavaDoc

Returns the list of successfully deleted objects from this request. If
     * {@link DeleteObjectsRequest#getQuiet()} is true, only error responses
     * will be returned from s3.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah fair point :D Added some more logic to account for this case now :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@original-brownbear good. Now, why we mess up with outstanding and performing remove/add to it? Why not just to add errors to the failedBlobs set?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically just to cover running into some other runtime exception in the loop that is then caught in https://github.com/elastic/elasticsearch/pull/41693/files#diff-e915a7a019920d21471e0f39c435aa16R171 (just in case I guess ... not sure how likely/possible it is running into that).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@original-brownbear do you mean running into AmazonClientException? Ok, I see your point. Can you please add the comment to the method describing the design decision because of quiet mode and generic AmazonClientException?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done :)

} catch (MultiObjectDeleteException e) {
outstanding.removeAll(
e.getDeletedObjects().stream().map(DeleteObjectsResult.DeletedObject::getKey).collect(Collectors.toList()));
aex = ExceptionsHelper.useOrSuppress(aex, e);
} catch (AmazonClientException e) {
if (aex == null) {
aex = e;
} else {
aex.addSuppressed(e);
}
aex = ExceptionsHelper.useOrSuppress(aex, e);
}
}
if (aex != null) {
throw aex;
}
});
} catch (final AmazonClientException e) {
throw new IOException("Exception when deleting blobs [" + blobNames + "]", e);
} catch (Exception e) {
throw new IOException("Failed to delete blobs [" + outstanding + "]", e);
}
assert outstanding.isEmpty();
}

private static DeleteObjectsRequest bulkDelete(String bucket, List<String> blobs) {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -1000,8 +1000,8 @@ protected void finalize(final List<SnapshotFiles> snapshots,
try {
blobContainer.deleteBlobsIgnoringIfNotExists(blobNames);
} catch (IOException e) {
logger.warn(() -> new ParameterizedMessage("[{}][{}] failed to delete index blobs {} during finalization",
snapshotId, shardId, blobNames), e);
logger.warn(() -> new ParameterizedMessage("[{}][{}] failed to delete index blobs during finalization",
snapshotId, shardId), e);
throw e;
}

Expand All @@ -1016,8 +1016,8 @@ protected void finalize(final List<SnapshotFiles> snapshots,
try {
blobContainer.deleteBlobsIgnoringIfNotExists(indexBlobs);
} catch (IOException e) {
logger.warn(() -> new ParameterizedMessage("[{}][{}] failed to delete index blobs {} during finalization",
snapshotId, shardId, indexBlobs), e);
logger.warn(() -> new ParameterizedMessage("[{}][{}] failed to delete index blobs during finalization",
snapshotId, shardId), e);
throw e;
}

Expand All @@ -1029,8 +1029,8 @@ protected void finalize(final List<SnapshotFiles> snapshots,
try {
blobContainer.deleteBlobsIgnoringIfNotExists(orphanedBlobs);
} catch (IOException e) {
logger.warn(() -> new ParameterizedMessage("[{}][{}] failed to delete data blobs {} during finalization",
snapshotId, shardId, orphanedBlobs), e);
logger.warn(() -> new ParameterizedMessage("[{}][{}] failed to delete data blobs during finalization",
snapshotId, shardId), e);
}
} catch (IOException e) {
String message = "Failed to finalize " + reason + " with shard index [" + currentIndexGen + "]";
Expand Down