Docs: batch limit of azure.storage.blob.ContainerClient.delete_blobs() is poorly documented #22821

Closed
Gerrit-K opened this issue Jan 28, 2022 · 5 comments · Fixed by #22859
Labels
bug: This issue requires a change to an existing behavior in the product in order to be resolved.
Client: This issue points to a problem in the data-plane of the library.
customer-reported: Issues that are reported by GitHub users external to the Azure organization.
Docs
needs-team-attention: Workflow: This issue needs attention from Azure service team or SDK team
Storage: Storage Service (Queues, Blobs, Files)

Gerrit-K commented Jan 28, 2022

I wanted to move an entire "folder" within a storage account, and for unrelated reasons I had to use the Python SDK for that. I did this in basically three steps:

  1. List all blobs that match this "folder" in the pseudo-hierarchy
  2. Copy those blobs individually to the new destination
  3. Delete the source blobs. I was happy to see that there was already a batched method for this, azure.storage.blob.ContainerClient.delete_blobs(), so I didn't have to delete each blob one by one.
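
Steps 1 and 2 above can be sketched roughly as follows. This is a minimal sketch, not the code used in this issue: `copy_prefix` is a hypothetical helper name, `container` is assumed to be an `azure.storage.blob.ContainerClient` (or any object exposing the same `list_blobs` and `get_blob_client` methods), and the credentials in use are assumed to allow copying from the source blob URL within the same account. It only starts the copies. Deletion (step 3) is left out on purpose, since the sources should be removed only once the copies are confirmed complete.

```python
def copy_prefix(container, src_prefix, dst_prefix):
    """Start a copy of every blob under src_prefix to dst_prefix.

    Returns the list of source blob names, which the caller can delete
    later once the copies are known to have finished.
    """
    # Materialize the listing first so that newly created destination
    # blobs cannot show up in the listing while we are still iterating.
    sources = [blob.name for blob in container.list_blobs(name_starts_with=src_prefix)]

    for name in sources:
        # Swap the leading "folder" prefix for the destination prefix.
        dst_name = dst_prefix + name[len(src_prefix):]
        src_url = container.get_blob_client(name).url
        # start_copy_from_url only *starts* the copy; it may finish later.
        container.get_blob_client(dst_name).start_copy_from_url(src_url)

    return sources
```

The returned names are what would eventually be passed to `delete_blobs()`.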

Or so I thought ... the batch failed with a PartialBatchErrorException, and when I analyzed the parts, I noticed that a request had failed with the error code ExceedsMaxBatchRequestCount. The thing is: this "max batch request count" was nowhere to be found, neither in the code documentation nor anywhere on the Azure limits documentation page. The only thing I found was this test case from the .NET SDK:

https://github.com/Azure/azure-sdk-for-net/blob/402b7b71c310bbe0cb1c49862ba33c19a026f97d/sdk/storage/Azure.Storage.Blobs.Batch/tests/BlobBatchClientTests.cs#L64-L70

The 257 there seemed a bit suspicious, so I experimented a bit with chunking, and indeed 256 seems to be the maximum number of blobs that can be passed to delete_blobs(). However, 256 as a number is nowhere to be found in the storage section of the Python SDK. Why did it have to be so difficult to find anything about this limit?
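
The chunking workaround described above might look like the following. This is a sketch under assumptions: `chunked` and `delete_in_chunks` are hypothetical helper names, `MAX_BATCH_SIZE` is simply the 256 observed here, and `container` is assumed to be a `ContainerClient` whose `delete_blobs()` receives the blob names as positional arguments.

```python
from itertools import islice

MAX_BATCH_SIZE = 256  # limit observed in this issue


def chunked(items, size=MAX_BATCH_SIZE):
    """Yield successive lists of at most `size` items from any iterable."""
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def delete_in_chunks(container, blob_names, size=MAX_BATCH_SIZE):
    """Call container.delete_blobs() once per chunk of at most `size` names."""
    for chunk in chunked(blob_names, size):
        container.delete_blobs(*chunk)
```

Each chunk is still a separate batch request, so a failure in one chunk does not undo deletions made by earlier chunks.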

Once you decide if and where this should be documented, I'd of course offer my help in contributing this documentation.

@ghost ghost added needs-triage Workflow: This is a new issue that needs to be triaged to the appropriate team. customer-reported Issues that are reported by GitHub users external to the Azure organization. question The issue doesn't require a change to the product in order to be resolved. Most issues start as that labels Jan 28, 2022
@tjprescott tjprescott added the Storage Storage Service (Queues, Blobs, Files) label Jan 28, 2022
@ghost ghost removed the needs-triage Workflow: This is a new issue that needs to be triaged to the appropriate team. label Jan 28, 2022
@tjprescott tjprescott added Client This issue points to a problem in the data-plane of the library. Docs labels Jan 28, 2022
@ghost ghost added the needs-team-attention Workflow: This issue needs attention from Azure service team or SDK team label Jan 28, 2022
@tjprescott
Member

@jalauzon-msft can you take a look?

@jalauzon-msft
Member

jalauzon-msft commented Jan 28, 2022

Hi @Gerrit-K, thanks for bringing this up and your investigation!

You are correct. The batch size for delete_blobs(), as well as the other batch APIs we support, is limited to 256 by the service.
https://docs.microsoft.com/en-us/rest/api/storageservices/blob-batch#request-body

I will create a PR to add this limit to the code documentation for all of our batch APIs. Once updated and released, the documentation will also be pushed to our online docs.

@jalauzon-msft jalauzon-msft added bug This issue requires a change to an existing behavior in the product in order to be resolved. and removed question The issue doesn't require a change to the product in order to be resolved. Most issues start as that labels Jan 28, 2022
@jalauzon-msft
Member

In the future, we may want to consider having these APIs loop through batches when the user specifies more than 256 items, or at the very least raise a helpful exception in that case.
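
As an illustration of the "helpful exception" idea, a caller-side guard could look something like this. It is only a sketch and not part of the SDK: `checked_delete_blobs` and `MAX_BATCH_SIZE` are hypothetical names, and `container` is assumed to be a `ContainerClient` or any object with a compatible `delete_blobs()` method.

```python
MAX_BATCH_SIZE = 256  # per-batch limit discussed in this issue


def checked_delete_blobs(container, *blobs, **kwargs):
    """Forward to container.delete_blobs(), failing early on oversized batches."""
    if len(blobs) > MAX_BATCH_SIZE:
        # Fail before any request is sent, with a message that names the limit.
        raise ValueError(
            "delete_blobs() accepts at most %d blobs per call, got %d; "
            "split the input into smaller batches."
            % (MAX_BATCH_SIZE, len(blobs))
        )
    return container.delete_blobs(*blobs, **kwargs)
```

Raising before the request is sent means the caller sees the limit in the message directly instead of having to dig it out of a partial batch error.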

@Gerrit-K
Author

Oof ... yeah, I apparently didn't check the API docs 😓 And also sorry if I sounded a bit grumpy there. Thanks for picking this up and responding so quickly, much appreciated! I agree that automatically looping through the batches would be the ideal solution, but the PR you've submitted is already really helpful!

@jalauzon-msft
Member

I had a bit of a mishap with the previous PR. I've created another one combined with another doc update.

@github-actions github-actions bot locked and limited conversation to collaborators Apr 11, 2023
3 participants