
Hang when downloading a large blob #25358

Closed
ktaebum opened this issue Jul 22, 2022 · 7 comments
Labels
  • Client — This issue points to a problem in the data-plane of the library.
  • customer-reported — Issues that are reported by GitHub users external to the Azure organization.
  • issue-addressed — Workflow: The Azure SDK team believes it to be addressed and ready to close.
  • question — The issue doesn't require a change to the product in order to be resolved. Most issues start as that.
  • Storage — Storage Service (Queues, Blobs, Files)

Comments

ktaebum commented Jul 22, 2022

  • Package Name: azure-storage-blob
  • Package Version: 12.13.0
  • Operating System: Ubuntu 20.04 x86
  • Python Version: 3.10

Describe the bug
I think this is related to #10572.
I am trying to download a blob whose size is 24 GB.
I use BlobClient.download_blob and set max_concurrency to 32 on my Azure VM (VM size: Standard_D4ds_v5).
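For reference, a minimal sketch of the download call described above. The account URL, container, blob, and credential values are placeholders, not from the original report; this assumes the common pattern of streaming the blob into a local file.

```python
from azure.storage.blob import BlobClient

# Placeholder connection details for illustration only.
blob_client = BlobClient(
    account_url="https://<account>.blob.core.windows.net",
    container_name="<container>",
    blob_name="<large-blob>",  # the ~24 GB blob
    credential="<credential>",
)

# Download with 32 parallel connections, as described in the report.
with open("large-blob.out", "wb") as f:
    downloader = blob_client.download_blob(max_concurrency=32)
    downloader.readinto(f)
```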

Expected behavior
I expect the downloading to be completed successfully.

Screenshots
The download hangs, as shown in the following screenshot. (Ignore the MB unit; it is actually B, bytes.)
[Screenshot: 2022-07-22 11:18 AM]

Additional context
This is a heisenbug: sometimes the download finishes successfully.

I've seen that the previous issue was fixed by https://github.com/Azure/azure-sdk-for-python/pull/18164/files.
However, I think it would be better if users could configure max_retry, which is currently fixed at 3.

@ghost ghost added customer-reported Issues that are reported by GitHub users external to the Azure organization. question The issue doesn't require a change to the product in order to be resolved. Most issues start as that labels Jul 22, 2022
@ktaebum ktaebum changed the title Hang when downloading large blob Hang when downloading a large blob Jul 22, 2022
@xiangyan99 xiangyan99 added Storage Storage Service (Queues, Blobs, Files) Client This issue points to a problem in the data-plane of the library. CXP Attention labels Jul 22, 2022
@ghost ghost added the needs-team-attention Workflow: This issue needs attention from Azure service team or SDK team label Jul 22, 2022
ghost commented Jul 22, 2022

Thank you for your feedback. This has been routed to the support team for assistance.

jalauzon-msft (Member) commented
Hi @ktaebum, thanks for reaching out and sorry for the delay. A couple of follow-up questions/points.

  • How long do you wait once the download starts hanging? Do you ever get an error returned? I ask because we have a setting called read_timeout which has a very high default value of 80,000 seconds (we know this isn't great and will likely be changing it soon). So, if the server has stopped sending data, that's how long we will wait before raising an error. You can try configuring read_timeout on your client constructor to something more reasonable and waiting to see if you then get an error we can investigate further. See the README.
  • The retry logic in the issue you linked is only for response streaming which I'm not sure is taking place here. We have a different, higher-level, configurable retry policy that can be adjusted. See this section of the README for more details on that.
  • It may be interesting to enable debug logging if possible so we can see more details on the requests going out and maybe find the request that's hanging. You can see this page for more info but there's a bit there so here's a sample of one way you could enable the logging.
```python
import sys
import logging
from azure.storage.blob import BlobClient

# Set the logging level for the azure.storage.blob library
logger = logging.getLogger('azure.storage.blob')
logger.setLevel(logging.DEBUG)

# Direct logging output to stdout. Without adding a handler,
# no logging output is visible.
handler = logging.StreamHandler(stream=sys.stdout)
logger.addHandler(handler)

blob_client = BlobClient(..., logging_enable=True)
```

Ultimately, this is likely a server-side issue but let's try and gather some more info before involving the service team. Thanks!

ktaebum (Author) commented Jul 28, 2022

@jalauzon-msft Thanks for the reply.

I didn't wait long (just a couple of minutes), and no error was returned.
When I set read_timeout to 10 seconds, I saw the following:
[Screenshot: 2022-07-28 10:31 AM]

However, I've confirmed that the download does not fail if I set retry_total to a very large number.
Thanks for letting me know about logging. I will try it if I run into another problem.

ktaebum (Author) commented Jul 28, 2022

Unfortunately, Read timed out errors still occur (though not always) even when I set retry_total to a very large number (about 10,000,000) 🙁

jalauzon-msft (Member) commented Aug 27, 2022

Hi @ktaebum, apologies for the long delay. Read timeouts will be automatically retried by the SDK and it seems, from the screenshot you shared, this did help for that particular download as you see it time out and then continue. Changing the retry count will not eliminate read timeouts but will change the number of times a read timeout can be retried. Are you still seeing downloads not complete because of read timeouts? If they are completing after a read timeout, then the retry mechanism is working as expected and you should be good.

I would recommend setting your read_timeout to 60 seconds and your retry_total to something reasonable like 5-10. These are both changes we are planning to make to the defaults in the SDK in an upcoming release.
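Applying those recommendations, the settings can be passed as keyword arguments to the client constructor. This is a sketch with placeholder connection details; read_timeout and retry_total are the keyword names discussed earlier in this thread.

```python
from azure.storage.blob import BlobClient

# Placeholder connection details for illustration only.
blob_client = BlobClient(
    account_url="https://<account>.blob.core.windows.net",
    container_name="<container>",
    blob_name="<large-blob>",
    credential="<credential>",
    read_timeout=60,  # fail a stalled read after 60 seconds so it can be retried
    retry_total=10,   # allow up to 10 retries before giving up
)
```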

If you've done all this and are still having trouble downloading blobs, with so many read timeouts that the blob never finishes downloading, I would recommend opening a support ticket for your Storage account so the service team can investigate further. Thanks!

@jalauzon-msft jalauzon-msft added the issue-addressed Workflow: The Azure SDK team believes it to be addressed and ready to close. label Sep 1, 2022
ghost commented Sep 1, 2022

Hi @ktaebum. Thank you for opening this issue and giving us the opportunity to assist. We believe that this has been addressed. If you feel that further discussion is needed, please add a comment with the text “/unresolve” to remove the “issue-addressed” label and continue the conversation.

@ghost ghost removed the needs-team-attention Workflow: This issue needs attention from Azure service team or SDK team label Sep 1, 2022
ghost commented Sep 8, 2022

Hi @ktaebum, since you haven’t asked that we “/unresolve” the issue, we’ll close this out. If you believe further discussion is needed, please add a comment “/unresolve” to reopen the issue.

@ghost ghost closed this as completed Sep 8, 2022
@github-actions github-actions bot locked and limited conversation to collaborators Apr 11, 2023