The downloaded file won't be truncated to correct size after sdk/storage/azblob:v1.2.0 #21995

Closed
Malsourie opened this issue Nov 16, 2023 · 4 comments · Fixed by #22036
Labels: AzBlob, Client, customer-reported, needs-team-attention, Storage

Comments

@Malsourie

Bug Report

  • What happened?
    After DownloadFile in azure-sdk-for-go/sdk/storage/azblob/blockblob/client.go is called, the downloaded file is not truncated to the blob's size. As a result, the MD5 and size of the downloaded file do not match the original blob in the container.
  • What did you expect or want to happen?
    The downloaded file should have the same size and MD5 as the original blob in the container.
  • How can we reproduce it?
    Call DownloadFile, then compare the size and the SHA/MD5 checksum of the downloaded file with those of the original blob.
@github-actions (bot) added the Client, customer-reported, needs-team-triage, question, and Storage labels on Nov 16, 2023
@jhendrixMSFT removed the question and needs-team-triage labels on Nov 16, 2023
@github-actions (bot) added the needs-team-attention label on Nov 16, 2023
@vibhansa-msft
Member

Can you provide the exact test case you are running? Why do you expect the downloaded file to be truncated? By default, the downloaded file should exactly match the one in the container.

@vibhansa-msft
Member

Can you also share the file size, both of the local copy and of the download?

@Malsourie
Author

> Can you provide exact test case that you are doing.

Test environment: Azure Storage Account with a test container.

  1. Upload a dummy file of 8241066 bytes to the container via the Azure Portal.
  2. Download the file via the Azure Portal to compare_directory.
  3. Rename the downloaded file to origin.
  4. Write a simple Go script that uses sdk/storage/azblob v1.1.0 to download that blob to compare_directory by calling blockblob.NewClientWithSharedKeyCredential(...).DownloadFile(...).
  5. Rename the downloaded file to 1_1_0.
  6. Run ls -la <compare_directory>:
-rw-r--r--   1 user dialout 8241066 Nov 20 07:54 1_1_0
-rw-r--r--   1 user dialout 8241066 Nov 20 07:50 origin

They have the same size, so everything works as expected.

  7. Using the same Go script, update sdk/storage/azblob to v1.2.0 and call blockblob.NewClientWithSharedKeyCredential(...).DownloadFile(...) to download the same blob.
  8. Rename the downloaded file to 1_2_0.
  9. Run ls -la <compare_directory>:

-rw-r--r--   1 user dialout 8241066 Nov 20 07:54 1_1_0
-rw-r--r--   1 user dialout 8388608 Nov 20 08:06 1_2_0
-rw-r--r--   1 user dialout 8241066 Nov 20 07:50 origin

As you can see, the file downloaded with v1.2.0 has a different size.
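
The 1_2_0 size is exactly 8 MiB (8388608 = 8 × 1024 × 1024), which would be consistent with the file being left at a whole number of blocks. The sketch below checks that arithmetic. The 4 MiB block size is an assumption, since the thread does not state which value was in effect, and roundUpToBlock is a hypothetical helper, not an SDK function.

```go
package main

import "fmt"

// roundUpToBlock (hypothetical helper) returns size rounded up to the next
// multiple of blockSize.
func roundUpToBlock(size, blockSize int64) int64 {
	blocks := (size + blockSize - 1) / blockSize // ceiling division
	return blocks * blockSize
}

func main() {
	const blobSize = 8241066          // size of origin and 1_1_0 in the listing
	const blockSize = 4 * 1024 * 1024 // assumed block size (4 MiB)

	padded := roundUpToBlock(blobSize, blockSize)
	fmt.Println("padded size:", padded)            // prints "padded size: 8388608"
	fmt.Println("extra bytes:", padded-blobSize)   // prints "extra bytes: 147542"
}
```

Under that assumption the padded size equals the size of 1_2_0 in the listing, so the extra 147542 bytes look like untrimmed padding. Block sizes of 1, 2, or 8 MiB would give the same result.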

> Why do you expect downloaded file to be truncated? downloaded file shall match exactly the one in container by default.

From what I have seen in the code:

  1. The comment on DownloadFile in blockblob/client.go states that the file will be truncated if the size doesn't match.
  2. The DownloadFileOptions struct in blob/models.go has an option called BlockSize. My assumption is that the algorithm first allocates blocks and then downloads the content into them in parallel. Since the file size is usually not an exact multiple of the block size (num_of_blocks × size_of_block), the algorithm has to truncate the file to the original size at the end.
  3. There seems to be truncation logic in blob/client.go, but it does not work in the new version, v1.2.0.

@souravgupta-msft
Member

@Malsourie, thanks for reporting this issue. The fix has been merged and will be part of the next release.

@github-actions github-actions bot locked and limited conversation to collaborators Feb 20, 2024