Fix a bug in the _upload_file_part_concurrent method
#910
+25
−10
The `_upload_file_part_concurrent` method is used as part of the `put_file` function to upload a file in multiple parts when the file is larger than a certain size limit. The function reads the original file in chunks (50 MB by default) and schedules up to 10 upload calls per batch. It has two branches: if there is more than one chunk left, it schedules the uploads in parallel; if not, it runs the single upload directly.

This last branch has a bug: it uses a variable `chunk` that is actually defined in another scope (the preceding for-loop). This leads to wrong data at the remote location: a file whose size is between 20 × 50 MB and 21 × 50 MB, for example, is always truncated to exactly 20 × 50 MB on S3. This PR fixes that bug.