-
-
Notifications
You must be signed in to change notification settings - Fork 4.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor writeObject to only use MultipartUpload when required #27877
Conversation
66762da
to
3e32ee6
Compare
throw $e; | ||
// if anything goes wrong with multipart, make sure that you don´t poison and | ||
// slow down s3 bucket with orphaned fragments | ||
$uploadInfo = $uploader->getState()->getId(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd vote for adding this to the baseline, since while the abstract declaration is considered protected the implementation is a public method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Found it weird that they did not think of that, so I dug a little and:
can't we use that $e->getState()->getId();
?
This is from https://docs.aws.amazon.com/sdk-for-php/v3/developer-guide/s3-multipart-upload.html which actually recommend using ObjectUploader
to get this kind of optimization. But maybe I am not looking at the right thing ^^
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, I'll check if that works as expected 👍
3e32ee6
to
1d67245
Compare
e4319e3
to
88fb846
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🐘 tests 💪
Thanks Julius, to take the merge work over for me. Proud that my code will make it into product. |
88fb846
to
036edd0
Compare
What is the core problem being solved by not always using the multipart uploader? I'm not a fan of having to buffer the first chunk for every upload |
Besides the fact that the AWS classes themselves always use ObjectUploader for Objects smaller 5 MB, BTW: the buffer is only 5 MB and the code is an improved version of the original AWS code in the PHP library, |
Fair enough, thanks for the explanation |
036edd0
to
94814f9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work 👍
Super nice to use multipart less often. I've added some nitpicks mostly about phpdoc and possible type hints. But not a blocker for merge.
Could we use a higher value for single put operations? Like 25 or 50 MB. I understand the buffer uses the memory hence we should not pick a value to high. However most pictures have more than 5 MB nowadays.
// buffer is fully seekable, so use it directly for the small upload | ||
$this->writeSingle($urn, $buffer, $mimetype); | ||
} else { | ||
$loadStream = new Psr7\AppendStream([$buffer, $psrStream]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We append buffer and psrStream because the psrStream might be non-seekable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes
Signed-off-by: Bernd Rederlechner <[email protected]> Co-authored-by: Julius Härtl <[email protected]>
94814f9
to
3866f38
Compare
Will do a follow up PR for that to check about the memory usage in detail with different values. Edit: draft in #28542 |
All other addressed. |
We need the fragment optimisation to avoid S3 Slowdown on OTC, as described in nextcloud#27877 Signed-off-by: [email protected] <[email protected]>
We need the fragment optimisation to avoid S3 Slowdown on OTC, as described in nextcloud#27877 Signed-off-by: [email protected] <[email protected]>
Ensure that multipart uploads are only used when needed. For cases where the file is smaller than 5242880. Since streams may not have a fixed size before reading them some contortions performed by @tsdicloud makes sure that we read that amount of bytes into a buffer to determine if a simple putObject can be used or if the multipart upload is triggered with an AppendStream of the buffer and the rest of the original one.
Furthermore there is additional cleanup added to properly remove leftovers from failed multipart upload attempts.
I pulled out the changes of the branch with the s3 encryption and added some further test cases.