ocdav: upload file to storage provider after assembling chunks #1253
Conversation
Force-pushed bb49b9f to bc22212
@ishank011 let me know when it's no longer WIP
Force-pushed bc22212 to 41762bf
Force-pushed 41762bf to c28cf60
@labkode @butonic the reason for the memory leaks in tus uploads was that the tus client expects an io.ReadSeeker but accepts a plain io.Reader as well, which it converts by copying the whole payload into an in-memory buffer allocated on the heap. That buffer stays alive until the next run of the garbage collector, which causes reva to crash once free memory runs out. I tried a few solutions.
The one that does work is writing the content to a temporary file and passing its file descriptor to the tus client. It adds a bit of latency but almost no memory overhead, so I think this would be the best solution. Please let me know your thoughts.
I'll make the same change in the other places where we initiate the tus client.
We should not write temporary files for every upload and always stream it to the datagateway. See the comment above.
Also, you are mixing changes to reuse the httpclient with fixes to the chunking ... please send two PRs ... it makes it easier to review and gets smaller things in asap.
fileName, fd, err := s.createChunkTempFile()
if err != nil {
	w.WriteHeader(http.StatusInternalServerError)
	return
}
defer os.RemoveAll(fileName)
defer fd.Close()
if _, err := io.Copy(fd, r.Body); err != nil {
	w.WriteHeader(http.StatusInternalServerError)
	return
}
with this you are writing the file twice:
once on the machine that is running the dataprovider and once when writing it at the destination.
The whole point of tus and resumable uploads is to not touch the disk of any intermediate services. It is too slow.
The storage provider is responsible for putting any new bytes as close as possible to the target destination, so that the final assembly does not have to copy the bytes across storage boundaries.
A simple PUT request is the worst case because all the bytes need to make it through ... and it cannot be resumed, which makes it problematic for large files. That might be the reason why you increased the timeout in the gateway by 10 min.
Let me assure you that there are end users trying to upload 80GB ... in the browser ... which in the past used PUT ... to a machine with a 100mbit connection ... to the nas ...
TUS goes a long way to deal with this. It disconnects from the request context so it can write as many bytes as possible. That is the reason why we are having a hard time getting debug-level logging from actual uploads.
Internally, handlePut should just read the body directly and use TUS to send it to the datagateway/dataprovider.
now ... chunking v1 is a different thing. Did you check if you can translate chunking v1 to TUS? Because AFAICT we should be able to internally append the incoming PUT requests for every chunk to a single TUS upload to the datagateway/provider.
But to get things going I would be ok to just collect the chunks in ocdav and then do the upload using TUS internally.
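At the protocol level, the translation suggested above is mostly bookkeeping: chunking v1 chunks arrive in order, so chunk n could be forwarded as a tus PATCH whose Upload-Offset is the sum of the preceding chunk sizes. A sketch of just that offset computation (illustrative only, not the go-tus API):

```go
package main

import "fmt"

// tusPatchOffsets maps an ordered list of chunk sizes to the Upload-Offset
// header each corresponding PATCH against one tus upload would carry.
func tusPatchOffsets(chunkSizes []int64) []int64 {
	offsets := make([]int64, len(chunkSizes))
	var off int64
	for i, size := range chunkSizes {
		offsets[i] = off
		off += size
	}
	return offsets
}

func main() {
	// Three chunks of a chunking v1 upload, in arrival order.
	fmt.Println(tusPatchOffsets([]int64{10485760, 10485760, 524288}))
	// prints [0 10485760 20971520]
}
```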
Sorry about the multiple changes. Let's focus only on the tus part.
With the current /remote.php/webdav PUT endpoint, we're not supporting resumable uploads anyway. We get the whole body at once (80 GB), store it in memory (the tus client does that for the reader-to-readseeker conversion), and once the transfer is complete, that 80 GB allocation persists until the next run of the garbage collector.
So with the current implementation, PUTting 80 GB won't succeed unless you have 80 GB of memory, and even then the next PUT call will allocate another 80 GB. This is where the problem creeps in: https://github.com/eventials/go-tus/blob/45c7ec8f5d595db5dd50a926fc8420a2c70404fb/upload.go#L87-L89
buf := new(bytes.Buffer)
buf.ReadFrom(reader)
stream = bytes.NewReader(buf.Bytes())
On my 4GB VM, I'm not able to write even 3 GB files without reva crashing.
Sure, the POST handler works, but that assumes the client itself speaks tus and sends binary chunks on its own, which won't be the case in a browser.
ouch ...
hm, ok so we consider PUT requests legacy ... because they cannot really be implemented without this hack ...
The owncloud clients (including phoenix) will either learn TUS or already have, so for legacy clients this might be ok.
@@ -293,16 +294,26 @@ func (s *svc) descend(ctx context.Context, client gateway.GatewayAPIClient, src
	return fmt.Errorf("status code %d", httpDownloadRes.StatusCode)
}

fileName, fd, err := s.createChunkTempFile()
see above ... this copies the file to disk before sending it out again ... just read from the body
Ok, we should merge this and optimize handling of many small files in a subsequent PR. For files under 2MB we don't need to create temp files ... or maybe 10MB.
Also, with temp files, uploading large files requires more free disk space ... which is another reason for using tus in the first place. So if anyone tries to upload large files ... tell them to use tus ... ;-)
Sure, we can optimize it for smaller uploads. Will add that part as well. Thanks @butonic!
@ishank011 any update? I'm ok to merge this as is ... we can improve performance for small files in a subsequent PR.
@butonic running a few more tests. Will merge it then.
Force-pushed 72e6e9c to 13a882e
Force-pushed e543531 to 74bd3e3
Force-pushed 74bd3e3 to fef229a
@phil-davis this PR fixes the old chunking, so I removed the corresponding tests from the expected failures here: fef229a. Please update these in ocis-reva as well. Some tests still fail because we can't clear the log, so maybe we can fix those too: https://github.com/owncloud/core/blob/master/tests/acceptance/features/apiWebdavUpload2/uploadFileUsingOldChunking.feature#L45
Closes #860