[oCIS FS] clean up aborted uploads #2622
Comments
cc @butonic
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 10 days if no further activity occurs. Thank you for your contributions.
I think that's something reva should do. I mean, I don't see any reason this couldn't happen with any other storage backend. Maybe other storage implementations keep the chunks in a different storage area, but that doesn't mean that such an area couldn't fill up. I'm not sure about the specific implementation details in reva, but I think it should provide a service that runs periodically to clean up uploads after 1-2 days (configurable). A possible idea is that reva could have a dedicated thread to clean up the uploads.
Yeah, I think the clean solution would be to implement this in the storage driver.
If it needs to be implemented as a command, then we would either have to implement storage-specific mechanisms in the command or add a generic 'clean up' API to the CS3 API so the underlying storage can be swapped out.
@C0rby I don't really understand how we can do this without a command (or an API). Should we just start a go-routine that cleans up upload data? Can you elaborate on what you have in mind?
Well, it could be the responsibility of every storage driver to implement a clean up mechanism. Then the storage driver itself would run an asynchronous goroutine or something else.
Ah ok. But how would that go-routine know that it has something to do? Would it just sleep for 10 minutes and then check? Then we could also have a command and trigger it via cronjob. A benefit of that would be that we could also run it manually (on low-power systems, for example).
I'd say this depends on the implementation but yeah something like that.
Yeah, I see the benefits. That also means that we need to add a
Basically the upload is cleaned up correctly (in the success and error cases, and when the user aborts the upload). The upload is only left behind if the user pauses the upload (and never restarts it) or maybe on some connection issues. So my point is: I'm not sure this is worth having a go-routine running all the time. CPU time also costs money. But I'm fine with both approaches; both have their benefits.
I see your points. I'd also say that for the decomposedfs it would be enough to have a cronjob running a bash script using standard tools, since decomposedfs uploads are just plain files. But for other storages this might not be sufficient, so we probably end up having to implement an API.
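For the plain-files case, the cron-driven script could be as small as a `find` one-liner. The path and the one-day threshold below are examples, not oCIS defaults:

```shell
# Remove upload artifacts (blob and blob.info files) not modified for
# more than one day. Path and age are illustrative; adjust per deployment.
find /var/lib/ocis/storage/users/uploads -type f -mtime +1 -delete
```

Note that `-mtime +1` uses the file's modification time as the expiry signal, so a paused-but-active upload that is still being written to would be spared.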
I think we can keep this internal to each storage implementation. When the storage starts, a go-routine can start alongside it to perform the cleanup of uploads. This go-routine would behave cron-like: it sleeps most of the time and runs every 15-30 minutes (or something configurable), so its CPU usage should be close to 0 while sleeping. The communication between the storage and the go-routine would be via shared memory: a list / map / custom object is shared between the storage implementation and the go-routine, the required data is written into that shared object when an upload starts, and it is removed when the upload finishes correctly. When the go-routine wakes up, it can check the shared object to see which uploads are still ongoing and, depending on how long an upload has been there, decide to clean it up. The advantages I see are:
There might be some corner cases, but in general it seems a good idea.
Please also keep in mind that we have multiple instances of the storage service running. So we have a good chance of running multiple cleanups in parallel then. Don't know if that could cause problems.
It should be ok. The go-routine will monitor the uploads going through the storage instance that spawned it. If there are multiple instances, they won't overlap. The only problem is a crash-recovery scenario where the storage service crashes with GBs of partial uploads. It might be a good idea for each instance to have its own temporary directory for uploads; that way each instance can either recover or remove its own failed uploads, maybe something like
I'm not sure it should do that. It would need to persist data in case it gets restarted. I would suggest it just goes through all uploads and decides, based on the .info file, whether to remove each one.
I'd prefer to keep that as a recovery mechanism in case the service crashes. Without any limits on uploads or users, there could be thousands of uploads at the same time. Having to scan all the uploads can take a lot of time, so I don't think we should do that by default. By using shared memory (I'm talking about a shared object between the instance and the go-routine, not something like IPC), the go-routine doesn't need to constantly check the FS, and all the info it needs is already in memory and accessible. In addition, as said, the algorithm can be quite efficient, leading to low CPU usage by the go-routine.
Uploads in oCIS should:
Since oCIS is a distributed system, we cannot rely on shared memory. We only have shared storage. All state needs to live on the storage. We cannot keep things in memory. I would go for a manual command first. The cleanup tool needs to inspect the
A command is ok if it's planned as a debugging tool, so if something goes wrong we know what the command is doing step by step. |
Upload expiry is actually the TUS Expiration extension that we should implement:
There is also the TUS Termination extension that allows clients to actively delete uploads. |
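For reference, the Termination extension boils down to the client issuing an HTTP `DELETE` against the upload URL, carrying the usual `Tus-Resumable` version header; the server answers `204 No Content` once the upload is gone. A small Go sketch (the upload URL is a placeholder):

```go
package main

import (
	"fmt"
	"net/http"
)

// newTerminateRequest builds the DELETE request that the TUS
// Termination extension defines for actively removing an unfinished
// upload. The caller would send it with an http.Client.
func newTerminateRequest(uploadURL string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodDelete, uploadURL, nil)
	if err != nil {
		return nil, err
	}
	// every TUS request must announce the protocol version it speaks
	req.Header.Set("Tus-Resumable", "1.0.0")
	return req, nil
}

func main() {
	req, err := newTerminateRequest("https://example.com/files/24e533e0")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL)
}
```

Client-driven termination complements server-side expiry: well-behaved clients delete aborted uploads themselves, and the expiry sweep catches everything else.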
Outcome refinement: implement cli command using storage driver directly |
Is your feature request related to a problem? Please describe.
When using oCIS FS as the users' storage, you will have a storage/users/uploads directory in your oCIS data folder. This is an intermediate directory for resumable (TUS) uploads. Each upload consists of a blob file and a blob.info file. If an upload succeeds, the blob file is moved to the target and the blob.info file is deleted. If an upload is aborted, the blob and blob.info files are not (re)moved. Therefore a lot of data can pile up.
Describe the solution you'd like
oCIS FS cleans up expired uploads by itself.
Describe alternatives you've considered
oCIS offers a command to clean up expired uploads, which can then be used for manual maintenance or regular cleanup (e.g. scheduled by cron).
Additional context