Retain generated corpus and pretranslation files for a build #468
Comments
This is needed to support gray box testing by the test team.
Do we want this for QA only? Do we want this controllable by a flag?
I think it would be good to do this on production as well, since it will be helpful for debugging.
* Fix #464 - add lock lifetime for all
* Add HTTP timeout
* Make adjustable through options
* Will need to delete all locks from MongoDB - otherwise will endlessly loop for startup
* Fix some ci-e2e issues
* Only use locking when accessing SMT model
* Fix unit tests
* Update to latest version of Machine
* Fix bug where wrong id is used when starting a build
* Remove reference to Serval.Shared in Serval.Machine.Shared
* Preserve fix for #468
We still need to implement a method for cleaning up the files.
I know you were asking about an approach in the PR above, @ddaspit: I'm thinking I'll just add a |
The longest that I could imagine that a job could take would be 1 day. If we set the expiration for 30 days and key off of |
I'm not sure what you mean by 'set the expiration', @johnml1135. Are you referring to an S3 object expiration?
Set the expiration -> "delete the build artifacts after a certain amount of time." Sorry for the bad English.
No, no, I just wanted to make sure I understood 😁. So you're suggesting we use a property of the files themselves?
Yes. Any system should at least have a created-on and a last-modified-on date.
The files are stored in the S3 bucket. Is that correct? Does S3 have a date modified/created field?
OK, so basically just the equivalent of the |
A script like that could be run using a cron job. I think that is how Matthew is doing it.
Right. Might it be appropriate just to hijack that script and tack on an additional pattern that covers the Serval builds?
That makes sense to me. However, we are planning on moving silnlp data to the NAS, so production and research data will be stored in separate locations.
I am assuming that the job would be run directly within Serval as a recurring task. If it were an actual cron job, the deployment would be non-obvious.
If possible, I would prefer to run this as a cron job rather than use up a thread in the Serval server. Does Kubernetes have some way to run scheduled jobs?
Another option is to keep it completely separate from Serval and run the cleanup alongside the SILNLP cron job for the S3 bucket data. I would make sure there is a flag to turn Serval's auto-deletion at the end of jobs on and off, though. It would be a bit manual, but it really only affects S3 bucket cost if the cron job does not execute.
What's the conclusion here with regard to a strategy?
Let's talk about it at Wednesday's meeting. I am still inclined to do the "roll it into SILNLP cleanup" thing.
@mshannon-sil - If I am correct, there is an SILNLP job that cleans up old data. Is that correct?
Yes, that's right. The clean_s3 script in the SILNLP repo currently runs as a cron job on the AQuA server every Sunday.
@mshannon-sil - how hard would it be to extend that script to also clean production data?
Should be pretty straightforward. It would just need to check if the files are in the production folder and match the pattern for corpus and pretranslation files, and then delete any files older than 30 days. If we just want to update the clean_s3 script for this, I'd be happy to do it.
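For reference, the check described above could look roughly like the sketch below. This is not the actual clean_s3 code. The `production/` prefix, the file-name pattern, and the helper name `select_expired_keys` are all hypothetical placeholders, since the real folder layout and patterns are not spelled out in this thread. The function only selects keys; it assumes the caller lists the bucket and deletes the returned keys separately. The input dicts mirror the `Key` and `LastModified` fields of the `Contents` entries that boto3's `list_objects_v2` returns.

```python
import re
from datetime import datetime, timedelta, timezone

# Hypothetical values: the real prefix and file-name patterns are not given in this thread.
PRODUCTION_PREFIX = "production/"
BUILD_FILE_PATTERN = re.compile(r"(corpus|pretranslat)", re.IGNORECASE)
MAX_AGE = timedelta(days=30)


def select_expired_keys(objects, now=None):
    """Return keys of production corpus/pretranslation files older than MAX_AGE.

    `objects` is an iterable of dicts with "Key" (str) and "LastModified"
    (timezone-aware datetime) entries.
    """
    now = now or datetime.now(timezone.utc)
    expired = []
    for obj in objects:
        key = obj["Key"]
        # Only consider files under the (assumed) production folder.
        if not key.startswith(PRODUCTION_PREFIX):
            continue
        # Only consider corpus and pretranslation files.
        if not BUILD_FILE_PATTERN.search(key[len(PRODUCTION_PREFIX):]):
            continue
        # Keep anything modified within the retention window.
        if now - obj["LastModified"] > MAX_AGE:
            expired.append(key)
    return expired
```

Keeping the selection separate from the deletion makes the age and pattern rules easy to test without touching the bucket.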
@mshannon-sil, what is the status of this issue?
The status is ready to be worked on. I have some time today, so I'll work on it and keep you updated.
@johnml1135 I just submitted a PR for this issue in SILNLP.
@mshannon-sil - can you respond to the comments?
Yes, sorry for the delay. I was tackling other assignments when I came back last week and this got pushed back. I have time today to review.
The PR has been merged, and I just verified that the cron job is deleting 2-month-old production builds in addition to the 1-month-old research checkpoints.
Currently, Serval deletes corpus and pretranslation files once a build has finished. This makes it difficult to debug issues and to perform testing. Instead, Serval should retain the files after a build has finished. The files should be deleted after a predetermined amount of time (maybe 30 days).