Volume artifacts #1227
I think the idle timeout of make lint is too short; I have seen it occasionally cause timeout errors in CI. |
I'm not sure I understand what value this provides over the current approach.
Current:
Proposed:
If someone wants to store many files as a single zip, then they can just work with files in a temp directory and then zip it to the mounted volume. With this strategy the user has the flexibility to do what they want:
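A minimal Python sketch of that strategy follows. It is an illustration only, not the commenter's original example: the function name `export_as_zip`, the generated `part-*.txt` files, and the volume path are all hypothetical. The step does its work in a local temp directory, builds the zip there, and copies a single archive file to the mounted volume.

```python
import shutil
import tempfile
from pathlib import Path


def export_as_zip(volume_dir: Path, name: str = "outputs") -> Path:
    """Produce many small files locally, then store one zip on the volume."""
    with tempfile.TemporaryDirectory() as scratch:
        work = Path(scratch) / "work"
        work.mkdir()
        # Stand-in for the step's real work: many small files on local disk.
        for i in range(100):
            (work / f"part-{i:03d}.txt").write_text(f"record {i}\n")
        # Build the archive next to (not inside) the work directory.
        local_zip = shutil.make_archive(
            str(Path(scratch) / name), "zip", root_dir=work
        )
        # Only one file is written to the mounted volume.
        volume_dir.mkdir(parents=True, exist_ok=True)
        return Path(shutil.copy2(local_zip, volume_dir / f"{name}.zip"))
```

Calling `export_as_zip(Path("/mnt/vol"))` (assuming the volume is mounted at the hypothetical path `/mnt/vol`) would leave a single `outputs.zip` on the volume. The archive format and layout stay under the user's control.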
I'm all for a volume-based artifact-passing solution, but I do not see this PR adding much value over the current approach. Compare it with a system like this:
This system will set up volume-based artifact passing for you and relieve the user from manually managing the volumeMounts and paths. |
I understand that your solution also works. My point is that if users have to write the export logic themselves, artifacts become less convenient to use. In other words, they would need to change their workflow substantially if they later want to switch the artifacts to S3. |
Just trying to understand the difference between the proposed approach and the current approach.
Current:
Proposed:
It looks like it just moves the |
As I wrote in the PR,
We don't want to touch the NFS volume too many times. So your example above is not wrong, but it is not the best way to show the advantage of this feature. The ideal comparison is:
Current (bad way):
Current (better way):
Proposed:
|
From my perspective, the "Current (better way)" is a sweet spot. AFAIK, in the "Proposed" variant there is some implicit archiving configuration that cannot be changed. And if we allow changing it, the implementation becomes too complex for the number of characters saved (it's already too complex). |
@Ark-kun I think it would be awesome to have something like what you described implemented. Is there any other issue where this is captured? I don't think this should be lost in the comments... |
I've created issue #1349. I'm working on implementing this. |
@fj-sanchez Is it possible to achieve it with the change of this PR and the |
Here is a draft version of my volume-based data passing rewriter script: https://github.com/Ark-kun/pipelines/blob/SDK---Compiler---Added-support-for-volume-based-data-passing/sdk/python/kfp/compiler/_data_passing_using_volume.py What does everyone think? Please comment at #1349 |
I'm closing this issue as I no longer need it and others don't want it either. |
I know this is not great practice for ML on Kubernetes, but saving files to and loading files from NFS is very useful for ML researchers who depend heavily on NFS in their daily work. To provide NFS access, we restrict users to running their containers under their own UID, because a container running as root can access any file in the mounted NFS volume. So I made the following changes in this PR.
/argo directory in executors
The reason I don't simply use the mounted NFS volume in the main container is performance overhead. In legacy deep learning jobs, we interact with millions of files but want to avoid too many requests over the network, so loading gzipped files from NFS to local disk and saving gzipped files from local disk to NFS makes sense. Honestly, this way of handling files in deep learning jobs is still not smart, and there are newer methods such as MXNet's RecordIO, but many researchers have not gotten used to them yet.
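The staging pattern described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the executor code in this PR: the helper names `stage_in` and `stage_out` and all paths are hypothetical. Each direction moves exactly one gzipped archive across the network, and the many small files are only ever touched on local disk.

```python
import shutil
import tarfile
import tempfile
from pathlib import Path


def stage_in(nfs_archive: Path, local_dir: Path) -> Path:
    """Copy one .tar.gz from NFS to local disk and unpack it there."""
    local_dir.mkdir(parents=True, exist_ok=True)
    # Single network read: the whole archive is copied in one go.
    local_copy = Path(shutil.copy2(nfs_archive, local_dir / nfs_archive.name))
    data_dir = local_dir / "data"
    data_dir.mkdir(exist_ok=True)
    with tarfile.open(local_copy, "r:gz") as tar:
        tar.extractall(data_dir)
    return data_dir


def stage_out(local_results: Path, nfs_archive: Path) -> Path:
    """Pack local results into one .tar.gz and copy it to NFS."""
    with tempfile.TemporaryDirectory() as scratch:
        local_archive = Path(scratch) / nfs_archive.name
        with tarfile.open(local_archive, "w:gz") as tar:
            # Store entries relative to the results directory.
            for item in sorted(local_results.iterdir()):
                tar.add(item, arcname=item.name)
        nfs_archive.parent.mkdir(parents=True, exist_ok=True)
        # Single network write: only the finished archive reaches NFS.
        return Path(shutil.copy2(local_archive, nfs_archive))
```

A job would call `stage_in` before training and `stage_out` afterwards, so the number of NFS requests stays constant regardless of how many files the archive contains.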
Do you think this artifact is useful for others too?