Track cloud storage usage in metadata #3634

Open
Panaetius opened this issue Oct 10, 2023 · 1 comment

Comments

@Panaetius
Member

The renku run, renku workflow execute and renku dataset commands should detect when files from mounted cloud storage are used and track them separately or with added metadata.

We don't really need to track mount commands etc., but we should know where a file came from when it is used in a workflow or dataset. We can treat this similarly to how we handle external files, just with a pointer to where the file came from.
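As a rough illustration of what such a pointer could hold, here is a minimal sketch. The class name `CloudStorageSource`, its fields and the `to_metadata` helper are all hypothetical and not part of renku; the only assumption taken from this issue is that a file is identified either by a storage id or, for storage without an id, by its URL.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class CloudStorageSource:
    """Hypothetical provenance pointer for a file read from mounted cloud storage."""

    path_in_storage: str               # path relative to the storage root
    storage_id: Optional[str] = None   # set when the storage has an id
    storage_url: Optional[str] = None  # set for storage configured without an id

    def __post_init__(self) -> None:
        # Exactly one of the two identifiers must be given.
        if (self.storage_id is None) == (self.storage_url is None):
            raise ValueError("exactly one of storage_id or storage_url must be set")


def to_metadata(source: CloudStorageSource) -> dict:
    """Serialize the pointer, dropping unset fields (illustrative format only)."""
    return {key: value for key, value in asdict(source).items() if value is not None}
```

A record like this could be attached to the file's entry in workflow or dataset metadata, in the same place where external files keep their pointer.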

Currently mounting is only supported in sessions, but the session mounts don't use the renku CLI for mounting (and probably never will). We would therefore need some temp file through which nb-service can tell the renku CLI which directory is a cloud mount and which storage id it corresponds to. For ad-hoc configured storage we don't have an id and would need to remember the cloud storage URL/config instead.
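One way the temp file could work, purely as a sketch: the session service writes a small JSON list of mounts and the CLI reads it to decide whether a path lives inside one. The file format, the key names (`mount_dir`, `storage_id`, `storage_url`) and the functions `load_mounts` and `find_mount` are assumptions for illustration; nothing like this exists in renku or nb-service today.

```python
import json
from dataclasses import dataclass
from pathlib import Path
from typing import List, Optional


@dataclass(frozen=True)
class Mount:
    """One entry of the hypothetical mount manifest."""

    mount_dir: Path
    storage_id: Optional[str]   # present for storage that has an id
    storage_url: Optional[str]  # present for ad-hoc storage without an id


def load_mounts(manifest_path: Path) -> List[Mount]:
    """Read the manifest that the session service would write (assumed JSON list)."""
    entries = json.loads(manifest_path.read_text())
    return [
        Mount(Path(entry["mount_dir"]), entry.get("storage_id"), entry.get("storage_url"))
        for entry in entries
    ]


def find_mount(path: Path, mounts: List[Mount]) -> Optional[Mount]:
    """Return the mount containing ``path``, or None for a regular project file.

    Paths are compared as given; a real implementation would likely
    resolve symlinks and relative paths first.
    """
    for mount in mounts:
        if path == mount.mount_dir or mount.mount_dir in path.parents:
            return mount
    return None
```

With something like this, renku run could call `find_mount` on each input path and record the storage id or URL alongside the file.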

I'd try to come up with a design for this first, timeboxed, before implementing anything, as it could be rather complex.

@Panaetius
Member Author

Timeboxed to 3 days.

Datasets should not allow adding files from session-mounted storage. The command should error out and tell the user to add the files directly from a storage URL instead.
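A minimal sketch of that check, assuming the CLI already knows which directories are mounts and which storage URL each one maps to. `MountedStorageError` and `check_dataset_files` are hypothetical names, and the suggested URL is just the storage URL joined with the file's path inside the mount.

```python
from pathlib import Path
from typing import Dict, Iterable


class MountedStorageError(Exception):
    """Raised when a dataset file comes from session-mounted cloud storage."""


def check_dataset_files(files: Iterable[Path], mounts: Dict[Path, str]) -> None:
    """Reject files located inside a mount directory.

    ``mounts`` maps each mount directory to its storage URL (assumed input).
    """
    for file in files:
        for mount_dir, storage_url in mounts.items():
            if mount_dir in file.parents:
                relative = file.relative_to(mount_dir).as_posix()
                raise MountedStorageError(
                    f"{file} is inside mounted cloud storage; add it directly from "
                    f"{storage_url.rstrip('/')}/{relative} instead."
                )
```

Files outside any mount pass through unchanged, so the existing dataset add path would not be affected.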

@Panaetius Panaetius moved this from Backlog to Ready in renku-python Oct 11, 2023
Projects
Status: Ready