option to have mirrors downloading for external sources #6666
Replies: 3 comments
-
There is probably a better way to organize and formalize this, maybe even with existing functionality, but I can't think of one right now.
-
I do not think I should ask users of my open-source libraries to do that; the fewer steps I ask them to take, the better. It would be better just to be able to configure it, for instance in dvc.lock. If I have a huge file that I want to be downloaded from the source instead of the dvc remote (a very common use case; we have external files ranging from 0.5 to 20 GB), why not just have a flag/option in dvc.lock?
-
@antonkulaga Good point. How do you see that flag/option, btw?
-
For public datasets it is usually much faster to download from public sources (especially well-known ones, like image databases or reference genomes) than from remotes that you host on your own server. This is especially important for open-source projects, where you face a trade-off: either pay for the traffic of many gigabytes of downloads, or use free but very slow services like Google Drive, which also sometimes scare users with app permission requests.
What I suggest is to allow downloading from public sources instead of (or in parallel with) your remotes when the sources are available: if you have external imports in your project, it will check whether they are still available at the original source; if yes, it downloads from there, and if not, from your remote.
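To make the suggested behaviour concrete, here is a minimal sketch of the "original source first, remote as fallback" logic. It is not DVC's API: `fetch_with_fallback`, the `fetch` callable, and the location strings are hypothetical names used only for illustration. The checksum comparison is an extra assumption on my side, so that an upstream file that has changed since it was imported counts as "not available" and the remote copy is used instead.

```python
import hashlib
from typing import Callable, Sequence, Tuple

# Hypothetical fetcher type: takes a location, returns the file content,
# and raises OSError when the location cannot be reached.
Fetcher = Callable[[str], bytes]


def fetch_with_fallback(
    locations: Sequence[str], fetch: Fetcher, expected_md5: str
) -> Tuple[str, bytes]:
    """Try each location in order and return (location, data) for the first
    one that is reachable and whose content matches the expected checksum."""
    errors = []
    for location in locations:
        try:
            data = fetch(location)
        except OSError as exc:
            # Source unreachable: remember why and try the next location.
            errors.append(f"{location}: {exc}")
            continue
        if hashlib.md5(data).hexdigest() != expected_md5:
            # Source reachable but content differs from what was recorded.
            errors.append(f"{location}: checksum mismatch")
            continue
        return location, data
    raise OSError("no location succeeded: " + "; ".join(errors))
```

The list would be ordered as public source first, own remote last. In real use `fetch` could wrap `urllib.request.urlopen`, whose `URLError` is a subclass of `OSError`, so unreachable mirrors would fall through to the next location.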