Remove pushed experiment from git remote #6006

Closed
kwon-young opened this issue May 13, 2021 · 8 comments · Fixed by #6471
Labels: A: experiments (related to dvc exp); discussion (requires active participation to reach a conclusion); feature request (requesting a new feature); p1-important (important, aka current backlog of things to do)

Comments

@kwon-young
Contributor

Push an experiment to a git remote:

$ dvc exp push origin myexpe

Then:

$ dvc exp list origin
master:
   myexpe

Currently, only local experiments can be removed, with:

$ dvc exp remove myexpe

but dvc exp list origin still shows myexpe.

I would like to also be able to remove a pushed experiment on the remote.

@pmrowla added the labels "feature request" and "p2-medium" on May 14, 2021
@pmrowla
Contributor

pmrowla commented May 14, 2021

For now, you can use git to remove experiment refs directly with git push -d, the same way you remove remote branches:

# get list of exps pushed to origin
$ git ls-remote origin "refs/exps/*"

# remove one of the listed refs
$ git push -d origin refs/exps/path/to/ref
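
A minimal sketch of scripting the two commands above, in case there are many refs to go through. The helper names (parse_exp_refs, delete_command, remove_remote_exp) are hypothetical and not part of dvc. The sketch also assumes that the experiment name is the last component of the ref path, and the ref paths in the usage note are placeholders.

```python
import subprocess
from typing import List


def parse_exp_refs(ls_remote_output: str) -> List[str]:
    """Return the experiment ref names found in `git ls-remote` output.

    Each non-empty line is expected to hold a sha and a ref name
    separated by whitespace.
    """
    refs = []
    for line in ls_remote_output.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].startswith("refs/exps/"):
            refs.append(parts[1])
    return refs


def delete_command(remote: str, ref: str) -> List[str]:
    """Build the `git push -d` command that removes one remote ref."""
    return ["git", "push", "-d", remote, ref]


def remove_remote_exp(remote: str, name: str, dry_run: bool = True) -> List[List[str]]:
    """Find refs on `remote` whose last path component is `name`.

    Assumption: the experiment name is the final component of the ref.
    With dry_run=True the commands are only returned, not executed.
    """
    out = subprocess.run(
        ["git", "ls-remote", remote, "refs/exps/*"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    commands = [
        delete_command(remote, ref)
        for ref in parse_exp_refs(out)
        if ref.rsplit("/", 1)[-1] == name
    ]
    if not dry_run:
        for cmd in commands:
            subprocess.run(cmd, check=True)
    return commands
```

Calling remove_remote_exp("origin", "myexpe") returns the commands it would run without touching the remote; pass dry_run=False to actually delete the refs.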

@dberenbaum added the labels "A: experiments" and "discussion" on Jun 15, 2021
@dberenbaum
Collaborator

@pmrowla Do you think there's anything to discuss here, or is this a straightforward feature to implement?

@pmrowla
Contributor

pmrowla commented Jun 16, 2021

@dberenbaum it still needs a discussion of what the usage should look like

@dberenbaum
Collaborator

I think we actually need options for both dvc exp remove --git-remote (as mentioned above) and dvc exp gc --git-remote (to clean up lots of old experiments that aren't relevant anymore).

> @dberenbaum it still needs a discussion of what the usage should look like

Can you elaborate? Does the suggested syntax above help?

@dberenbaum added the label "p1-important" and removed "p2-medium" on Aug 12, 2021
@pmrowla
Contributor

pmrowla commented Aug 13, 2021

@dberenbaum My original comment was regarding whether it should be tied to exp remove/gc or whether it should look more like git's push -d workflow.

Also, I do think it's worth asking the question of "is removing remote experiments necessary"? There are reasons to disallow it - the same reasons that github always keeps all pull request refs and always keeps references to all remote git branches that have ever existed (including remote branches that users manually "delete" through either github's UI or the git CLI).

Should removing remote exps actually delete refs, or should it just hide them in the way that github "removes" git branches - they are just moved into a different namespace so they don't show up in git CLI "list remote branch" commands (so that they can be restored later if the user decides they didn't actually want to delete that branch)?

Or maybe what we really need is just a way to filter the results of exp list <git remote> in the first place, instead of adding any remote remove functionality at all?
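
To make the filtering idea concrete, here is a rough sketch of what hiding experiments by name pattern could look like. It is purely illustrative: filter_exp_names and its exclude parameter are hypothetical, and nothing in this thread says dvc exp list has such an option.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List


def filter_exp_names(names: Iterable[str], exclude: Iterable[str] = ()) -> List[str]:
    """Drop experiment names that match any of the shell-style patterns.

    Hypothetical helper: the refs stay on the remote, they are only
    hidden from the listing.
    """
    patterns = list(exclude)
    return [
        name
        for name in names
        if not any(fnmatchcase(name, pattern) for pattern in patterns)
    ]
```

For example, filter_exp_names(["myexpe", "ci-001"], exclude=["ci-*"]) keeps only "myexpe". Nothing is deleted, so a hidden experiment can be shown again by dropping the pattern.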

@pmrowla
Contributor

pmrowla commented Aug 13, 2021

Also, I know this came up in the CML auto-push discussion as well (since we will be generating a lot of experiments that way). Maybe the exps pushed by the CML runner should just go into their own namespace that can be regularly cleaned on the CML side.

@dberenbaum
Collaborator

> My original comment was regarding whether it should be tied to exp remove/gc or whether it should look more like git's push -d workflow.
>
> Also, I do think it's worth asking the question of "is removing remote experiments necessary"? There are reasons to disallow it - the same reasons that github always keeps all pull request refs and always keeps references to all remote git branches that have ever existed (including remote branches that users manually "delete" through either github's UI or the git CLI).

IMO we don't need to follow Git, since dvc exp provides lightweight tracking without the strict (but useful) constraints of version control. If I want full version control, I'll use dvc exp branch/apply and regular Git branches. Experiments are intentionally ephemeral, so removing them seems a natural and necessary feature; otherwise we are just duplicating existing Git behavior.

Similarly, I think extending the existing dvc commands for removing experiments seems more natural than trying to follow the git push -d syntax.

> Should removing remote exps actually delete refs, or should it just hide them in the way that github "removes" git branches - they are just moved into a different namespace so they don't show up in git CLI "list remote branch" commands (so that they can be restored later if the user decides they didn't actually want to delete that branch)?

Given the ephemeral nature of experiments, I think it's fine to delete them and maybe preferable to moving them. People might accidentally push a huge experiment and want to delete the ref so they can gc the repo and clean up hanging commits. That's probably a separate issue, but deleting the refs seems acceptable to me.

> Also I know this came up in the CML auto-push discussion as well (since we will be generating a lot of experiments that way). Maybe the CML runner pushed exps should just go into their own namespace that can be regularly cleaned on the CML side

Good idea. Either approach could be useful for CML, but that's not the primary driver for this feature.

We are starting to promote experiment sharing more in docs, courses, etc., and it seems like sharing experiments isn't ready to be used if we don't have some way to remove pushed experiments. Experiments should be easy to both create and delete so that users feel free to try lots of things and only keep what works, regardless of whether the experiments are local or shared.

@pmrowla
Contributor

pmrowla commented Aug 17, 2021

Regarding implementation: deleting remote refs is a git push -d operation. Internally, this needs to be done via

def push_refspec(

with src set to None
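
For context, in plain git a push refspec with an empty source side (":<dest>") deletes the destination ref, which is what git push -d does. The sketch below shows how a None source could map onto such a refspec. Only the name push_refspec and "src set to None" come from the comment above; build_refspec is a hypothetical helper, not dvc's actual code, and the ref path in the usage note is a placeholder.

```python
from typing import Optional


def build_refspec(src: Optional[str], dest: str, force: bool = False) -> str:
    """Format a git push refspec as "[+]<src>:<dest>".

    When src is None the source side is left empty, and git treats
    ":<dest>" as a request to delete <dest> on the remote.
    """
    prefix = "+" if force else ""
    return f"{prefix}{src or ''}:{dest}"
```

For example, build_refspec(None, "refs/exps/path/to/ref") gives ":refs/exps/path/to/ref", which can be passed to git push origin in place of -d.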

karajan1001 added a commit to karajan1001/dvc that referenced this issue Aug 23, 2021
fix iterative#6006
1. add a new argument `--git-remote` to `dvc exp remove`
2. add some tests for it
karajan1001 added a commit that referenced this issue Sep 5, 2021
* Clean up remotes's exps

fix #6006
1. add a new argument `--git-remote` to `dvc exp remove`
2. support remote a special remote exp
3. add tests for it
4. fix  #6421
5. add a test for #6421.
6. extract some functions from the `dvc pull`, `dvc push`, `dvc remove` to utils.
7. add a tests for this new util function `resolve_exp_ref`
8. add `__init__.py` to some test package for the pytest fail

Co-authored-by: Jorge Orpinel <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Dave Berenbaum <[email protected]>
4 participants