Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Push running checkpoint to remote #6332

Merged
merged 24 commits into from
Aug 12, 2021
Merged
Show file tree
Hide file tree
Changes from 10 commits
Commits
Show all changes
24 commits
Select commit Hold shift + click to select a range
14175c8
Push running checkpoint to remote
karajan1001 Jul 19, 2021
770e5bc
Update dvc/repo/experiments/executor/base.py
karajan1001 Jul 20, 2021
15036f3
Get the branch name
karajan1001 Jul 21, 2021
205ce57
Finish this PR
karajan1001 Jul 26, 2021
b6f147a
Update after finally commit
karajan1001 Jul 27, 2021
65eb8d6
Some problems found in review
karajan1001 Jul 27, 2021
5077862
Update dvc/repo/experiments/executor/base.py
karajan1001 Jul 27, 2021
dd7b380
add remote check before auto push
karajan1001 Aug 2, 2021
92d9b07
Split current one env into two
karajan1001 Aug 4, 2021
7c271b3
Rename DVC_EXP_CHECKPOINT_PUSH to DVC_EXP_AUTO_PUSH
karajan1001 Aug 5, 2021
34f4da3
value error
karajan1001 Aug 5, 2021
37d2150
Remove getenv_bool switch to env2bool
karajan1001 Aug 6, 2021
0cfd8d1
Better on handling dulwich exceptions
karajan1001 Aug 6, 2021
ce7cd7d
Name changing removing prints
karajan1001 Aug 6, 2021
e666d2e
Update dvc/env.py
karajan1001 Aug 9, 2021
50d419f
Some changes in review
karajan1001 Aug 9, 2021
5cf541c
Add validation to repo url
karajan1001 Aug 10, 2021
dfcc946
Add new type of SCMError
karajan1001 Aug 10, 2021
060ec01
Update dvc/repo/experiments/executor/base.py
karajan1001 Aug 10, 2021
ffb657a
Use spy tests call counts and args
karajan1001 Aug 10, 2021
41aeaaf
Update dvc/scm/git/backend/base.py
karajan1001 Aug 11, 2021
488c7f3
Update dvc/scm/git/backend/dulwich.py
karajan1001 Aug 11, 2021
4a16226
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 11, 2021
32677a2
Remove validate_git_repo to validate_git_remote
karajan1001 Aug 11, 2021
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions dvc/env.py
Original file line number Diff line number Diff line change
Expand Up @@ -7,3 +7,5 @@
DVCLIVE_HTML = "DVCLIVE_HTML"
DVCLIVE_RESUME = "DVCLIVE_RESUME"
DVC_IGNORE_ISATTY = "DVC_IGNORE_ISATTY"
DVC_EXP_GIT_REMOTE = "DVC_EXP_GIT_REMOTE"
DVC_EXP_AUTO_PUSH = "DVC_EXP_CHECKPOINT_PUSH"
karajan1001 marked this conversation as resolved.
Show resolved Hide resolved
48 changes: 46 additions & 2 deletions dvc/repo/experiments/executor/base.py
Original file line number Diff line number Diff line change
Expand Up @@ -16,8 +16,10 @@

from funcy import cached_property

from dvc.env import DVC_EXP_AUTO_PUSH, DVC_EXP_GIT_REMOTE
from dvc.exceptions import DvcException
from dvc.path_info import PathInfo
from dvc.repo import Repo
from dvc.repo.experiments.base import (
EXEC_BASELINE,
EXEC_BRANCH,
Expand All @@ -36,7 +38,7 @@
from dvc.stage import PipelineStage
from dvc.stage.monitor import CheckpointKilledError
from dvc.stage.serialize import to_lockfile
from dvc.utils import dict_sha256
from dvc.utils import dict_sha256, getenv_bool
from dvc.utils.fs import remove

if TYPE_CHECKING:
Expand Down Expand Up @@ -248,6 +250,28 @@ def on_diverged_ref(orig_ref: str, new_rev: str):
)
return refs

@classmethod
def _auto_push_check(cls, dvc: "Repo"):
karajan1001 marked this conversation as resolved.
Show resolved Hide resolved
git_remote = os.getenv(DVC_EXP_GIT_REMOTE, None)
if git_remote == dvc.root_dir:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason to have this explicit check? Would this qualify as a valid git remote? Is there some reason to think a user might set DVC_EXP_GIT_REMOTE to their local project's root dir?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, why fail for an invalid Git remote but only warn for this scenario?

Copy link
Contributor

@pmrowla pmrowla Aug 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

git_remote is just a variable that is either a path/URL to a git repo (including local ones) or a remote name. If this check is hit it means it's a local path that points directly to the current DVC root directory, which in most cases is a valid git repo path.

This behavior can be useful - if you do DVC_EXP_GIT_REMOTE=. the end result would be that we auto-push DVC cache for your local experiment runs. (The git push step becomes a no-op, the same thing that happens if you do a CLI git push to your own current repo directory)

logger.warning(
f"'{git_remote}' points to the current Git repo, experiment "
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

                f"'{git_remote}' points to the current Git repo, experiment "

I'm not sure this is helpful since a valid remote would also point to the current Git repo, right? Maybe something like "local workspace" makes more sense than "current Git repo"?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The local workspace is your current Git repo. https://github.com/my/repo.git is not your current Git repo - it's a remote Git repo, and potentially where your current Git repo is cloned from, but "current repo" would always be your current local workspace repo

"Git refs will not be pushed. But DVC cache and run cache "
"will automatically be pushed to the default DVC remote "
"(if any) on each experiment commit."
)
try:
for ref in dvc.scm.iter_remote_refs(
git_remote, base=EXPS_NAMESPACE
):
if ref:
break
except BaseException as e:
print(e)
raise e
karajan1001 marked this conversation as resolved.
Show resolved Hide resolved

dvc.cloud.get_remote_odb()

@classmethod
def reproduce(
cls,
Expand Down Expand Up @@ -292,6 +316,9 @@ def filter_pipeline(stages):
log_errors,
**kwargs,
) as dvc:
if getenv_bool(DVC_EXP_AUTO_PUSH):
cls._auto_push_check(dvc)

args, kwargs = cls._repro_args(dvc)
if args:
targets: Optional[Union[list, str]] = args[0]
Expand Down Expand Up @@ -331,6 +358,7 @@ def filter_pipeline(stages):

checkpoint_func = partial(
cls.checkpoint_callback,
dvc,
dvc.scm,
name,
repro_force or checkpoint_reset,
Expand Down Expand Up @@ -361,6 +389,8 @@ def filter_pipeline(stages):
force=repro_force,
checkpoint=is_checkpoint,
)
if getenv_bool(DVC_EXP_AUTO_PUSH):
cls._auto_push(dvc, dvc.scm)
except UnchangedExperimentError:
pass
ref = dvc.scm.get_ref(EXEC_BRANCH, follow=False)
Expand Down Expand Up @@ -393,7 +423,6 @@ def _repro_dvc(
git_url: Optional[str] = None,
**kwargs,
):
from dvc.repo import Repo
from dvc.utils.serialize import modify_yaml

dvc = Repo(dvc_dir)
Expand Down Expand Up @@ -450,9 +479,21 @@ def _repro_args(cls, dvc):
kwargs = {}
return args, kwargs

@staticmethod
def _auto_push(dvc: "Repo", scm: "Git"):
git_remote = os.getenv(DVC_EXP_GIT_REMOTE)
karajan1001 marked this conversation as resolved.
Show resolved Hide resolved
branch = scm.get_ref(EXEC_BRANCH, follow=False)
dvc.experiments.push(
git_remote,
branch,
push_cache=True,
run_cache=True,
)

@classmethod
def checkpoint_callback(
cls,
dvc: "Repo",
scm: "Git",
name: Optional[str],
force: bool,
Expand All @@ -464,6 +505,9 @@ def checkpoint_callback(
exp_rev = cls.commit(
scm, exp_hash, exp_name=name, force=force, checkpoint=True
)

if getenv_bool(DVC_EXP_AUTO_PUSH):
cls._auto_push(dvc, scm)
logger.info("Checkpoint experiment iteration '%s'.", exp_rev[:7])
except UnchangedExperimentError:
pass
Expand Down
13 changes: 10 additions & 3 deletions dvc/repo/experiments/push.py
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,7 @@

from dvc.exceptions import DvcException, InvalidArgumentError
from dvc.repo import locked
from dvc.repo.experiments.base import ExpRefInfo
from dvc.repo.scm_context import scm_context

from .utils import exp_commits, exp_refs_by_name
Expand All @@ -12,7 +13,13 @@
@locked
@scm_context
def push(
repo, git_remote, exp_name, *args, force=False, push_cache=False, **kwargs
repo,
git_remote,
exp_name: str,
*args,
force=False,
push_cache=False,
**kwargs,
):
exp_ref = _get_exp_ref(repo, exp_name)

Expand All @@ -35,9 +42,9 @@ def on_diverged(refname: str, rev: str) -> bool:
_push_cache(repo, exp_ref, **kwargs)


def _get_exp_ref(repo, exp_name):
def _get_exp_ref(repo, exp_name: str) -> ExpRefInfo:
if exp_name.startswith("refs/"):
return exp_name
return ExpRefInfo.from_ref(exp_name)

exp_refs = list(exp_refs_by_name(repo.scm, exp_name))
if not exp_refs:
Expand Down
18 changes: 9 additions & 9 deletions dvc/scm/git/backend/dulwich.py
Original file line number Diff line number Diff line change
Expand Up @@ -353,20 +353,20 @@ def iter_remote_refs(self, url: str, base: Optional[str] = None):
try:
_remote, location = get_remote_repo(self.repo, url)
client, path = get_transport_and_path(location)
if base:
yield from (
os.fsdecode(ref)
for ref in client.get_refs(path)
if ref.startswith(os.fsencode(base))
)
else:
yield from (os.fsdecode(ref) for ref in client.get_refs(path))

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this code block moved?

client.get_refs() failing is not always going to be due to an invalid URL. get_transport_and_path does validation on whether or not the specified git remote exists, or whether or not an explicit URL is a valid git URL.

I get that you are trying to validate whether or not the URL is reachable before the experiment is run, but I don't think that is necessary at all. And even if you want to validate the URL by doing an scm.iter_refs call, you still don't need to move this code block to do it.

(and if client.get_refs() fails, it would be better to pass the actual exception for why it failed up the chain, instead of just assuming it was due to an invalid URL.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get that you are trying to validate whether or not the URL is reachable before the experiment is run, but I don't think that is necessary at all.

Because the first push will happen after one iteration, stopping at that point will leave a dirty workspace.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, but that's not any different than any other failure that may occur during an experiment run, or during dvc repro.

The user's pipeline code itself could fail, or the dvc push could fail due to any number of reasons, whether it's because of a network error, or because the remote config is actually invalid. (We don't validate being able to connect to a DVC remote, all we do is validate that a default remote exists in the DVC config).

All of these cases will result in a "dirty" workspace, which is fine.

Copy link
Contributor Author

@karajan1001 karajan1001 Aug 6, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, users' pipeline code could fail, but they will not consider it to be a problem for us. And in some cases, training 1 epoch would cost hours or even days of time. And it will be really frustrating to get an error after training for such a long time.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My point is that we are not checking for other things that can fail in DVC, after the epoch has been generated.

Just as an example, we do not verify whether or not it is possible to actually push the cache data to the default DVC remote. If I have mis-configured my DVC remote using either a URL that doesn't exist, or with the wrong authentication credentials, we will fail after we have generated a commit and try to auto-push it.

If we are going to do this kind of strict validation on the git remote, then we should also be doing it on the DVC remote. But in my opinion, both of these checks would be excessive.


I also still do not think we should be failing at all in the event that the automatic push fails. We should wrap any exceptions that occur when we try to push, and log them if it fails, but continue running the pipeline. If the user has misconfigured the git URL, they can just do a manual dvc exp push <correct_git_url> ... after execution has completed, rather than needing to re-run an entire experiment.

But this is maybe a question for @dberenbaum

(And yes, this manual push isn't possible in the CML use case, but in the CML use case the key differences are that the user has probably already tested their code locally and that the upstream git URL or remote should be coming from the CI provider itself (i.e. from the github/gitlab environment))

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks the changes are outdated, so it seems like some of the specifics have changed, and I'll focus on the general user experience.

I think there was a previous question from @karajan1001 about whether to validate the url before running the experiment. This validation seems useful and I see little downside from a user perspective. Since dvc knows the experiment will be pushed, it would be helpful to avoid issues before spending time to run the experiment. I also think it makes sense to throw an error if validation fails since auto pushing is specifically for cml and similar workflows, where leaving a dirty workspace isn't helpful.

It would actually seem useful to have some basic validation of both the git and dvc remotes. As @pmrowla mentioned, in cml the DVC_EXP_GIT_REMOTE is unlikely to be a problem since it's being set automatically. However, the dvc remote is configured by the user, and it will be problematic if the auto push fails in cml.

Feel free to push back with reasons to avoid this validation or with questions of how strictly to validate.

Copy link
Contributor

@pmrowla pmrowla Aug 7, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there was a previous question from @karajan1001 about whether to validate the url before running the experiment. This validation seems useful and I see little downside from a user perspective.

There are different possible levels of "validation". IMO validating that a .git/config remote named origin exists or that the string [email protected]/my/repo.git is a valid git URL but s3://some_non_git_url is not valid is fine. But I don't think we have to bother opening a test connection to [email protected]/my/repo.git before trying to push to it.

This is the exact same level of @karajan1001 's current validation for the DVC remote. We check that a default DVC remote exists in the repo config, but we do not do anything to verify that the remote URL is reachable, or that we have the required list/write/move/delete permissions to actually push to it.

IMO this is fine. But trying to connect to each remote before we actually run anything is overkill, and in the event that it fails we can handle it at that time.

Since dvc knows the experiment will be pushed, it would be helpful to avoid issues before spending time to run the experiment. I also think it makes sense to throw an error if validation fails since auto pushing is specifically for cml and similar workflows, where leaving a dirty workspace isn't helpful.

I don't think that an automatic push errors in general should be considered a failure state for the entire workflow. A push failure is an error that the user can recover from by just re-trying dvc exp push again after their experiment has completed.

So in the event that github or aws goes down, or that a gdrive remote gets API throttled, or any other kind of possible network error occurs (including a remote being unreachable due to a user misconfiguration), we should just log that "automatic push failed with error: ..." and continue running the pipeline.

However, the dvc remote is configured by the user, and it will be problematic if the auto push fails in cml.

In this case, users should be testing their remote configurations themselves locally before they git commit things into their .dvc/config.

And in the event that they misconfigure it, we should just warn them, and continue running the experiment (for the same reasons I already mentioned)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the end of the day I guess this is a product decision, and if we want to get into creating a test file and dvc pushing and then removing it, obviously we can do that. But I think it's overkill and an unnecessary extra step.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are different possible levels of "validation". IMO validating that a .git/config remote named origin exists or that the string [email protected]/my/repo.git is a valid git URL but s3://some_non_git_url is not valid is fine. But I don't think we have to bother opening a test connection to [email protected]/my/repo.git before trying to push to it.

How can we check that it's a valid git url? I agree that limiting the validation makes sense since it might not make sense to fail for random timeouts that have nothing to do with the remote configuration. Ideally, it seems like a retry might be helpful, but that's out of scope.

I don't think that an automatic push errors in general should be considered a failure state for the entire workflow. A push failure is an error that the user can recover from by just re-trying dvc exp push again after their experiment has completed.

I think you mentioned before how this isn't helpful in the cml context since you won't be able to access the runner and debug. Are you thinking that, if there's something like a temporary network issue, then the next checkpoint might succeed, so the workflow should continue? Or are you thinking of other scenarios like remote executors where it will be possible to debug?

If there is a temporary network error for a single checkpoint push in the middle of the workflow, failing could be bad for the user since they might not be watching a long-running job and might return later to unexpectedly find that the job did not run to completion. That's why I agree that limiting validation makes sense, and I also agree that auto push errors should warn rather than interrupt the entire job.

However, in the initial validation, failing fast seems like a good user experience since they can debug quickly and resubmit their job. For the initial validation, I think it makes sense to do what we can to check if everything is configured correctly and working as expected and fail if not.

Since this feature is for @iterative/cml, it would be good to get their feedback on this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How can we check that it's a valid git url?

All of the git backends we use have calls that can do this check (to verify that either a string remote name like origin exists, or that a URL conforms to what git expects a git URL to look like)

Basically we would just have to make this a standalone scm call:

try:
_remote, location = get_remote_repo(self.repo, url)
client, path = get_transport_and_path(location)
except Exception as exc:
raise SCMError(
f"'{url}' is not a valid Git remote or URL"
) from exc

I think you mentioned before how this isn't helpful in the cml context since you won't be able to access the runner and debug. Are you thinking that, if there's something like a temporary network issue, then the next checkpoint might succeed, so the workflow should continue? Or are you thinking of other scenarios like remote executors where it will be possible to debug?

Depending on the issue, yes it's entirely possible that after one exp commit fails to be pushed in CML, the subsequent push could succeed. But really I was just thinking of scenarios where users set up automatic pushing in general, and not necessarily in CI (this becomes relevant for local exp runs in the event that support for this gets added into dvclive)

except Exception as exc:
raise SCMError(
f"'{url}' is not a valid Git remote or URL"
) from exc

if base:
yield from (
os.fsdecode(ref)
for ref in client.get_refs(path)
if ref.startswith(os.fsencode(base))
)
else:
yield from (os.fsdecode(ref) for ref in client.get_refs(path))

def get_refs_containing(self, rev: str, pattern: Optional[str] = None):
raise NotImplementedError

Expand Down
9 changes: 9 additions & 0 deletions dvc/utils/__init__.py
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
"""Helpers for other modules."""

import ast
import hashlib
import json
import logging
Expand Down Expand Up @@ -512,3 +513,11 @@ def errored_revisions(rev_data: Dict) -> List:
if nested_contains(data, "error"):
result.append(revision)
return result


def getenv_bool(name: str, default: str = "False"):
raw = os.getenv(name, default).title()
try:
return bool(ast.literal_eval(raw))
except ValueError:
return False
karajan1001 marked this conversation as resolved.
Show resolved Hide resolved
18 changes: 18 additions & 0 deletions tests/func/experiments/conftest.py
Original file line number Diff line number Diff line change
Expand Up @@ -88,3 +88,21 @@ def checkpoint_stage(tmp_dir, scm, dvc, mocker):
scm.commit("init")
stage.iterations = DEFAULT_ITERATIONS
return stage


@pytest.fixture
def git_upstream(tmp_dir, erepo_dir):
url = "file://{}".format(erepo_dir.resolve().as_posix())
tmp_dir.scm.gitpython.repo.create_remote("upstream", url)
erepo_dir.remote = "upstream"
erepo_dir.url = url
return erepo_dir


@pytest.fixture
def git_downstream(tmp_dir, erepo_dir):
url = "file://{}".format(tmp_dir.resolve().as_posix())
erepo_dir.scm.gitpython.repo.create_remote("upstream", url)
erepo_dir.remote = "upstream"
erepo_dir.url = url
return erepo_dir
106 changes: 106 additions & 0 deletions tests/func/experiments/test_checkpoints.py
Original file line number Diff line number Diff line change
@@ -1,9 +1,15 @@
import logging

import pytest
from funcy import first

from dvc.config import NoRemoteError
from dvc.env import DVC_EXP_AUTO_PUSH, DVC_EXP_GIT_REMOTE
from dvc.exceptions import DvcException
from dvc.repo.experiments import MultipleBranchError
from dvc.repo.experiments.base import EXEC_APPLY, EXEC_CHECKPOINT
from dvc.repo.experiments.utils import exp_refs_by_rev
from dvc.scm.base import SCMError


@pytest.mark.parametrize("workspace", [True, False])
Expand Down Expand Up @@ -188,3 +194,103 @@ def test_resume_non_head_checkpoint(
)
new_head = first(results)
assert orig_branch != dvc.experiments.get_branch_by_rev(new_head)


@pytest.fixture
def clear_env(monkeypatch):
yield
monkeypatch.delenv(DVC_EXP_GIT_REMOTE, raising=False)
monkeypatch.delenv(DVC_EXP_AUTO_PUSH, raising=False)


@pytest.mark.parametrize("use_url", [True, False])
def test_auto_push_during_iterations(
karajan1001 marked this conversation as resolved.
Show resolved Hide resolved
tmp_dir,
scm,
dvc,
checkpoint_stage,
git_upstream,
local_remote,
use_url,
monkeypatch,
clear_env,
):
# set up remote repo
remote = git_upstream.url if use_url else git_upstream.remote
git_upstream.scm.fetch_refspecs(str(tmp_dir), ["master:master"])
monkeypatch.setenv(DVC_EXP_GIT_REMOTE, remote)

# without auto push
results = dvc.experiments.run(checkpoint_stage.addressing)
exp = first(results)
ref_info = first(exp_refs_by_rev(scm, exp))
assert git_upstream.scm.get_ref(str(ref_info)) is None

# add auto push
monkeypatch.setenv(DVC_EXP_AUTO_PUSH, "true")
results = dvc.experiments.run(checkpoint_stage.addressing)
assert (tmp_dir / "foo").read_text() == "4"
casperdcl marked this conversation as resolved.
Show resolved Hide resolved
exp = first(results)
ref_info = first(exp_refs_by_rev(scm, exp))
assert git_upstream.scm.get_ref(str(ref_info)) == exp

# check the data
with git_upstream.dvc.config.edit() as conf:
conf["remote"]["local"] = local_remote.config
conf["core"]["remote"] = "local"

git_upstream.dvc.experiments.apply(ref_info.name)
git_upstream.dvc.experiments.apply(exp)
git_upstream.dvc.pull()
assert (git_upstream / "foo").read_text() == "4"

# resume the remote checkpoint
monkeypatch.delenv(DVC_EXP_AUTO_PUSH, raising=False)
with git_upstream.chdir():
git_upstream.dvc.experiments.run(checkpoint_stage.addressing)
assert (git_upstream / "foo").read_text() == "6"
karajan1001 marked this conversation as resolved.
Show resolved Hide resolved


def test_auto_push_error_url(
dvc, scm, checkpoint_stage, local_remote, monkeypatch, clear_env
):
monkeypatch.setenv(DVC_EXP_GIT_REMOTE, "true")
monkeypatch.setenv(DVC_EXP_AUTO_PUSH, "true")
with pytest.raises(SCMError):
dvc.experiments.run(checkpoint_stage.addressing, params=["foo=2"])


def test_auto_push_no_remote(
dvc, scm, checkpoint_stage, git_upstream, monkeypatch, clear_env
):
monkeypatch.setenv(DVC_EXP_GIT_REMOTE, git_upstream.url)
monkeypatch.setenv(DVC_EXP_AUTO_PUSH, "true")
with pytest.raises(NoRemoteError):
dvc.experiments.run(checkpoint_stage.addressing, params=["foo=2"])


def test_auto_push_self_remote(
tmp_dir,
dvc,
scm,
checkpoint_stage,
local_remote,
caplog,
monkeypatch,
clear_env,
):
root_dir = str(tmp_dir)
monkeypatch.setenv(DVC_EXP_GIT_REMOTE, root_dir)
monkeypatch.setenv(DVC_EXP_AUTO_PUSH, "true")
assert (
dvc.experiments.run(checkpoint_stage.addressing, params=["foo=2"])
!= {}
)

with caplog.at_level(logging.WARNING, logger="dvc.repo.experiments"):
assert (
f"'{root_dir}' points to the current Git repo, experiment "
"Git refs will not be pushed. But DVC cache and run cache will "
"automatically be pushed to the default DVC remote (if any) "
"on each experiment commit." in caplog.text
)
18 changes: 0 additions & 18 deletions tests/func/experiments/test_remote.py
Original file line number Diff line number Diff line change
Expand Up @@ -7,24 +7,6 @@
from dvc.repo.experiments.utils import exp_refs_by_rev


@pytest.fixture
def git_upstream(tmp_dir, erepo_dir):
url = f"file://{erepo_dir.resolve().as_posix()}"
tmp_dir.scm.gitpython.repo.create_remote("upstream", url)
erepo_dir.remote = "upstream"
erepo_dir.url = url
return erepo_dir


@pytest.fixture
def git_downstream(tmp_dir, erepo_dir):
url = f"file://{tmp_dir.resolve().as_posix()}"
erepo_dir.scm.gitpython.repo.create_remote("upstream", url)
erepo_dir.remote = "upstream"
erepo_dir.url = url
return erepo_dir


@pytest.mark.parametrize("use_url", [True, False])
def test_push(tmp_dir, scm, dvc, git_upstream, exp_stage, use_url):
from dvc.exceptions import InvalidArgumentError
Expand Down
Loading