diff --git a/content/docs/command-reference/exp/run.md b/content/docs/command-reference/exp/run.md index d103dcd247..5672595281 100644 --- a/content/docs/command-reference/exp/run.md +++ b/content/docs/command-reference/exp/run.md @@ -1,7 +1,8 @@ # exp run -Run or resume a -[DVC Experiment](/doc/user-guide/experiment-management/experiments-overview). +Run or resume a [DVC experiment]. + +[dvc experiment]: /doc/user-guide/experiment-management/experiments-overview ## Synopsis @@ -38,9 +39,8 @@ Use the `--set-param` (`-S`) option as a shortcut to change parameter values [on-the-fly] before running the experiment. It's possible to [queue experiments] for later execution with the `--queue` -flag. Queued experiments can be run using `dvc queue start`, refer to the -`dvc queue` documentation for more information on managing the experiment task -queue. +flag. Queued experiments can be run with `dvc queue start` and managed with +other `dvc queue` commands. It's also possible to run special [checkpoint experiments] that log the execution progress (useful for deep learning ML). The `--rev` and `--reset` @@ -102,10 +102,6 @@ committing them to the Git repo. Unnecessary ones can be [cleared] with workspace (in `.dvc/tmp/exps`). Use `-j` to execute them [in parallel](#queueing-and-parallel-execution). -- `-j `, `--jobs ` - run this `number` of queued experiments in - parallel. Only has an effect along with `--run-all`. Defaults to 1 (the queue - is processed serially). - `dvc exp run --run-all [--jobs]` is now a shortcut for @@ -114,6 +110,10 @@ committing them to the Git repo. Unnecessary ones can be [cleared] with +- `-j `, `--jobs ` - run this `number` of queued experiments in + parallel. Only has an effect along with `--run-all`. Defaults to 1 (the queue + is processed serially). + - `-r `, `--rev ` - resume an experiment from a specific checkpoint name or hash (`commit`) in `--queue` or `--temp` runs. 
diff --git a/content/docs/command-reference/queue/index.md b/content/docs/command-reference/queue/index.md index 27e3e7624a..25748c8dea 100644 --- a/content/docs/command-reference/queue/index.md +++ b/content/docs/command-reference/queue/index.md @@ -1,14 +1,15 @@ # queue -A set of commands to manage the -[DVC experiments](/doc/user-guide/experiment-management/experiments-overview) -task queue: [start](/doc/command-reference/queue/start), +A set of commands to manage the [DVC experiments] task queue: +[start](/doc/command-reference/queue/start), [stop](/doc/command-reference/queue/stop), [status](/doc/command-reference/queue/status), [logs](/doc/command-reference/queue/logs), [remove](/doc/command-reference/queue/remove), [kill](/doc/command-reference/queue/kill) +[dvc experiments]: /doc/user-guide/experiment-management/experiments-overview + ## Synopsis ```usage @@ -27,8 +28,13 @@ positional arguments: ## Description -`dvc queue` subcommands provide specialized ways to manage queued experiment -tasks. +You can use `dvc exp run --queue` to queue ML experiments. `dvc queue` provides +an interface to process and manage these queued tasks. + +📖 See [this guide] for more information on experiment queues. + +[this guide]: + /doc/user-guide/experiment-management/running-experiments#the-experiments-queue ## Options diff --git a/content/docs/command-reference/queue/kill.md b/content/docs/command-reference/queue/kill.md index f2fa0758cc..0f625fddf9 100644 --- a/content/docs/command-reference/queue/kill.md +++ b/content/docs/command-reference/queue/kill.md @@ -1,8 +1,8 @@ ## queue kill -Kill actively running -[DVC Experiment](/doc/user-guide/experiment-management/experiments-overview) -tasks. +Kill actively running [DVC experiment] tasks (see `dvc queue start`). 
+ +[dvc experiment]: /doc/user-guide/experiment-management/experiments-overview ## Synopsis @@ -15,23 +15,22 @@ positional arguments: ## Description -Forcefully stops execution of the specified (running) experiment tasks. Killed -tasks will be considered as failed runs. - -This command does not stop the queue worker process. After the specified task -has been killed, the worker process will consume and execute the next experiment -task in the queue. - -To kill all running experiment tasks and also stop queue processing, you can use -`dvc queue stop --kill`. +Forcefully stops execution of the specified (running) experiment tasks. -Note that killed experiment tasks will be considered failed runs and will not be +Note that killed experiments will be considered failed runs and will not be re-added to the queue for future execution. +This command does not stop the `dvc queue start` worker(s). After the specified +task has been killed, a worker will move on to process the next experiment task +in the queue. + +To kill all running experiments and also stop processing the queue, use +`dvc queue stop --kill`. + ## Options - `-h`, `--help` - prints the usage/help message, and exit. diff --git a/content/docs/command-reference/queue/logs.md b/content/docs/command-reference/queue/logs.md index 83e72ab509..6b70646348 100644 --- a/content/docs/command-reference/queue/logs.md +++ b/content/docs/command-reference/queue/logs.md @@ -1,8 +1,8 @@ ## queue logs -Show output logs for running and completed tasks in the -[DVC Experiment](/doc/user-guide/experiment-management/experiments-overview) -task queue. +Show console output logs for [DVC experiment] tasks (see `dvc queue start`). + +[dvc experiment]: /doc/user-guide/experiment-management/experiments-overview ## Synopsis @@ -15,15 +15,20 @@ positional arguments: ## Description -Shows output logs for the specified running or completed experiment task. 
+Shows the console output logs for the specified running or completed experiment +`task`. + +By default, this command will show any existing logs and then exit. For running +tasks, the `--follow` option can be used to attach to the task and show live +logs (until the task has completed). -By default, this command will show any available log data and then exit. For -tasks which are still running, the `--follow` option can be used to attach to -the task and continuously show live log output, until the task has completed. + -When using the `--follow` option, it is safe to stop following output using -`Ctrl+C` (or `SIGINT`). This will only cause the logs command to exit, and the -experiment task will continue to be run in the background. +It is safe to interrupt the `--follow` process with `Ctrl+C` (or `SIGINT`), for +example. This will only cause the `dvc queue logs` command to exit, and the +experiment will continue to run in the background. + + ## Options @@ -47,8 +52,6 @@ experiment task will continue to be run in the background. - `-v`, `--verbose` - displays detailed tracing information. -## Examples - ## Example: View logs for completed experiment tasks Let's say we have previously run some queued experiment tasks: diff --git a/content/docs/command-reference/queue/remove.md b/content/docs/command-reference/queue/remove.md index 9e799c7ba8..6671afb534 100644 --- a/content/docs/command-reference/queue/remove.md +++ b/content/docs/command-reference/queue/remove.md @@ -1,8 +1,10 @@ ## queue remove -Remove queued and completed tasks from the -[DVC Experiment](/doc/user-guide/experiment-management/experiments-overview) -task queue. +Remove non-active tasks from the [DVC experiment] queue. + +> See `dvc queue kill` to interrupt active ones. + +[dvc experiment]: /doc/user-guide/experiment-management/experiments-overview ## Synopsis @@ -17,15 +19,14 @@ positional arguments: ## Description -Removes the specified queued or completed experiment tasks from the queue. 
For -completed tasks, this will also remove any associated output logs. +Removes the specified queued or completed experiment `task`(s) from the queue. +For completed tasks, this will also remove any associated output logs. Note that for successfully completed tasks, this command is not the same as -`dvc exp remove`. `dvc queue remove` does not remove any Git or DVC data -associated with a successful DVC experiment. It only removes the task queue -entry and any associated output logs for that task. +`dvc exp remove`. `dvc queue remove` does not remove any data associated with +the experiment, only the queue entry and any output logs for that task. diff --git a/content/docs/command-reference/queue/start.md b/content/docs/command-reference/queue/start.md index da25959844..3cd081ccd4 100644 --- a/content/docs/command-reference/queue/start.md +++ b/content/docs/command-reference/queue/start.md @@ -1,8 +1,9 @@ ## queue start -Start the -[DVC experiments](/doc/user-guide/experiment-management/experiments-overview) -task queue worker process. +Start running all [queued experiments], possibly in parallel. + +[queued experiments]: + /doc/user-guide/experiment-management/running-experiments#the-experiments-queue ## Synopsis @@ -12,39 +13,34 @@ usage: dvc queue start [-h] [-q | -v] [-j ] ## Description -Starts one or more task queue worker processes. Each worker process will consume -and execute one queued experiment task at a time in the background, until either -`dvc queue stop` is used or the queue is empty. +Starts one or more workers (`--jobs`) to process experiments. Each worker will +consume and execute one queued experiment task at a time in the background, +until either `dvc queue stop` is used or the queue is empty. Due to [internal limitations], when the queue is empty a worker may be idle for up to 10 seconds before exiting. If new experiment tasks are added to the queue -during this time, the idle worker will resume processing them instead. 
+during this time, workers will resume processing them instead. [internal limitations]: /doc/user-guide/experiment-management/running-experiments#how-are-experiments-queued -Queued experiment tasks are run sequentially by default, but can be run in -parallel by using the `--jobs` option to start more than one worker. - - + -Parallel runs are experimental and may be unstable. Make sure you're using -number of jobs that your environment can handle (no more than the CPU cores). +Use `dvc queue kill` to stop specific experiments that are currently running. -Note that since queued experiments are run isolated from each other, common -stages may sometimes be executed several times depending on the state of the -run-cache at that time. +`dvc queue logs` lets you see the console output from any experiments run in +the background with this command (for example, for debugging). ## Options -- `-j `, `--` - start up to this number of workers in parallel. - Defaults to 1 (the task queue is processed serially). +- `-j `, `--jobs ` - run this `number` of queued experiments in + parallel. Defaults to 1 (the task queue is processed serially). diff --git a/content/docs/command-reference/queue/status.md b/content/docs/command-reference/queue/status.md index 37bd69282c..c1b73ae2bb 100644 --- a/content/docs/command-reference/queue/status.md +++ b/content/docs/command-reference/queue/status.md @@ -1,8 +1,8 @@ ## queue status -Show status of tasks and workers for the -[DVC Experiment](/doc/user-guide/experiment-management/experiments-overview) -task queue. +Show status of tasks and workers for the [DVC experiment] task queue. + +[dvc experiment]: /doc/user-guide/experiment-management/experiments-overview ## Synopsis @@ -12,8 +12,17 @@ usage: dvc queue status [-h] [-q | -v] ## Description -Shows status of queued and running tasks in the task queue, as well as status -for started queue worker processes. 
+Shows the status of queued and running experiments in the queue, as well as the +status of running workers (see `dvc queue start`). + +```dvc +$ dvc queue status +Task Name Created Status +753b005 04:01 PM Running +1ae8b65 04:01 PM Queued + +Worker status: 1 active, 0 idle +``` ## Options diff --git a/content/docs/command-reference/queue/stop.md b/content/docs/command-reference/queue/stop.md index 77981fbac7..12aee24914 100644 --- a/content/docs/command-reference/queue/stop.md +++ b/content/docs/command-reference/queue/stop.md @@ -1,8 +1,9 @@ ## queue stop -Stop the -[DVC experiments](/doc/user-guide/experiment-management/experiments-overview) -task queue worker process. +Stop running queued [DVC experiments] (see `dvc queue start`) after the current +ones are finished running. + +[dvc experiments]: /doc/user-guide/experiment-management/experiments-overview ## Synopsis @@ -12,22 +13,22 @@ usage: dvc queue stop [-h] [-q | -v] [--kill] ## Description -Stops all running task queue worker processes. Any queued experiment tasks which -have not been run will remain in the queue (to be executed the next time that -`dvc queue start` is run). +Signals DVC to stop all workers that are running queued experiments. -By default, DVC will wait for any experiment tasks which are currently running -to complete before gracefully stopping any queue workers. The `--kill` option -can be used to kill any currently running experiment tasks and stop the queue -workers immediately. +By default, DVC will wait for any experiments that are currently running to +complete before gracefully stopping workers. The `--kill` option can be used to +interrupt them instead and stop all workers immediately. -Note that killed experiment tasks will be considered failed runs and will not be +Note that killed experiments will be considered failed runs and will not be re-added to the queue for future execution. 
+Any queued experiment tasks which have not been processed will remain in the +queue (use `dvc queue start` again to resume processing them). + ## Options - `--kill` - kill any experiment tasks that are currently running and diff --git a/content/docs/user-guide/experiment-management/cleaning-experiments.md b/content/docs/user-guide/experiment-management/cleaning-experiments.md index e220cbc7a5..85dd6b20a8 100644 --- a/content/docs/user-guide/experiment-management/cleaning-experiments.md +++ b/content/docs/user-guide/experiment-management/cleaning-experiments.md @@ -210,9 +210,8 @@ Removed experiments: exp-bb09c ## Removing queued experiments -When you've created experiments to be run in the queue with -`dvc exp run --queue` and later decide not to run them, you can remove them with -`dvc exp remove --queue`. +When you've queued experiments with `dvc exp run --queue` and later decide not +to run them, you can remove them with `dvc exp remove --queue`. ```dvc $ dvc exp run --queue -S param=10 @@ -233,8 +232,6 @@ $ dvc exp show ───────────────────────────────────── ``` -You can delete these queued experiments with `dvc exp remove --queue`. - ```dvc $ dvc exp remove --queue $ dvc exp show diff --git a/content/docs/user-guide/experiment-management/running-experiments.md b/content/docs/user-guide/experiment-management/running-experiments.md index a3d810c484..998f1b36bc 100644 --- a/content/docs/user-guide/experiment-management/running-experiments.md +++ b/content/docs/user-guide/experiment-management/running-experiments.md @@ -1,8 +1,7 @@ # Running Experiments -We explain how to execute DVC Experiments, setting their parameters, using -multiple jobs to run them in parallel, and running them in queues, among other -details. +We explain how to execute DVC Experiments, setting their parameters, queueing +them for future execution, running them in parallel, among other details. 
> 📖 If this is the first time you are introduced into data science > experimentation, you may want to check the basics in @@ -138,9 +137,8 @@ $ dvc queue start -> Note that in most cases, experiment tasks will be executed in the order that -> they were added to the queue (First In, First Out), but this is not -> guaranteed. +In most cases, experiment tasks will be executed in the order that they were +added to the queue (First In, First Out), but this is not guaranteed. @@ -148,6 +146,21 @@ Their execution happens outside your workspace in temporary directories for isolation, so each experiment is derived from the workspace at the time it was queued. +Queued experiments are processed serially by default, but can be run in parallel +by passing `--jobs` greater than 1 to `dvc queue start` (more than one worker). + + + +Parallel runs (using `--jobs` > 1) are experimental and may be unstable. Make +sure you're using a number of jobs that your environment can handle (no more +than the number of CPU cores). + +Note that since queued experiments are run in isolation from each other, common +stages may be executed multiple times depending on the state of the +run-cache at that time. + + +
### How are experiments isolated? @@ -174,11 +187,11 @@ committing unwanted files into Git (e.g. once successful experiments are
-💡 To clear the experiments queue and start over, use -`dvc queue remove --queued`. + -> 📖 See the `dvc exp run` and `dvc queue` references for more options related -> to the experiments queue, such as running them in parallel with `--jobs`. +To clear the experiments queue and start over, use `dvc queue remove --queued`. + + ## Checkpoint experiments