diff --git a/content/docs/sidebar.json b/content/docs/sidebar.json
index bcc33aceb7..12d12a9e67 100644
--- a/content/docs/sidebar.json
+++ b/content/docs/sidebar.json
@@ -123,6 +123,10 @@
"slug": "data-management",
"source": "data-management/index.md",
"children": [
+ {
+ "label": "Track & Sync Versioned Data",
+ "slug": "track-sync-data"
+ },
"large-dataset-optimization",
"remote-storage",
"cloud-versioning",
diff --git a/content/docs/start/data-management/data-versioning.md b/content/docs/start/data-management/data-versioning.md
index 687b97496e..ad92aaae92 100644
--- a/content/docs/start/data-management/data-versioning.md
+++ b/content/docs/start/data-management/data-versioning.md
@@ -172,6 +172,9 @@ set up earlier. The remote storage directory should look like this:
└── a1a2931c8370d3aeedd7183606fd7f
```
+Learn more about
+[storage synchronization](/doc/user-guide/data-management/track-sync-data#synchronizing-data).
+
## Retrieving
diff --git a/content/docs/user-guide/data-management/cloud-versioning.md b/content/docs/user-guide/data-management/cloud-versioning.md
index 8278ea3b3d..ed46aea9a7 100644
--- a/content/docs/user-guide/data-management/cloud-versioning.md
+++ b/content/docs/user-guide/data-management/cloud-versioning.md
@@ -30,19 +30,22 @@ benefits of content-addressable storage.
### Expand for more details on the differences between cloud versioned and content-addressable storage
-`dvc remote` storage normally uses
-[content-addressable storage](/doc/user-guide/project-structure/internal-files#structure-of-the-cache-directory)
-to organize versioned data. Different versions of files are stored in the remote
-according to hash of their data content instead of according to their original
-filenames and directory location. This allows DVC to optimize certain remote
-storage lookup and data sync operations, and provides data de-duplication at the
-file level. However, this comes with the drawback of losing human-readable
-filenames without the use of the DVC CLI (`dvc get --show-url`) or API
-(`dvc.api.get_url()`).
+`dvc remote` storage normally uses [content-addressable storage] to organize
+versioned data. Different versions of files are stored in the remote according
+to a hash of their data contents instead of using their original filenames and
+directory location. This allows DVC to optimize certain remote storage lookup
+and [data sync operations], and provides data de-duplication at the file level.
+However, this comes with the drawback of losing human-readable filenames without
+the use of the DVC CLI (`dvc get --show-url`) or API (`dvc.api.get_url()`).
When using cloud versioning, DVC does not provide de-duplication, and certain
remote storage performance optimizations will be unavailable.
+[content-addressable storage]:
+ /doc/user-guide/project-structure/internal-files#structure-of-the-cache-directory
+[data sync operations]:
+ /doc/user-guide/data-management/track-sync-data#synchronizing-data
+
## Supported storage providers
diff --git a/content/docs/user-guide/data-management/index.md b/content/docs/user-guide/data-management/index.md
index f0fc87c7d3..1ca18c83a2 100644
--- a/content/docs/user-guide/data-management/index.md
+++ b/content/docs/user-guide/data-management/index.md
@@ -158,10 +158,10 @@ At the same time, it comes with many benefits:
- Your repository stays small and easy **collaborate** on (using
regular [Git workflows]).
- [Data versioning] guarantees ML **reproducibility**.
-- Use a **consistent interface** to access and sync data anywhere (via [CLI],
+- Use a **consistent interface** to access and [sync data] anywhere (via [CLI],
[API], [IDE], or [web]), regardless of the storage platform (S3, GDrive, NAS,
etc.).
-- Data **integrity** based on a Git-based storage; Data **security** through an
+- Data **integrity** based on Git-based storage; data **security** through an
authored project history that can be audited.
- Advanced features: [Data registries], [ML pipelines], [CI/CD for ML],
[productize] your ML models, and more!
@@ -171,6 +171,7 @@ At the same time, it comes with many benefits:
[git workflows]:
https://git-scm.com/book/en/v2/Distributed-Git-Distributed-Workflows
[data versioning]: /doc/use-cases/versioning-data-and-models
+[sync data]: /doc/user-guide/data-management/track-sync-data#synchronizing-data
[cli]: /doc/command-reference
[api]: /doc/api-reference
[ide]: /doc/vs-code-extension
diff --git a/content/docs/user-guide/data-management/remote-storage.md b/content/docs/user-guide/data-management/remote-storage.md
index a4457ff5e6..d2604441e1 100644
--- a/content/docs/user-guide/data-management/remote-storage.md
+++ b/content/docs/user-guide/data-management/remote-storage.md
@@ -20,11 +20,14 @@ wide variety of [storage types](#supported-storage-types).
The main uses of remote storage are:
-- Synchronize DVC-tracked data (previously cached).
+- [Synchronize] DVC-tracked data (previously cached).
- Centralize or distribute large file storage for sharing and collaboration.
- Back up different versions of your data and models.
- Save space in your working environment (by deleting pushed files/directories).
+[synchronize]:
+ /doc/user-guide/data-management/track-sync-data#synchronizing-data
+
## Configuration
You can set up one or more remote storage locations, mainly with the
diff --git a/content/docs/user-guide/data-management/track-sync-data.md b/content/docs/user-guide/data-management/track-sync-data.md
new file mode 100644
index 0000000000..a4c9e9361c
--- /dev/null
+++ b/content/docs/user-guide/data-management/track-sync-data.md
@@ -0,0 +1,164 @@
+# Track and Sync Versioned Data & Models
+
+The fundamental workflow of most DVC projects includes the
+following **basic operations**. These can be performed directly (as we cover
+here) but are sometimes included automatically in advanced workflows, like
+[pipelining] and [experiment management].
+
+[pipelining]: /doc/user-guide/pipelines
+[experiment management]: /doc/user-guide/experiment-management
+
+## Tracking data
+
+DVC is [similar to Git] here. To start tracking large files or directories (e.g.
+data or machine learning models), "add" them to DVC with the `dvc add` command.
+This caches the files and [links them] back to the
+workspace (hiding them from Git). A matching `.dvc` file is
+created.
+
+To capture changes to tracked data, `dvc add` them again (`dvc commit` will also
+do the trick). This caches the latest file contents and updates `.dvc` metafiles
+accordingly.
+
+[similar to git]:
+ https://git-scm.com/book/en/v2/Git-Basics-Recording-Changes-to-the-Repository
+[links them]: /doc/user-guide/data-management/large-dataset-optimization
+
+
+
+`.dvc` and other [metafiles] can be tracked (and [versioned](#versioning-data))
+with Git.
+
+[metafiles]: /doc/user-guide/project-structure
+
+
+
+If you need to move or rename tracked data, use `dvc move`. To stop tracking it,
+use `dvc remove`. To also remove it from the cache, use `dvc gc`. See [more
+details].
+
+To wrap up, you can get an overview of DVC-tracked assets with
+`dvc data status`. This will list changes to tracked files and directories as
+well as files unknown to DVC (or Git):
+
+```cli
+$ dvc data status
+Not in cache:
+ tmp/
+
+DVC committed changes:
+ added: data.xml
+ modified: data/features/
+
+DVC uncommitted changes:
+ deleted: model.pkl
+```
+
+[more details]: /doc/user-guide/how-to/stop-tracking-data
+
+
+
+Other related commands: `dvc status`, `dvc list`, `dvc import`,
+`dvc import-url`, `dvc unprotect`.
+
+
+
+## Synchronizing data
+
+DVC lets you [codify your data][data versioning] and ML models, configure the
+project's storage location(s), and stop worrying about low-level file operations
+like copying, moving, renaming, uploading, etc.
+
+At a minimum, you'll have one data store: the project's cache.
+[Data-tracking](#tracking-data) operations already keep it in sync with your
+workspace most of the time.
+
+
+
+`dvc commit` and `dvc checkout` let you force-sync the cache and workspace if
+needed, for example after unexpected errors (e.g. cache corruption).
+
+
+
+[data versioning]: /doc/use-cases/versioning-data-and-models
+
+To add storage locations to share and back up your work, you can configure [DVC
+remotes] using `dvc remote` commands (more on their [configuration]). Once this
+is done, use `dvc push` and `dvc pull` (among others) to transfer data between
+the project and remote storage.
+
+[dvc remotes]: /doc/user-guide/data-management/remote-storage
+[configuration]: /doc/user-guide/data-management/remote-storage#configuration
+
+![Sync ops among locations](/img/sync-ops-locations.png) _Data sync operations
+among locations_
+
+
+
+`dvc fetch` downloads files only halfway: from remote storage to the cache,
+without checking them out into the workspace. This is useful to make sure that
+some data is available for checkout later.
+
+
+
+A more advanced strategy is to access and synchronize data assets directly from
+various locations or other DVC projects (e.g. the [data registry] pattern). See
+`dvc list`, `dvc import`/`dvc import-url`, and `dvc update`, as well as the
+[Python API].
+
+[protected]: /doc/command-reference/unprotect
+[data registry]: /doc/use-cases/data-registry
+[python api]: /doc/api-reference
+
+## Versioning data
+
+Many `dvc` commands print hints about the `git` commands to follow them with.
+This helps you complete the [data versioning] side of the operation (if needed).
+
+![Versioning flow](/img/flow.png) _DVC metafiles represent your data and models
+in the Git repo, while large files are stored in the cache (and/or remote
+storage) and linked to your workspace._
+
+Some common sequences:
+
+- Check the `dvc data status` (or `dvc status`) before deciding what changes to
+ track with Git.
+- `dvc add` (or `dvc commit`) your data and then `git add` and `git commit` the
+ resulting DVC metafiles. This registers DVC-tracked files with Git indirectly
+ (without storing them in the Git repo).
+- After you `git push` project versions associated with new or changed data, you
+ may want to `dvc push` those data updates to a [DVC remote][dvc remotes].
+- `git checkout` to switch project versions (commits, branches, etc.) and then
+ `dvc checkout` to get the corresponding large files tracked by DVC into your
+ workspace.
+- `git clone` or `git pull` a DVC repository (e.g. to get others' contributions),
+ and then `dvc pull` the matching data files.
+
+
+
+Some of these are so common that DVC provides the `dvc install` helper command
+to set up [certain Git hooks] that automate them.
+
+[certain git hooks]: /doc/command-reference/install#installed-git-hooks
+
+
+
+Managing multiple versions of data or models (including their training
+parameters and performance metrics) with Git is great, but sometimes requires
+navigation aids. DVC provides comparison commands like `dvc diff` (similar to
+`git diff`) to help with this. See also `dvc params diff`, `dvc metrics diff`,
+and `dvc plots diff`.
+
+
+
+Another neat feature of some DVC commands is the `--rev` ([revision]) option.
+This lets you specify a version of the project to operate from. For example,
+`dvc import --rev a17b8fd` can import data associated with the source project
+commit `a17b8fd`. Other commands with `--rev`: `dvc gc`, `dvc list`, etc.
+
+
+
+[git branches]:
+ https://git-scm.com/book/en/v2/Git-Branching-Basic-Branching-and-Merging
+[tags]: https://git-scm.com/book/en/v2/Git-Basics-Tagging
+[revision]: https://git-scm.com/docs/revisions
diff --git a/content/docs/user-guide/experiment-management/sharing-experiments.md b/content/docs/user-guide/experiment-management/sharing-experiments.md
index edbaf27cad..a71b743138 100644
--- a/content/docs/user-guide/experiment-management/sharing-experiments.md
+++ b/content/docs/user-guide/experiment-management/sharing-experiments.md
@@ -4,7 +4,7 @@ In a regular Git workflow, DVC repository versions are typically
synchronized among team members. And [DVC Experiments] are internally connected
to this commit history, so you can similarly share them.
-## Basic workflow: store as peristent commits
+## Basic workflow: store as persistent commits
The most straightforward way to share experiments is to store them as
[persistent](/doc/user-guide/experiment-management/persisting-experiments) Git
diff --git a/static/img/sync-ops-locations.png b/static/img/sync-ops-locations.png
new file mode 100644
index 0000000000..c8573f36e2
Binary files /dev/null and b/static/img/sync-ops-locations.png differ