diff --git a/content/docs/sidebar.json b/content/docs/sidebar.json
index 9cf5e1b783..fe8bb25d0d 100644
--- a/content/docs/sidebar.json
+++ b/content/docs/sidebar.json
@@ -83,14 +83,9 @@
"source": "user-guide/index.md",
"children": [
{
- "slug": "what-is-dvc",
"label": "What is DVC?",
- "source": "what-is-dvc/index.md",
- "children": [
- "collaboration-issues",
- "core-features",
- "related-technologies"
- ]
+ "slug": "what-is-dvc",
+ "source": "what-is-dvc.md"
},
{
"label": "DVC Files and Directories",
@@ -134,13 +129,10 @@
"slug": "running-dvc-on-windows"
},
"troubleshooting",
+ "related-technologies",
{
"label": "Anonymized Usage Analytics",
"slug": "analytics"
- },
- {
- "label": "Privacy Policy (Google APIs)",
- "slug": "privacy"
}
]
},
diff --git a/content/docs/use-cases/index.md b/content/docs/use-cases/index.md
index 7640d75cb4..879a41f5d8 100644
--- a/content/docs/use-cases/index.md
+++ b/content/docs/use-cases/index.md
@@ -1,18 +1,26 @@
# Use Cases
We provide short articles on common ML workflow or data management scenarios
-that DVC can help with or improve. These include the motivating context (usually
-extracted from real-life cases); And the approaches to solving them can combine
-several features of DVC. Use cases are not written to be run end-to-end. For
-more general, hands-on experience with DVC, we recommend following the
-[Get Started](/doc/tutorials/get-started), and/or [Tutorials](/doc/tutorials)
-first.
+that DVC can help with or improve. These include a motivation (usually from
+real-life cases), and approaches which combine several features of DVC. Use
+cases are not written to be run end-to-end like tutorials. For more general,
+hands-on experience with DVC, please see our
+[Get Started](/doc/tutorials/get-started) instead.
> We keep reviewing our docs and will include interesting scenarios that surface
> in the community. Please, [contact us](/support) if you need help or have
> suggestions!
-## Basic uses
+## Why DVC?
+
+Even with all the success we've seen today in machine learning (ML), especially
+with deep learning and its applications in business, the data science community
+still lacks good practices for organizing their projects and collaborating
+effectively. This is a critical challenge: while ML algorithms and methods are
+no longer tribal knowledge, they are still difficult to implement, reuse, and
+manage.
+
+## Basic uses of DVC
If you store and process data files or datasets to produce other data or machine
learning models, and you want to
diff --git a/content/docs/use-cases/versioning-data-and-model-files/index.md b/content/docs/use-cases/versioning-data-and-model-files/index.md
index 4bede657db..4ef8955a59 100644
--- a/content/docs/use-cases/versioning-data-and-model-files/index.md
+++ b/content/docs/use-cases/versioning-data-and-model-files/index.md
@@ -14,13 +14,13 @@ This allows easily saving and sharing data alongside code.
![](/img/model-versioning-diagram.png)
-In this basic scenario, DVC is a better replacement for `git-lfs` (see
-[Related Technologies](/doc/understanding-dvc/related-technologies)) and for
-ad-hoc scripts on top of Amazon S3 (or any other cloud) used to manage ML
-data artifacts like raw data, models, etc. Unlike `git-lfs`, DVC
-doesn't require installing a dedicated server; It can be used on-premises (e.g.
-SSH, NAS) or with any major cloud storage provider (Amazon S3, Microsoft Azure
-Blob Storage, Google Drive, Google Cloud Storage, etc).
+In this basic scenario, DVC is a better replacement for Git-LFS (see
+[Related Technologies](/doc/user-guide/related-technologies)) and for ad-hoc
+scripts on top of Amazon S3 (or any other cloud) used to manage ML data
+artifacts like raw data, models, etc. Unlike Git-LFS, DVC doesn't require
+installing a dedicated server; it can be used on-premises (e.g. SSH, NAS) or
+with any major cloud storage provider (Amazon S3, Microsoft Azure Blob Storage,
+Google Drive, Google Cloud Storage, etc.).
Let's say you already have a Git repository and put a bunch of images in the
`images/` directory, and build a `model.pkl` ML model file using them.
diff --git a/content/docs/user-guide/index.md b/content/docs/user-guide/index.md
index e51926f40b..06e4116738 100644
--- a/content/docs/user-guide/index.md
+++ b/content/docs/user-guide/index.md
@@ -1,12 +1,12 @@
# User Guide
-Our guides describe the main DVC concepts and features comprehensively,
-explaining when and how to use them, as well as connections between them. These
-guides don't focus on specific scenarios, but have a general scope – like a user
-manual. Their topics range from more technical foundations, impacting more parts
-of DVC, to more advanced and specific things you can do. We also include a few
-guides related to contributing to
-[this open-source project](https://github.com/iterative/dvc).
+Our guides describe the major features and concepts of DVC comprehensively,
+explaining when and how to use them, as well as the relationships between them.
+We don't focus on specific scenarios in this section, but rather on a general
+scope. The topics here range from more foundational, impacting more parts of
+DVC, to more technical and advanced things you can do. We also include a few
+misc. guides, for example related to
+[contributing to DVC](/doc/user-guide/contributing/core) itself.
Please choose from the navigation sidebar to the left, or click the `Next`
button below ↘
diff --git a/content/docs/user-guide/related-technologies.md b/content/docs/user-guide/related-technologies.md
new file mode 100644
index 0000000000..7bbb8f70f4
--- /dev/null
+++ b/content/docs/user-guide/related-technologies.md
@@ -0,0 +1,133 @@
+# Comparison with Related Technologies
+
+DVC combines a number of existing ideas into a single tool, with the goal of
+bringing best practices from software engineering into the data science field
+(refer to [What is DVC?](/doc/user-guide/what-is-dvc) for more details).
+
+## Git
+
+- DVC builds upon Git by introducing the concept of data files – large files
+ that should not be stored in a Git repository, but still need to be tracked
+ and versioned. It leverages Git's features to enable managing different
+ versions of data itself, data pipelines, and experiments.
+
+- DVC is not fundamentally bound to Git, and can work without it (except for
+ versioning-related features). This also applies to Git-LFS and Git-annex,
+ below.
+
+## Git-LFS (Large File Storage)
+
+- Unlike Git-LFS, DVC does not require special servers. Any cloud storage
+ like S3, Google Cloud Storage, or even an SSH server can be used as a
+ [remote storage](/doc/command-reference/remote). No additional databases,
+ servers, or infrastructure are required.
+
+- DVC does not add any hooks to the Git repo by default (although they are
+ [available](/doc/command-reference/install)).
+
+- Git-LFS was not made with data science in mind, so it doesn't provide related
+ features (e.g. [pipelines](/doc/command-reference/dag),
+ [metrics](/doc/command-reference/metrics), etc.).
+
+- GitHub (the most common Git hosting service) has a limit of 2 GB per repository.
+
+## Git-annex
+
+- DVC can use reflinks\* or hardlinks (depending on the system) instead of
+ symlinks to improve performance and the user experience.
+
+- Git-annex is a datafile-centric system whereas DVC focuses on providing a
+ workflow for machine learning and reproducible experiments. When a DVC or
+ Git-annex repository is cloned via `git clone`, data files won't be copied to
+ the local machine, as file contents are stored in separate
+ [remotes](/doc/command-reference/remote). With DVC, however, `.dvc` files,
+ which provide the reproducible workflow, are always included in the Git
+ repository. Hence, they can be executed locally with minimal effort.
+
+- DVC optimizes file hash calculation.
+
+> \* **copy-on-write links or "reflinks"** are a relatively new way to link
+> files in UNIX-style file systems. Unlike hardlinks or symlinks, they support
+> transparent [copy on write](https://en.wikipedia.org/wiki/Copy-on-write). This
+> means that editing a reflinked file is always safe, as a separate copy is made
+> on write and the other links to the file are not affected.
+
+## Git workflows/methodologies such as Gitflow
+
+- DVC enables a new experimentation methodology that integrates easily with
+ existing Git workflows. For example, a separate branch can be created for each
+ experiment, with a subsequent merge of the branch if the experiment is
+ successful.
+
+- DVC innovates by giving users the ability to easily navigate through past
+ experiments without recomputing them each time.
+
+## Workflow management systems
+
+Pipelines and dependency graphs
+([DAG](https://en.wikipedia.org/wiki/Directed_acyclic_graph)) such as _Airflow_,
+_Luigi_, etc.
+
+- DVC is focused on data science and modeling. As a result, DVC pipelines are
+ lightweight and easy to create and modify. However, DVC lacks advanced
+ pipeline execution features like execution monitoring, error handling, and
+ recovery.
+
+- `dvc` is purely a command line tool without a graphical user interface (GUI)
+ and doesn't run any daemons or servers. Nevertheless, DVC can generate images
+ with pipeline and experiment workflow visualizations.
+
+- See also our sister project, [CML](https://cml.dev/), that helps fill some of
+ these gaps.
+
+## Experiment management software
+
+- DVC uses Git as the underlying layer for data, pipelines, and experiment
+ versioning, instead of a custom web application.
+
+- DVC doesn't need to run any services. There's no GUI as a result, but we
+ expect some GUI services will be created on top of DVC.
+
+- DVC can generate images with [experiment](/doc/start/experiments) workflow
+ visualizations.
+
+- DVC has a transparent design. Its
+ [internal files and directories](/doc/user-guide/dvc-files-and-directories)
+ have a human-readable format and can be easily reused by external tools.
+
+## Build automation tools
+
+[_Make_](https://www.gnu.org/software/make/) and others.
+
+- File tracking:
+
+ - DVC tracks files based on their hash values (MD5) instead of using
+ timestamps. This helps avoid running into heavy processes like model
+ retraining when you check out a previous version of the project (Make would
+ retrain the model).
+
+ - DVC uses file timestamps and inodes\* for optimization. This allows DVC to
+ avoid recomputing all dependency file hashes, which would be highly
+ problematic when working with large files (multiple GB).
+
+- DVC utilizes a
+ [directed acyclic graph](https://en.wikipedia.org/wiki/Directed_acyclic_graph)
+ (DAG):
+
+ - The DAG or dependency graph is defined implicitly by the connections between
+ pipeline [stages](/doc/command-reference/run), based on their
+ dependencies and outputs.
+
+ - Each stage defines one node in the DAG. All DVC-files in a repository make
+ up a [pipeline](/doc/command-reference/dag) (think a single Makefile). All
+ stages (and corresponding processes) are implicitly combined through their
+ inputs and outputs, simplifying conflict resolution during merges.
+
+ - DVC stages can be written manually in an intuitive `dvc.yaml` file, or
+ generated by the helper command `dvc run`, based on a terminal command, its
+ inputs, and outputs.
+
+> \* **Inodes** are file metadata records that locate the actual file contents
+> and store their permissions. See **Linking files** in
+> [this doc](http://www.tldp.org/LDP/intro-linux/html/sect_03_03.html) for
+> technical details (Linux).
diff --git a/content/docs/user-guide/what-is-dvc.md b/content/docs/user-guide/what-is-dvc.md
new file mode 100644
index 0000000000..bddf12cc5d
--- /dev/null
+++ b/content/docs/user-guide/what-is-dvc.md
@@ -0,0 +1,48 @@
+# What Is DVC?
+
+**Data Version Control** is a new type of data versioning, workflow, and
+experiment management software that builds upon [Git](https://git-scm.com/)
+(although it can work stand-alone). DVC reduces the gap between established
+engineering tool sets and data science needs, allowing users to take advantage
+of new [features](#core-features) while reusing existing skills and intuition.
+
+![](/img/reproducibility.png) _DVC codifies data and ML experiments_
+
+Data science experiment sharing and collaboration can be done through a regular
+Git flow (commits, branching, pull requests, etc.), the same way it works for
+software engineers.
+
+## Core Features
+
+- DVC is a [free](https://github.com/iterative/dvc/blob/master/LICENSE),
+ open-source [command line](/doc/command-reference) tool.
+
+- DVC works **on top of Git repositories** and has a command line interface
+ and flow similar to Git's. DVC can also work stand-alone, but without
+ versioning capabilities.
+
+- **Data versioning** is enabled by replacing large files, dataset directories,
+ ML models, etc. with small
+ [metafiles](/doc/user-guide/dvc-files-and-directories) (easy to handle with
+ Git). These placeholders point to the original data, which is decoupled from
+ source code management.
+
+- **Data storage**: On-premises or cloud storage can be used to store the
+ project's data separate from its code base. This is how data scientists can
+ transfer large datasets or share a GPU-trained model with others.
+
+- DVC makes data science projects **reproducible** by creating lightweight
+ [pipelines](/doc/command-reference/dag) using implicit dependency graphs, and
+ codifying the data and artifacts involved.
+
+- DVC is **platform agnostic**: It runs on all major operating systems (Linux,
+ macOS, and Windows), and works independently of the programming languages
+ (Python, R, Julia, shell scripts, etc.) or ML libraries (Keras, TensorFlow,
+ PyTorch, SciPy, etc.) used in the project.
+
+- **Easy to use**: DVC is quick to [install](/doc/install) and doesn't require
+ special infrastructure, nor does it depend on APIs or external services. It's
+ a stand-alone CLI tool.
+
+ > Git servers, as well as SSH and cloud storage providers are supported,
+ > however.
diff --git a/content/docs/user-guide/what-is-dvc/collaboration-issues.md b/content/docs/user-guide/what-is-dvc/collaboration-issues.md
deleted file mode 100644
index bac66a65dc..0000000000
--- a/content/docs/user-guide/what-is-dvc/collaboration-issues.md
+++ /dev/null
@@ -1,53 +0,0 @@
-# Collaboration Issues in Data Science
-
-Even with all the success we've seen today in machine learning (ML),
-specifically deep learning and its applications in business, the data science
-community still lacks good practices for organizing their projects and
-effectively collaborating across their varied ML projects. This is a critical
-challenge: we need to evolve towards ML algorithms and methods no longer being
-tribal knowledge and making them easy to implement, reuse, and manage.
-
-To make progress, many areas of the ML experimentation process need to be
-formalized. Common questions need to be answered in an unified, principled way.
-
-## Questions
-
-### Source code and data versioning
-
-- How do you avoid discrepancies between
- [revisions](https://git-scm.com/docs/revisions) of source code and versions of
- data files, when the data cannot fit into a traditional repository?
-
-### Experiment time log
-
-- How do you track which of your
- [hyperparameter]()
- changes contributed the most to producing or improving your target
- [metric](/doc/command-reference/metrics)? How do you monitor the degree of
- each change?
-
-### Navigating through experiments
-
-- How do you recover a model from last week without wasting time waiting for the
- model to retrain?
-
-- How do you quickly switch between a large dataset and a small subset without
- modifying source code?
-
-### Reproducibility
-
-- How do you run a model's evaluation process again without retraining the model
- and preprocessing a raw dataset?
-
-### Managing and sharing large data files
-
-- How do you share models trained in a GPU environment with colleagues who don't
- have access to a GPU?
-
-- How do you share the entire 147 GB of your ML project, with all of its data
- sources, intermediate data files, and models?
-
-Some of these questions are easy to answer individually. Data scientists,
-engineers, or managers may already knows or can easily find answers to some of
-them. However, the variety of answers and approaches makes data science
-collaboration a nightmare. **A systematic approach is required.**
diff --git a/content/docs/user-guide/what-is-dvc/core-features.md b/content/docs/user-guide/what-is-dvc/core-features.md
deleted file mode 100644
index 5960a1feed..0000000000
--- a/content/docs/user-guide/what-is-dvc/core-features.md
+++ /dev/null
@@ -1,20 +0,0 @@
-# Core Features
-
-- DVC works **on top of Git repositories** and has a similar command line
- interface and Git workflow.
-
-- It makes data science projects **reproducible** by creating lightweight
- [pipelines](/doc/command-reference/dag) using implicit dependency graphs.
-
-- **Large data file versioning** works by creating special files in your Git
- repository that point to the cache, typically stored on a local
- hard drive.
-
-- DVC is **Programming language agnostic**: Python, R, Julia, shell scripts,
- etc. as well as ML library agnostic: Keras, Tensorflow, PyTorch, Scipy, etc.
-
-- It's **Open-source** and **Self-serve**: DVC is free and doesn't require any
- additional services.
-
-- DVC supports cloud storage (Amazon S3, Microsoft Azure Blob Storage, Google
- Cloud Storage, etc.) for **data sources and pre-trained model sharing**.
diff --git a/content/docs/user-guide/what-is-dvc/index.md b/content/docs/user-guide/what-is-dvc/index.md
deleted file mode 100644
index 752dbfbe77..0000000000
--- a/content/docs/user-guide/what-is-dvc/index.md
+++ /dev/null
@@ -1,68 +0,0 @@
-# What Is DVC?
-
-Today the data science community is still lacking good practices for organizing
-their projects and effectively collaborating. ML algorithms and methods are no
-longer simple tribal knowledge but are still difficult to implement, manage and
-reuse.
-
-> One of the biggest challenges in reusing, and hence the managing of ML
-> projects, is its reproducibility.
-
-Data Version Control, or DVC, is a new type of experiment management software
-built on top of Git. DVC reduces the gap between existing tools and data science
-needs, allowing users to take advantage of experiment management while reusing
-existing skills and intuition.
-
-![](/img/reproducibility.png)_DVC codifies data and ML experiments_
-
-Leveraging an underlying source code management system eliminates the need to
-use 3rd-party services. Data science experiment sharing and collaboration can be
-done through regular Git features (commit messages, merges, pull requests, etc)
-the same way it works for software engineers.
-
-DVC uses a few core concepts:
-
-- **Experiment**: Equivalent to a
- [Git revision](https://git-scm.com/docs/revisions). Each experiment (extract
- new features, change model hyperparameters, data cleaning, add a new data
- source) can be performed in a separate branch or tag. DVC allows experiments
- to be integrated into a Git repository history and never needs to recompute
- the results after a successful merge.
-
-- **Experiment state** or state: Equivalent to a Git snapshot (all committed
- files). A Git commit hash, branch or tag name, etc. can be used as a
- [reference](https://git-scm.com/book/en/v2/Git-Internals-Git-References) to an
- experiment state.
-
-- **Reproducibility**: Action to reproduce an experiment state. This action
- generates output files (or directories) based on a set of input files and
- source code. This action usually changes experiment state.
-
-- **Pipeline**: Dependency graph or series of commands to reproduce data
- processing results. The commands are connected by their inputs
- (dependencies) and outputs. Pipelines are defined by
- special [stage files](/doc/command-reference/run) (similar to
- [Makefiles](https://www.gnu.org/software/make/manual/make.html#Introduction)).
- Refer to [pipeline](/doc/command-reference/dag) for more information.
-
-- **Workflow**: Set of experiments and relationships among them. Workflow
- corresponds to the entire Git repository.
-
-- **Data files**: Cached files (for large files). Data files are stored outside
- of the Git repository on a local/shared hard drive or remote storage, but
- [DVC-files](/doc/user-guide/dvc-files-and-directories) describing that data
- are stored in Git for DVC needs (to maintain pipelines and reproducibility).
-
-- **Cache directory**: Directory with all data files on a local hard drive or in
- cloud storage, but not in the Git repository. See `dvc cache dir`.
-
-- **Cloud storage** support: available complement to the core DVC features. This
- is how a data scientist transfers large data files or shares a GPU-trained
- model with those without GPUs available.
-
-DVC streamlines large data files and binary models into a single Git environment
-and this approach will not require storing binary files in your Git repository.
-The diagram below describes all the DVC commands and relationships between a
-local cache and remote storage:
-
-![](/img/flow-large.png)_DVC data management_
diff --git a/content/docs/user-guide/what-is-dvc/related-technologies.md b/content/docs/user-guide/what-is-dvc/related-technologies.md
deleted file mode 100644
index 86049754fa..0000000000
--- a/content/docs/user-guide/what-is-dvc/related-technologies.md
+++ /dev/null
@@ -1,141 +0,0 @@
-# Comparison to Existing Technologies
-
-DVC takes a novel approach, and it may be easier to understand DVC in comparison
-to existing technologies and tools.
-
-DVC combines a number of existing ideas into a single product, with the goal of
-bringing best practices from software engineering into the data science field.
-
-## Differences with related tools
-
-### Git
-
-- DVC extends Git by introducing the concept of _data files_ – large files that
- should NOT be stored in a Git repository but still need to be tracked and
- versioned.
-
-### Workflow management tools
-
-Pipelines and dependency graphs
-([DAG](https://en.wikipedia.org/wiki/Directed_acyclic_graph)) such as Airflow,
-Luigi, etc.
-
-- DVC is focused on data science and modeling. As a result, DVC pipelines are
- lightweight and easy to create and modify. However, DVC lacks pipeline
- execution features like execution monitoring, execution error handling, and
- recovering.
-
-- DVC is purely a command line tool without a graphical user interface (GUI) and
- doesn't run any daemons or servers. Nevertheless, DVC can generate images with
- pipeline and experiment workflow visualizations.
-
-### Experiment management software
-
-- DVC uses Git as the underlying platform for experiment tracking instead of a
- web application.
-
-- DVC doesn't need to run any services. There's no graphical user interface as a
- result, but we expect some GUI services will be created on top of DVC.
-
-- DVC has transparent design. Its
- [files and directories](/doc/user-guide/dvc-files-and-directories) (including
- the cache directory) have a human-readable format and can be
- easily reused by external tools.
-
-### Git workflows/methodologies such as Gitflow
-
-- DVC supports a new experimentation methodology that integrates easily with a
- Git workflow. A separate branch can be created for each experiment, with a
- subsequent merge of the branch if the experiment was successful.
-
-- DVC innovates by giving experimenters the ability to easily navigate through
- past experiments without recomputing them each time.
-
-### Build automation tools
-
-[Make](https://www.gnu.org/software/make/) and others.
-
-- DVC utilizes a
- [directed acyclic graph](https://en.wikipedia.org/wiki/Directed_acyclic_graph)
- (DAG):
-
- - The DAG or dependency graph is defined implicitly by the connections between
- [DVC-files](/doc/user-guide/dvc-files-and-directories) (with file names
- `.dvc`), based on their dependencies and outputs.
-
- - Each DVC-file defines one node in the DAG. All DVC-files in a repository
- make up a single pipeline (think a single Makefile). All DVC-files (and
- corresponding pipeline commands) are implicitly combined through their
- inputs and outputs, simplifying conflict resolution during merges.
-
- - DVC provides a simple command – `dvc run` – to generate a DVC-file or "stage
- file" automatically, based on the provided command, dependencies, and
- outputs.
-
-- File tracking:
-
- - DVC tracks files based on their hashes (MD5) instead of file timestamps.
- This helps avoid running into heavy processes like model retraining when you
- checkout a previously trained version of a model (Make would retrain the
- model).
-
- - DVC uses file timestamps and inodes for optimization. This allows DVC to
- avoid recomputing all dependency file hashes, which would be highly
- problematic when working with large files (10 GB+).
-
-### Git-annex
-
-- DVC uses the idea of storing the content of large files (that you don't want
- to see in your Git repository) in a local key-value store and uses file
- symlinks instead of the actual files.
-
-- DVC can use reflinks\* or hardlinks (depending on the system) instead of
- symlinks to improve performance and the user experience.
-
-- DVC optimizes file hash calculation.
-
-- Git-annex is a datafile-centric system whereas DVC is focused on providing a
- workflow for machine learning and reproducible experiments. When a DVC or
- Git-annex repository is cloned via `git clone`, data files won't be copied to
- the local machine, as file contents are stored in separate
- [remotes](/doc/command-reference/remote). With DVC,
- [DVC-files](/doc/user-guide/dvc-files-and-directories), which provide the
- reproducible workflow, are always included in the Git repository. Hence, they
- can be executed locally with minimal effort.
-
-- DVC is not fundamentally bound to Git, and users have the option of using DVC
- without Git.
-
-### Git-LFS (Large File Storage)
-
-- DVC does not require special Git servers like Git-LFS demands. Any cloud
- storage like S3, GCS, or an on-premises SSH server can be used as a backend
- for datasets and models. No additional databases, servers, or infrastructure
- are required.
-
-- DVC is not fundamentally bound to Git, and users have the option of using DVC
- without Git.
-
-- DVC does not add any hooks to the Git repo by default. To checkout data files,
- the `dvc checkout` command has to be run after each `git checkout` and
- `git clone` command. It provides control for managing data and code
- separately. Hooks could be configured to make workflows simpler.
-
-- DVC attempts to use reflinks\* and has other
- [file linking options](/doc/user-guide/large-dataset-optimization#file-link-types-for-the-dvc-cache).
- This way the `dvc checkout` command does not actually copy data files from
- cache to the workspace, as copying files is a heavy
- operation for large files (30 GB+).
-
-- `git-lfs` was not made with data science scenarios in mind, so it does not
- provide related features (e.g. pipelines,
- [metrics](/doc/command-reference/metrics)), and thus Github has a limit of 2
- GB per repository.
-
----
-
-> \* **copy-on-write links or "reflinks"** are a relatively new way to link
-> files in UNIX-style file systems. Unlike hardlinks or symlinks, they support
-> transparent [copy on write](https://en.wikipedia.org/wiki/Copy-on-write). This
-> means that editing a reflinked file is always safe as all the other links to
-> the file will reflect the changes.
diff --git a/static/img/flow-large.png b/static/img/flow-large.png
deleted file mode 100644
index 177108e6ec..0000000000
Binary files a/static/img/flow-large.png and /dev/null differ