Revert "guide: undo starting How To subsection"
This reverts commit e6d5f78. Extracted from #1581 (comment)
1 parent 5a3b355, commit 1fd3f9f
Showing 3 changed files with 49 additions and 1 deletion.
@@ -0,0 +1,43 @@
# Best Practices for DVC Projects

Data scientists, engineers, or managers may already know or can easily find
answers to some of these questions. However, the variety of answers and
approaches makes data science collaboration a nightmare. **A systematic approach
is required.**

## Questions on...

### Source code and data versioning

- How do you avoid discrepancies between
  [revisions](https://git-scm.com/docs/revisions) of source code and versions of
  data files, when the data cannot fit into a traditional repository?

### Experiment time log

- How do you track which of your
  [hyperparameter](<https://en.wikipedia.org/wiki/Hyperparameter_(machine_learning)>)
  changes contributed the most to producing or improving your target
  [metric](/doc/command-reference/metrics)? How do you monitor the degree of
  each change?

### Navigating through experiments

- How do you recover a model from last week without wasting time waiting for the
  model to retrain?

- How do you quickly switch between a large dataset and a small subset without
  modifying source code?

### Reproducibility

- How do you run a model's evaluation process again without retraining the model
  and preprocessing a raw dataset?

### Managing and sharing large data files

- How do you share models trained in a GPU environment with colleagues who don't
  have access to a GPU?

- How do you share the entire 147 GB of your ML project, with all of its data
  sources, intermediate data files, and models?
File renamed without changes.