- pdoc: Automatically create API documentation for your project
- pre-commit plugins: Automate code review and formatting
.
├── config
│   ├── main.yaml  # Main configuration file
│   ├── model  # Configurations for training model
│   │   ├── model1.yaml  # First variation of parameters to train model
│   │   └── model2.yaml  # Second variation of parameters to train model
│   └── process  # Configurations for processing data
│       ├── process1.yaml  # First variation of parameters to process data
│       └── process2.yaml  # Second variation of parameters to process data
├── data
│   ├── final  # data after training the model
│   ├── processed  # data after processing
│   └── raw  # raw data
├── docs  # documentation for your project
├── .gitignore  # files that should not be committed to Git
├── Makefile  # store useful commands to set up the environment
├── models  # store models
├── notebooks  # store notebooks
├── .pre-commit-config.yaml  # configurations for pre-commit
├── pyproject.toml  # dependencies for poetry
├── README.md  # describe your project
├── src  # store source code
│   ├── __init__.py  # make src a Python module
│   ├── process.py  # process data before training model
│   ├── train_model.py  # train model
│   └── utils.py  # store helper functions
└── tests  # store tests
    ├── __init__.py  # make tests a Python module
    ├── test_process.py  # test functions for process.py
    └── test_train_model.py  # test functions for train_model.py
- Install Poetry
- Activate the virtual environment:
  poetry shell
- Install dependencies:
  - To install all dependencies from pyproject.toml, run:
    poetry install
  - To install only production dependencies, run:
    poetry install --only main
  - To install a new package, run:
    poetry add <package-name>
To view the configurations associated with a Python script, run the following command:
python src/process.py --help
Output:
process is powered by Hydra.
== Configuration groups ==
Compose your configuration from those groups (group=option)
model: model1, model2
process: process1, process2
== Config ==
Override anything in the config (foo.bar=value)
process:
  use_columns:
  - col1
  - col2
model:
  name: model1
data:
  raw: data/raw/sample.csv
  processed: data/processed/processed.csv
  final: data/final/final.csv
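As a rough illustration of how the `process` section of this configuration might be consumed, the sketch below keeps only the columns listed in `process.use_columns`. The helper name `select_columns` and the sample CSV contents are hypothetical and are not taken from the project's `src/process.py`. The standard `csv` module and in-memory buffers are used so the example runs without extra dependencies or files.

```python
import csv
import io


def select_columns(rows, use_columns):
    """Keep only the keys listed in use_columns for each row (hypothetical helper)."""
    return [{col: row[col] for col in use_columns} for row in rows]


# Mirrors the `process` section of the configuration printed by --help.
config = {"process": {"use_columns": ["col1", "col2"]}}

# Stand-in for data/raw/sample.csv; the real file contents are not known here.
raw_csv = "col1,col2,col3\n1,2,3\n4,5,6\n"
rows = list(csv.DictReader(io.StringIO(raw_csv)))

processed = select_columns(rows, config["process"]["use_columns"])

# Write to an in-memory buffer instead of data/processed/processed.csv.
out = io.StringIO()
writer = csv.DictWriter(
    out, fieldnames=config["process"]["use_columns"], lineterminator="\n"
)
writer.writeheader()
writer.writerows(processed)
processed_csv = out.getvalue()
```

In a real script, the configuration would come from Hydra rather than a hard-coded dictionary, and the input and output would be the paths under `data`.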
To alter the configurations associated with a Python script from the command line, run the following:
python src/process.py data.raw=sample2.csv
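To make the `key.subkey=value` syntax concrete, here is a simplified stand-in for how a dotted override can be applied to a nested configuration. It is not Hydra's implementation (Hydra also handles config groups, typing, and more). `apply_override` is a hypothetical helper, and the dictionary mirrors the `data` section shown in the help output.

```python
import copy


def apply_override(config, override):
    """Return a copy of config with one `a.b=value` override applied (hypothetical helper)."""
    dotted_key, value = override.split("=", 1)
    *parents, leaf = dotted_key.split(".")
    updated = copy.deepcopy(config)  # leave the original config untouched
    node = updated
    for key in parents:
        node = node[key]  # raises KeyError if the path does not exist
    node[leaf] = value
    return updated


config = {
    "data": {
        "raw": "data/raw/sample.csv",
        "processed": "data/processed/processed.csv",
        "final": "data/final/final.csv",
    }
}

overridden = apply_override(config, "data.raw=sample2.csv")
print(overridden["data"]["raw"])  # sample2.csv
```

Only the overridden key changes; the other `data` entries keep their defaults.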
To auto-generate API documentation for your project, run:
make docs
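pdoc builds its pages from the docstrings in your modules, so the generated documentation is only as useful as those docstrings. Assuming the `docs` target in the Makefile runs pdoc on `src` (the Makefile contents are not shown here), a helper documented like the hypothetical one below would appear in the output with its docstring rendered.

```python
def normalize_column_name(name: str) -> str:
    """Convert a column name to lowercase snake_case.

    Args:
        name: Raw column name, for example ``" Total Sales "``.

    Returns:
        The name stripped of surrounding whitespace, lowercased, and with
        runs of inner whitespace replaced by single underscores.
    """
    # split() with no arguments drops leading/trailing whitespace and
    # collapses repeated spaces, so join() yields clean snake_case.
    return "_".join(name.strip().lower().split())
```

The function name and its placement (for instance in `src/utils.py`) are illustrative only.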