Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OVERVIEW.md #60

Open
wants to merge 12 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
38 changes: 35 additions & 3 deletions CONTRIBUTING.md
Original file line number Diff line number Diff line change
Expand Up @@ -11,17 +11,49 @@ There are 3 main ways to contribute starting with the most straightfoward option

1. **Filing Issues against Configs:** Noticing a bug that you believe is a problem with the generated config? Please file an issue for the offending config class stating clearly your versions of `hydra-core` and the library being configured (e.g. `torch`). If you believe it is actually a problem with hydra or torch, file the issue in their respective repositories.

> **NOTE:** Please include the manual tag for the project and library version in your issue title. If, for example, there is an issue with `AdamConf` for `torch1.6` which comes from `hydra-configs-torch` v1.6.1, your issue name might look like:
**[hydra-configs-torch][1.6.1] AdamConf does not instantiate**.
> **NOTE:** Please include a **manual tag** for the project and library version in your issue title. If, for example, there is an issue with `AdamConf` for `torch1.6` which comes from `hydra-configs-torch` v1.6.1, your issue name might look like the following:
**[hydra-configs-torch][1.6.1] AdamConf does not instantiate**.

2. **Example Usecase / Tutorial:** The `hydra-torch` repository not only hosts config packages like `hydra-configs-torch`,`hydra-configs-torchvision`, etc., it also aggregates examples of how to structure projects utilizing hydra and torch. The bar is high for examples that get included, but we will work together as a community to hone in on what the best practices are. Ideally, example usecases will come along with an incremental tutorial that introduces a user to the methodology being followed. If you have an interesting way to use hydra/torch, write up an example and show us in a draft PR!

3. **Maintaining Configs:** After the initial (considerable) setup effort, the goal of this repository is to be self-sustaining meaning code can be autogrenerated when APIs change. In order to contribute to a particular package like `hydra-configs-torch`, please see the [**Projects**](https://github.com/pytorch/hydra-torch/projects) tab to identify outstanding issues per project and configured library version. Before contributing at this level, please familiarize with [configen](https://github.com/facebookresearch/hydra/tree/master/tools/configen). We are actively developing this tool as well.
3. **Maintaining Configs:** After the initial (considerable) setup effort, the goal of this repository is to be self-sustaining meaning code can be autogrenerated when APIs change. In order to contribute to a particular package like `hydra-configs-torch`, please see the [**Projects**](https://github.com/pytorch/hydra-torch/projects) tab to identify outstanding issues per project and configured library version. Before contributing at this level, please familiarize with [configen](https://github.com/facebookresearch/hydra/tree/master/tools/configen). We are actively developing this tool alongside this repository.

## Linting / Formatting
Please download the formatting / linting requirements: `pip install -r requirements/dev.txt`.
Please install the pre-commit config for this environment: `pre-commit install`.


## Using Configen

We are actively developing the tool, [Configen](https://github.com/facebookresearch/hydra/tree/master/tools/configen) to automatically create config classes for Hydra. Much of the work for `hydra-torch` has helped prototype this workflow and it is still rapidly evolving.

Currently, the workflow looks like the following:

1. Ensure the most recent configen is installed from master.
- `pip install git+https://github.com/facebookresearch/hydra/#subdirectory=tools/configen`
2. Edit `configen/conf/<project-name>.yaml`, listing the module and its classes from the project library to be configured.
- e.g. in `/configen/conf/torchvision.yaml`:
```yaml
modules:
- name: torchvision.datasets.mnist # module with classes to gen for
# mnist datasets
classes: # list of classes to gen for
- MNIST
- FashionMNIST
- KMNIST
- EMNIST
- QMNIST

```
3. In the corresponding project directory, `hydra-configs-<library-name>`, run the command `configen --config-dir ../configen/conf/ --config-name <library-name>.yaml` .
4. If generation is successful, the configs should be located in:
- `/hydra-configs-<library-name>/hydra_configs/path/to/module`.
- for our above example, you should see `MNISTConf, ..., QMNISTConf` in the module, `mnist.py` located in:
`hydra-configs-torchvision/hydra_configs/torchvision/datasets`

>**Note:** This process is still under development. As we encounter blocking factors, we do the appropriate development on Configen to mitigate future issues.


## Pull Requests
We actively welcome your pull requests.

Expand Down
100 changes: 100 additions & 0 deletions OVERVIEW.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,100 @@
# Overview

**`hydra-torch` audience:** Hydra users utilizing PyTorch and related libraries (torchvision, ...).

**`hydra-torch` goals:**

1. Provide a maintained and tested implementation of config classes that can be used to instantiate and configure classes belonging to the supported projects.
2. Provide examples and tutorials demonstrating best practices for using Hydra to configure PyTorch.
3. Showcase a recommended approach for creating other similar configuration packages for other libraries.

#### Terminology for this overview:
- The overall effort and repo is named:`hydra-torch`
- **Library:** The library being configured.
- e.g. `torch`, `torchvision`
- **Project:** The project corresponding to the library being configured.
- e.g. `hydra-configs-torch`, `hydra-configs-torchvision`
- **Package:** The installable package containing the configs. Usually corresponding to a specific project.
- here, also `hydra-configs-torch`, `hydra-configs-torchvision`. Find the package definition in the project dir's `setup.py`.



### Repo Structure
```
📂 hydra-torch
└ 📁 configen # source files used when generating package content. all projects have their own subdirectory here.
└ 📁 hydra-configs-torch # a project corresponding to configuring a specific library (contains package definition)
└ 📁 hydra-configs-torchvision # "
└ 📁 hydra-configs-<future-library> # "
└ 📁 examples # use-cases and tutorials
```

Each `hydra-configs-<library-name>` defines its own package corresponding to a project it provides classes for. That is, `hydra-configs-torch` contains a package (with its own `setup.py`) which provides classes for `torch`.


### Packaging

This repo maintains multiple packages. An important area of consideration for these packages is to organize them such that they are compatible with each other and have an 'expected', intuitive structure. The easiest way to do this is to enforce a standard for all Hydra configs packages to follow.

Our approach makes use of [Native Namespace Packages](https://packaging.python.org/guides/packaging-namespace-packages/#native-namespace-packages). This allows all packages to share the top level namespace `hydra_configs`. The only real requirement for this package format is to ensure there is **no** `__init__.py` located in the namespace folder.

The format for folder structure per project is always:
```
📂 hydra-configs-<library-name> # the dir containing the package of configs for <library>
├ 📁 hydra_configs # the namespace we will always use
│ └ 📁 <library-name> # e.g. 'torchvision'
│ └ 📁 <library-subpackage-name> # e.g. 'transforms'
│ ⋮
│ └ <module>.py # e.g. 'transforms.py'
└ setup.py # configures this package for setuptools
```

The beauty of this approach is that users can be sure the importing idiom is reliably:
```python
from hydra_configs.<project_name>.path.to.module import <ClassName>Conf
```
For example:
```python
from hydra_configs.torch.optim import AdamConf
```

This approach ensures interoperability with repositories that define configs for hydra outside of the `hydra-torch` repo as long as they follow this pattern.

##### Metapackage

For convenience, we also provide a metapackage called `hydra-torch`. This is defined at the top level of the repo in `setup.py`. Effectively, all this means is that a user can either install each project-specific package as needed or they can obtain them all in one shot by installing only the metapackage.

### Testing

All configs must be tested to ensure they are compatible with the latest version of Hydra. At a minimum, each config must be able to successfully yield the desired class when passed to `hydra.utils.instantiate()`. If the class has special functionality, consider making evaluating this a goal of the test suite as well.

Tests are run through pytest. The CI flow is currently: `CircleCI` -> `nox` -> `pytest`. All tests must pass before anything can be merged.


### Versions

#### Regarding Libraries

Given the rapid development of libraries in the torch ecosystem, it is likely that there will be a need for multiple simultaneous version specific `hydra-configs-<library-name>` per library. This is especially relevant as API and type annotation is updated moving forward for a particular library.

For each library, there will be releases for each `MINOR` version starting with the latest version available at the time of creation of the configs project.
1. There will be a branch/tag for each version for each library's corresponding configs project. These branches/tags will only contain the content relevant to that particular project.
2. In `master`, each configs project will maintain the latest supported library version.

###### Package/Branch Names:
The version for the `hydra-configs-<library-name>` package will be `MAJOR.MINOR.X` where `MAJOR.MINOR` tracks the library versions and `.X` is reserved for revisions of the configs for that particular `MAJOR.MINOR` should they be required.

> e.g. `hydra-configs-torch==1.6.0` corresponds to `torch==1.6` and if updates are needed in response to patches for `hydra` or `<library>`, only the `PATCH` version will be incremented -> `hydra-config-torch==1.6.1`

#### Regarding Hydra

Many of the features that make using configs compelling are supported in Hydra > 1.1. For this reason, we are aiming support starting at 1.1 . Using an older version of Hydra is not advised, but may still work fine in less complex cases.


### Releases
Releases will be on a per-project basis for specific versions of a project. Releases should be tied to a specific branch/tag in the repo that passes all tests via CI. We release via PyPI.

#TODO: write out exact tag format once decided upon

### Contributing
We encourage contribution from all members of the community! This repo is especially ammenable to community contribution as it covers a wide breadth and can be developed in a distributed fashion. Please see [`CONTRIBUTING.md`](CONTRIBUTING.md) to get started!