Determined's adaptive search method implements ASHA, a state-of-the-art method for HP search suitable for large-scale machine learning. In this example, we use Determined's adaptive HP search to search for CNN architectures from a common search space used for neural architecture search (NAS). In particular, we replicate the NAS CNN search benchmark from the ASHA paper (Figure 4), which is a strong baseline for NAS.
We also demonstrate how HP search constraints can be used to boost the performance of adaptive hyperparameter search to nearly match state-of-the-art for this search space.
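As an illustration of the idea behind adaptive search, the sketch below implements plain synchronous successive halving, the building block that ASHA runs asynchronously. It is not Determined's implementation. The function names, the `eta` and `min_resource` parameters, and the toy objective are all assumptions made for the example.

```python
import math
import random


def successive_halving(configs, evaluate, min_resource=1, eta=4, num_rungs=3):
    """Keep the best 1/eta of the configs at each rung, then grow the budget by eta."""
    survivors = list(configs)
    resource = min_resource
    for _ in range(num_rungs):
        # Score every surviving config at the current budget (lower is better).
        scored = sorted(survivors, key=lambda cfg: evaluate(cfg, resource))
        # Promote the top 1/eta of the configs, always keeping at least one.
        survivors = scored[: max(1, len(scored) // eta)]
        resource *= eta
    return survivors[0]


def toy_validation_error(cfg, resource):
    # Hypothetical stand-in for training a model: the error shrinks as the
    # budget grows and is lowest when the learning rate is near 0.1.
    return abs(math.log10(cfg["lr"]) + 1.0) + 1.0 / resource


rng = random.Random(0)
candidates = [{"lr": 10 ** rng.uniform(-4, 0)} for _ in range(16)]
best = successive_halving(candidates, toy_validation_error)
print(best)
```

With 16 candidates and `eta=4`, the rungs keep 4, then 1 configuration, so most of the budget goes to the few configurations that looked promising early. ASHA differs in that it promotes configurations as soon as enough results are available rather than waiting for a whole rung to finish.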
- model_def.py: The core code for the model. This includes building and compiling the model.
- model.py: The model specification.
- operations.py: The components used to build the model.
- utils.py: Functions from the original repository.
- genotypes.py: Primitive set of model operations.
- const.yaml: Train the model with constant hyperparameter values. The provided architecture was found with constrained HP search and should reach > 97.4% accuracy.
- adaptive.yaml: Perform a hyperparameter search using Determined's state-of-the-art adaptive hyperparameter tuning algorithm.
- constrained_adaptive.yaml: Adaptive HP search over a modified search space using HP constraints.
- constrained_random.yaml: Random HP search over a modified search space using HP constraints.
This example searches for architectures over the CIFAR-10 dataset but is easily adaptable to other datasets.
If you have not yet installed Determined, installation instructions can be found under docs/install-admin.html or at https://docs.determined.ai/latest/index.html.
Run the following command:
det -m <master host:port> experiment create -f adaptive.yaml .
The other configurations can be run by specifying the appropriate configuration file in place of adaptive.yaml.
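To launch several of the configurations in turn, the command above can be assembled programmatically. The following is a minimal sketch that only builds and prints the commands. The helper name `build_det_command` and the master address `localhost:8080` are hypothetical placeholders, and the sketch assumes the `det` CLI is on your PATH when you actually run the commands.

```python
import shlex

# Configuration files listed in this example's directory.
CONFIG_FILES = [
    "const.yaml",
    "adaptive.yaml",
    "constrained_adaptive.yaml",
    "constrained_random.yaml",
]


def build_det_command(master, config_file, context_dir="."):
    """Return the argument list for creating an experiment from one config file."""
    return ["det", "-m", master, "experiment", "create", "-f", config_file, context_dir]


for config_file in CONFIG_FILES:
    # "localhost:8080" is a placeholder; substitute your own <master host:port>.
    print(shlex.join(build_det_command("localhost:8080", config_file)))
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` to submit the experiment instead of printing it.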
When running HP search via adaptive.yaml with 16 V100 GPUs, the best architecture after 1 day should achieve around 97.0% accuracy on CIFAR-10.
When running HP search via constrained_adaptive.yaml with 16 V100 GPUs, the best architecture after evaluating 1k trials should achieve around 97.3% accuracy on CIFAR-10 as shown in the image below.
Note: For a fair comparison to the NAS results for this search space, you will have to train the best architecture for a total of 600 epochs instead of the 300 epochs used for the HP search experiment.