In 2012, AlexNet was a major breakthrough for the field of deep learning. It greatly outperformed previous methods on the LSVRC-2010 ImageNet dataset, which began a deep learning revolution. Since then, many modifications that improve neural networks without significantly changing the size of the network have been proposed. In this paper, we explore the following modifications to the original AlexNet architecture: learning rate, optimizers, weight regularization, activation functions, and dropout. Each modification was isolated from the others to observe its individual effect, and the best settings were then consolidated into one final model. Our experiments show that a learning rate of 1e-4 is optimal, Adam is the best optimizer, Elastic Net with L1 = 1e-4 * 0.05 and L2 = 1e-4 * 0.95 is the best weight regularization, LeakyReLU is the best activation function, and p = 0.25 is the best dropout probability. Finally, we used the best techniques and hyperparameters to train a consolidated model.
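To make the Elastic Net setting concrete, here is a minimal sketch of how such a penalty could be computed. It assumes the coefficients are read as 1e-4 * 0.05 for the L1 term and 1e-4 * 0.95 for the L2 term, and that the L2 term uses squared weights; both are assumptions, and the function name `elastic_net_penalty` is hypothetical rather than taken from this repository. Plain Python lists stand in for the model weights.

```python
def elastic_net_penalty(weights, l1_coeff=1e-4 * 0.05, l2_coeff=1e-4 * 0.95):
    """Return l1_coeff * sum(|w|) + l2_coeff * sum(w^2) for a flat iterable of weights.

    The default coefficients are an assumed reading of the values in the abstract.
    """
    weights = list(weights)
    l1_term = sum(abs(w) for w in weights)   # L1 part: encourages sparsity
    l2_term = sum(w * w for w in weights)    # L2 part: shrinks large weights
    return l1_coeff * l1_term + l2_coeff * l2_term


# Illustrative weights only; a real model would contribute all of its parameters.
example_weights = [0.5, -1.0, 2.0]
penalty = elastic_net_penalty(example_weights)
print(penalty)
```

In a training loop, a penalty of this form would typically be added to the task loss before backpropagation.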
This section will cover the requirements to set up the training environment.
- Install PyTorch Version 1.11.0 with CUDA 11.3
pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu113
- Install the required packages specified in requirements.txt
pip install -r requirements.txt
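After installing, it can be useful to confirm which PyTorch build is active before starting a long training run. The following is a small sketch, not part of this repository; the helper name `describe_torch_install` is hypothetical.

```python
def describe_torch_install():
    """Summarize the installed PyTorch build, or report that it is missing."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed in this environment"
    return (
        f"torch {torch.__version__}, "
        f"built for CUDA {torch.version.cuda}, "
        f"GPU available: {torch.cuda.is_available()}"
    )


print(describe_torch_install())
```

With the setup above, the report would be expected to mention version 1.11.0 and CUDA 11.3; if the GPU is reported as unavailable, the experiments will likely fall back to the CPU or fail, depending on how the scripts select a device.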
To reproduce the weight regularization experiment results, run the following script:
python main_weight_regularization.py
The model .pt files will be saved in the ./models directory, and the plots will be saved in the ./images directory.
To reproduce the activation function experiment results, run the following script:
python main_activation.py
The model .pt files will be saved in the ./models directory, and the plots will be saved in the ./images directory.
To reproduce the learning rate experiment results, run the following script:
python main_learning_rate.py
The model .pt files will be saved in the ./models directory, and the plots will be saved in the ./images directory.
To reproduce the optimizer experiment results, run the following Jupyter notebook:
main_optimizer.ipynb
By default, the model .pt files will be saved in the MIE424Project/models directory on Google Drive, and the output training loss, training accuracy, and validation accuracy numbers will be saved in the MIE424Project/Results directory on Google Drive.
To run the experiments locally, edit the model_save_path and result_save_path variables to a local directory.
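As a rough sketch of that local setup, the two variables could be pointed at folders next to the notebook. The helper `make_local_output_dirs` and the folder names `models` and `Results` are assumptions chosen to mirror the Google Drive layout; the notebook may expect a different path format (for example a trailing slash), so adjust as needed.

```python
from pathlib import Path


def make_local_output_dirs(base_dir="."):
    """Create local 'models' and 'Results' folders under base_dir and return their paths."""
    base = Path(base_dir)
    model_dir = base / "models"      # assumed local stand-in for MIE424Project/models
    result_dir = base / "Results"    # assumed local stand-in for MIE424Project/Results
    model_dir.mkdir(parents=True, exist_ok=True)
    result_dir.mkdir(parents=True, exist_ok=True)
    return str(model_dir), str(result_dir)


# Assign the variables named in the notebook to the local folders.
model_save_path, result_save_path = make_local_output_dirs(".")
print(model_save_path, result_save_path)
```

Creating the folders up front avoids a failed save at the end of training if the notebook does not create them itself.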