I pruned 88% of the channels of EfficientNetV2 using LAMP pruning, resulting in a model that is 14x smaller, with 14x fewer parameters, 2.5x fewer MAC operations, and 2.5x faster inference. The accuracy loss was only about 2%.
After that, I replaced all the SiLU and Sigmoid layers with Hardswish and Hardsigmoid, since the model is to be deployed on an FPGA.
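The activation swap can be sketched roughly as follows in PyTorch. This is a minimal illustration, not the project's actual code: the helper name `replace_activations` and the toy network are hypothetical, and it assumes the activations are registered as `nn.SiLU` / `nn.Sigmoid` modules (functional calls such as `F.silu` inside a `forward` method would not be caught).

```python
import torch
import torch.nn as nn


def replace_activations(module: nn.Module) -> nn.Module:
    """Recursively swap SiLU -> Hardswish and Sigmoid -> Hardsigmoid in place.

    Hypothetical helper for illustration only.
    """
    # Copy the child list so reassigning attributes cannot disturb iteration.
    for name, child in list(module.named_children()):
        if isinstance(child, nn.SiLU):
            setattr(module, name, nn.Hardswish())
        elif isinstance(child, nn.Sigmoid):
            setattr(module, name, nn.Hardsigmoid())
        else:
            # Descend into containers and composite blocks.
            replace_activations(child)
    return module


# Toy stand-in for the real network (assumption: not EfficientNetV2 itself).
toy = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.SiLU(),
    nn.Sequential(nn.Conv2d(8, 8, kernel_size=1), nn.Sigmoid()),
)
replace_activations(toy)
out = toy(torch.randn(1, 3, 16, 16))
```

The hard variants are piecewise-linear approximations, so they avoid the exponential that SiLU and Sigmoid require, which is usually the motivation on FPGA targets.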
Kaggle Notebook: Link
Fig: Efficiency improvements after channel pruning
Using channel-based LAMP, we removed 88% of the channels in the EfficientNetV2 model. The pruned model is 14 times smaller, has 14 times fewer parameters, needs 2.5 times fewer MAC operations, and runs inference 2.5 times faster, at the cost of only about a 2% decrease in accuracy. These results show that channel pruning is effective here, and they point to leaner, faster deep learning models that can be deployed on edge devices without a significant sacrifice in performance.
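For readers unfamiliar with LAMP, the scoring idea can be sketched as below. This is an assumption-laden illustration rather than the notebook's implementation: the function name `lamp_channel_scores` is hypothetical, and the channel-level adaptation (using each output channel's squared L2 norm in place of a single squared weight) is one plausible way to apply LAMP to channels. Within a layer, each channel's squared norm is divided by the sum of the squared norms of all channels at least as large, so the strongest channel in every layer scores exactly 1 and scores become comparable across layers.

```python
import torch


def lamp_channel_scores(weight: torch.Tensor) -> torch.Tensor:
    """LAMP-style score for each output channel of a conv or linear weight.

    Hypothetical helper: channel-level adaptation of the LAMP score.
    """
    # Squared L2 norm of every output channel (dimension 0).
    sq_norms = weight.flatten(1).pow(2).sum(dim=1)

    # Sort ascending so index i is followed by all larger channels.
    order = torch.argsort(sq_norms)
    sorted_sq = sq_norms[order]

    # Suffix sums: entry i holds sum(sorted_sq[i:]).
    suffix = torch.flip(
        torch.cumsum(torch.flip(sorted_sq, dims=[0]), dim=0), dims=[0]
    )

    # Scatter the normalized scores back to the original channel order.
    scores = torch.empty_like(sq_norms)
    scores[order] = sorted_sq / suffix
    return scores


# Three channels with squared norms 1, 4 and 9.
demo_weight = torch.tensor([[1.0], [2.0], [3.0]])
demo_scores = lamp_channel_scores(demo_weight)  # 1/14, 4/13, 1

# Works the same way on a 4-D convolution weight.
conv_scores = lamp_channel_scores(torch.randn(8, 4, 3, 3))
```

Pruning would then drop the channels with the lowest scores globally across layers until the target ratio (88% here) is reached.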
Pruned Model: model.pth
This was done as part of a major project that involved pruning and quantizing EfficientNetV2 and then implementing it on an FPGA.
Quantization of Pruned Model: Github Repo