Experiments to speed up PyTorch training on GPU devices
- Automatic Mixed Precision (FP16)
- Activate Tensor Cores
- Distribution of parameters on TensorBoard
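
For the Tensor Core experiment, NVIDIA's mixed-precision guidance generally recommends keeping FP16 matrix dimensions (layer widths, batch size, vocabulary size) at multiples of 8 so that matrix multiplies and convolutions map efficiently onto Tensor Cores. A minimal sketch of that padding step follows. The helper name `round_up_to_multiple` and the example sizes are hypothetical and are not part of PyTorch or any NVIDIA library.

```python
def round_up_to_multiple(size: int, multiple: int = 8) -> int:
    """Return the smallest multiple of `multiple` that is >= `size`."""
    if size < 0 or multiple <= 0:
        raise ValueError("size must be non-negative and multiple must be positive")
    return ((size + multiple - 1) // multiple) * multiple


# Hypothetical layer sizes, chosen only for illustration.
vocab_size = 33278
hidden_size = 1000

padded_vocab = round_up_to_multiple(vocab_size)    # padded up to the next multiple of 8
padded_hidden = round_up_to_multiple(hidden_size)  # already aligned, so unchanged
print(padded_vocab, padded_hidden)
```

The padded sizes would then be used when constructing layers such as `nn.Linear` or `nn.Embedding`. Whether padding actually helps depends on the GPU and library versions, so it is worth confirming with a profiler.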

Reference links
- Tensor Cores: https://www.nvidia.com/en-us/data-center/tensor-cores/
- Turing Architecture: https://www.nvidia.com/en-us/design-visualization/technologies/turing-architecture/
- Tips for Optimizing GPU Performance with Tensor Cores (blog): https://developer.nvidia.com/blog/optimizing-gpu-performance-tensor-cores/
- Memory-Limited Layers: https://docs.nvidia.com/deeplearning/performance/dl-performance-memory-limited/index.html
- Automatic Mixed Precision for Deep Learning: https://developer.nvidia.com/automatic-mixed-precision
- Mixed Precision Training: https://docs.nvidia.com/deeplearning/performance/mixed-precision-training/index.html
- Introducing Native PyTorch AMP for Faster Training on NVIDIA GPUs: https://pytorch.org/blog/accelerating-training-on-nvidia-gpus-with-pytorch-automatic-mixed-precision/
- Training Neural Networks with Tensor Cores (slides, PDF): https://nvlabs.github.io/eccv2020-mixed-precision-tutorial/files/dusan_stosic-training-neural-networks-with-tensor-cores.pdf
- NVIDIA Developer Blog: https://developer.nvidia.com/deep-learning
- NVIDIA Deep Learning Examples: https://developer.nvidia.com/deep-learning-examples
- Using Nsight Compute or nvprof to Show Mixed Precision Use in Deep Learning: https://developer.nvidia.com/blog/using-nsight-compute-nvprof-mixed-precision-deep-learning-models/
- CUDA Pro Tip: nvprof Is Your Handy Universal GPU Profiler: https://developer.nvidia.com/blog/cuda-pro-tip-nvprof-your-handy-universal-gpu-profiler/
- PyTorch Tutorial: Automatic Mixed Precision: https://pytorch.org/tutorials/recipes/recipes/amp_recipe.html
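
The Automatic Mixed Precision experiment can be sketched roughly as follows, in the spirit of the PyTorch AMP recipe linked above. This is an illustrative sketch rather than the exact setup used in these experiments. The toy model, random data, and hyperparameters are assumptions. On a machine without CUDA the scaler and autocast are disabled, so the loop runs in plain FP32.

```python
import torch
from torch import nn

# AMP with FP16 targets CUDA GPUs; without one, fall back to plain FP32.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
use_amp = device.type == "cuda"

# Hypothetical toy model and data; all sizes are multiples of 8.
model = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 8)).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

# GradScaler scales the loss so small FP16 gradients do not underflow.
# With enabled=False every scaler call below is a pass-through.
scaler = torch.cuda.amp.GradScaler(enabled=use_amp)

inputs = torch.randn(32, 64, device=device)
targets = torch.randint(0, 8, (32,), device=device)

losses = []
for step in range(5):
    optimizer.zero_grad(set_to_none=True)
    # Inside autocast, eligible ops run in reduced precision on CUDA.
    with torch.autocast(device_type=device.type, enabled=use_amp):
        outputs = model(inputs)
        loss = loss_fn(outputs, targets)
    scaler.scale(loss).backward()  # backward pass on the scaled loss
    scaler.step(optimizer)         # unscales gradients, then optimizer.step()
    scaler.update()                # adjusts the scale factor for the next step
    losses.append(loss.item())
```

Recent PyTorch releases also expose the scaler as `torch.amp.GradScaler` and may emit a deprecation warning for the `torch.cuda.amp` spelling. To check whether Tensor Cores are actually used, the Nsight Compute and nvprof posts listed above describe how to inspect the launched kernels.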