Computer vision has become ubiquitous in our society, with applications in search, image understanding, apps, mapping, medicine, drones, and self-driving cars. Core to many of these applications are visual recognition tasks such as image classification and object detection. Recent developments in neural network approaches have greatly advanced the performance of state-of-the-art visual recognition systems. This course is a deep dive into the details of neural-network-based deep learning methods for computer vision. We will cover learning algorithms, neural network architectures, and practical engineering tricks for training and fine-tuning networks for visual recognition tasks.
The first half of the course will cover the fundamental components that drive modern deep learning systems for computer vision:
- Linear classifiers
- Stochastic gradient descent
- Fully-connected networks
- Convolutional networks
- Recurrent networks
In the second half of the course we will discuss applications of deep learning to different problems in computer vision, as well as emerging topics. During this second half the tone of the course will shift slightly towards a seminar: we will omit some details of the systems we discuss, instead focusing on the core concepts behind those applications. We will touch on topics such as:
- Attention and transformers
- Object detection
- Image segmentation
- Video classification
- Generative models (GANs, VAEs, autoregressive models)
This is the BEST course I've ever taken, and I really appreciate the effort of all the instructors. It's a great starting point for anyone who wants to step into CV-related industry or academia. For public course materials, you can visit the EECS 498.008 / 598.008 Deep Learning for Computer Vision website.