This set of three lectures will introduce you to important foundational concepts in computer vision. These are "classical" topics, but ones we believe are important to understand even in the modern deep-learning era.
There are four sets of learning resources for each topic that I cover, described in a bit more detail below.
I teach general principles, but to put the ideas into practice we need to write code. There are myriad languages and libraries/packages/toolboxes to choose from. In the past I've done a lot in MATLAB, but now I'm working with Python, and Python is what we will use for the summer school.
The PDFs of my lecture slides are provided in advance. Feel free to load them into your tablet to annotate as we go along.
The material that I present is covered in more detail in my book Robotics, Vision & Control, 3rd edition 2023. There are two versions of this book:
- Robotics, Vision & Control: Fundamental algorithms in Python
- Robotics, Vision & Control: Fundamental algorithms in MATLAB
The books are very similar in chapter structure and content: the first is based on Python code and open-source packages, the second on MATLAB and proprietary toolboxes that you need to license from MathWorks (most universities will give you the required licences). It's just a matter of personal preference.
If you are studying at a university, it is highly likely that you can download the chapters of this book for free from the links above. For this course, just grab the indicated chapters (11, 12, 13, 15) from the latter part of the book.
Feel free to grab any other chapters that might take your fancy. Chapter 2 is a good (I think) introduction to representing position and orientation in 3D space, and Appendix B is a concise refresher on linear algebra and geometry.
There is a set of free online video resources (the QUT Robot Academy) that might be useful as a refresher.
- Homogeneous coordinates (5 mins)
- Position, orientation & pose in 3D space (multiple lessons, 60 mins total)
The code examples in these videos are done with MATLAB, but underneath each video is a code tab, and below that a tab that lets you select a "translation" of the video's code into different languages and toolboxes.
I will mention other, lecture-specific, Robot Academy videos below.
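As a quick taste of the homogeneous-coordinate idea covered in those refresher videos, here is a minimal sketch in plain NumPy (not the toolbox); the rotation angle and translation are made-up values for illustration:

```python
import numpy as np

# Homogeneous coordinates: a 2D point (x, y) becomes (x, y, 1), so that
# rotation AND translation can be applied with a single matrix product.
theta = np.pi / 2  # rotate 90 degrees (example value)
T = np.array([[np.cos(theta), -np.sin(theta), 2.0],
              [np.sin(theta),  np.cos(theta), 1.0],
              [0.0,            0.0,           1.0]])

p = np.array([1.0, 0.0, 1.0])  # the point (1, 0) in homogeneous form
q = T @ p                      # rotate, then translate by (2, 1)
x, y = q[:2] / q[2]            # dehomogenize back to Cartesian coordinates
```

The same pattern extends to 3D points with 4x4 matrices, which is the subject of the pose videos above.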
I provide a selection of Jupyter/Python notebooks that will help to embed knowledge from each lecture. You can run them on Google Colab, with zero install, by clicking the button below.
Alternatively, you can run them locally on your laptop, which requires that you first install the Machine Vision Toolbox for Python:

```shell
pip install machinevisiontoolbox
```

Python 3.9 or newer is recommended. This will install all the required dependencies (including OpenCV), as well as example images for the exercises.
I would highly recommend using Miniconda and creating an environment for your RVSS code:

```shell
conda create -n RVSS python=3.10
conda activate RVSS
pip install machinevisiontoolbox
```
This lecture introduces the fundamentals of image processing. Topics include pixels and images, image arithmetic, spatial operations such as convolution, and operations on images to find motion, simple blob objects, and image features.
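To make the convolution idea concrete before the lecture, here is a naive plain-NumPy sketch of 2D convolution; the toolbox and OpenCV provide fast implementations, and this simply spells out the definition (the test image and kernel are made-up values):

```python
import numpy as np

def convolve2d(image, kernel):
    """Naive 2D convolution, 'valid' region only: slide the flipped kernel
    over the image and sum the element-wise products at each position."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    flipped = kernel[::-1, ::-1]  # convolution flips the kernel (vs correlation)
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * flipped)
    return out

image = np.arange(25, dtype=float).reshape(5, 5)
box = np.ones((3, 3)) / 9          # 3x3 box (smoothing) kernel
smoothed = convolve2d(image, box)  # each output pixel is a local 3x3 mean
```

With a symmetric kernel like the box filter, convolution and correlation coincide; the distinction matters for asymmetric kernels such as derivative filters.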
- Robotics, Vision & Control: Chapters 11 and 12
- Robot Academy video masterclasses (each is a collection of short videos, ~1h total run time)
- Jupyter/Python Notebooks:
  - `image-processing.ipynb`, fundamentals of image processing as discussed in the lecture
  - `image-features.ipynb`, fundamentals of corner features as discussed in the lecture
  - `finding-blobs.ipynb`, extension to blob finding and blob parameters
  - `fiducials.ipynb`, extension to finding ArUco markers (QR-like codes) in an image
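The blob-finding notebook uses the toolbox's own machinery; as a plain-NumPy illustration of the underlying idea, here is a minimal 4-connected component labeller (the function name and the tiny test image are my own, not from the notebook):

```python
import numpy as np
from collections import deque

def label_blobs(binary):
    """Label 4-connected components of a binary image by BFS flood fill.
    Returns a label image (0 = background) and the number of blobs found."""
    labels = np.zeros(binary.shape, dtype=int)
    next_label = 0
    for seed in zip(*np.nonzero(binary)):
        if labels[seed]:
            continue                 # already absorbed into an earlier blob
        next_label += 1
        labels[seed] = next_label
        queue = deque([seed])
        while queue:
            r, c = queue.popleft()
            for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if (0 <= rr < binary.shape[0] and 0 <= cc < binary.shape[1]
                        and binary[rr, cc] and not labels[rr, cc]):
                    labels[rr, cc] = next_label
                    queue.append((rr, cc))
    return labels, next_label

img = np.array([[1, 1, 0, 0],
                [0, 1, 0, 1],
                [0, 0, 0, 1]])
labels, n = label_blobs(img)  # two separate blobs in this image
```

Once pixels are labelled, blob parameters such as area and centroid follow directly from the label image.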
This lecture introduces the process of image formation, how the 3D world is projected into a 2D image. Topics include central projection model, homographies, and camera calibration.
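As a minimal sketch of the central projection model, here it is in plain NumPy rather than the toolbox's camera classes; the focal length, pixel size, and principal point are assumed values for illustration:

```python
import numpy as np

# Central projection: a homogeneous 3D point P projects to image point p ~ K [R|t] P.
f = 8e-3            # focal length, 8 mm (assumed)
rho = 10e-6         # pixel size, 10 um (assumed)
u0, v0 = 320, 240   # principal point in pixels (assumed)

K = np.array([[f / rho, 0,       u0],   # intrinsic (camera) matrix
              [0,       f / rho, v0],
              [0,       0,       1]])

Rt = np.hstack([np.eye(3), np.zeros((3, 1))])  # camera at the world origin

P = np.array([0.1, 0.2, 2.0, 1.0])  # world point, homogeneous, 2 m in front
p_h = K @ Rt @ P                    # homogeneous image point
u, v = p_h[:2] / p_h[2]             # dehomogenize to pixel coordinates
```

Note that any point along the ray through the camera centre and P projects to the same pixel, which is why depth cannot be recovered from a single image.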
- Robotics, Vision & Control: Section 13.1
- Robot Academy video masterclasses (each is a collection of short videos, ~1h total run time)
- Jupyter/Python Notebooks:
  - `camera_animation.ipynb`, interactive animation of point projection for the central projection model
  - `camera.ipynb`
  - `homogeneous-coords.ipynb`, refresher on homogeneous coordinates including an interactive animation
  - `calibration2d.ipynb`, extension, calibrating a camera using a set of chequerboard images
  - `fiducials.ipynb`, extension, finding the pose and identity of ArUco markers (QR-like codes) in an image
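To illustrate what a homography does, here is a short plain-NumPy sketch; the 3x3 matrix is an arbitrary example, not taken from the notebooks:

```python
import numpy as np

# A planar homography H maps homogeneous points on one plane to another: q ~ H p.
H = np.array([[1.2,   0.1, 5.0],   # arbitrary example homography
              [0.0,   1.0, 2.0],
              [0.001, 0.0, 1.0]])

def apply_homography(H, pts):
    """Apply a 3x3 homography to an (N, 2) array of 2D points."""
    ph = np.hstack([pts, np.ones((len(pts), 1))])  # to homogeneous coordinates
    qh = ph @ H.T                                  # transform each point
    return qh[:, :2] / qh[:, 2:3]                  # dehomogenize

square = np.array([[0, 0], [10, 0], [10, 10], [0, 10]], dtype=float)
warped = apply_homography(H, square)  # the square maps to a general quadrilateral
```

Because the bottom row of H is not (0, 0, 1) here, the mapping is projective: parallel lines in the source plane are no longer parallel after warping.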
This lecture is concerned with the relationship between motion of a camera and the corresponding changes in the image. This can provide information about the 3D nature of the world, and can also be inverted to determine how a camera should move in order to create the desired change in an image.
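For a single point feature, the relationship described above is captured by the image Jacobian (interaction matrix), which maps camera spatial velocity to image-plane velocity. A plain-NumPy sketch in normalized image coordinates, using the standard visual-servoing form (the numeric values are made up for illustration):

```python
import numpy as np

def image_jacobian(u, v, Z, f=1.0):
    """Point-feature image Jacobian: maps camera spatial velocity
    (vx, vy, vz, wx, wy, wz) to image-plane velocity (udot, vdot),
    for a point at image coordinates (u, v) and depth Z."""
    return np.array([
        [-f / Z, 0,      u / Z, u * v / f,            -(f**2 + u**2) / f, v],
        [0,      -f / Z, v / Z, (f**2 + v**2) / f,    -u * v / f,         -u],
    ])

J = image_jacobian(0.1, 0.05, Z=2.0)
# Camera translating forward along its optical (z) axis at 0.1 m/s:
image_vel = J @ np.array([0, 0, 0.1, 0, 0, 0])
```

Forward motion makes the point flow radially outward from the principal point, and stacking the Jacobians of several features and inverting the system is the basis of image-based visual servoing.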
- Robotics, Vision & Control: Chapter 15
- Robot Academy video masterclasses (each is a collection of short videos, ~1h total run time)
- Jupyter/Python Notebooks:
  - `image-motion.ipynb`
  - `fiducials.ipynb`, extension, finding the pose and identity of ArUco markers (QR-like codes) in an image