Ported Image Processing Blogs #7

Open · wants to merge 3 commits into base: master
30 changes: 30 additions & 0 deletions _posts/2017-08-01-image-processing-1.md
@@ -0,0 +1,30 @@
---
layout: post
title: Image Processing Part 1
tags: image_processing
description: An introduction to image processing
---

-- [Rohit Bhaskar](https://github.com/rohitbhaskar)

<p align="center"><img src="/assets/posts/image-processing-1/image_1.webp"></p>

This first post will give you an introduction to IP, while the second and third will give you in-depth knowledge about the math and coding behind it.
Getting straight to the point, the three main types of image processing are digital, analog and optical. We will only be talking about digital image processing (DIP). DIP is a form of signal processing in which the input is an image and the output is either an image or some characteristics extracted from the image. The image is treated as a 2D signal. **Thus IP is a method of subjecting an image to certain processes so as to retrieve information from it.**
All the topics you may have heard of, such as image compression, image segmentation and computer vision, fall under IP. (Don’t worry if you haven’t heard of them 🙂 Because you will)
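The "image as a 2D signal" idea is easy to see in code. A minimal sketch (using NumPy; the tiny array below is a hypothetical stand-in for a real grayscale image):

```python
import numpy as np

# A grayscale image is just a 2-D array of pixel intensities (0-255).
image = np.array([
    [ 10,  20, 200],
    [ 30, 220, 240],
    [250,  40,  50],
], dtype=np.uint8)

# One of the simplest processes we can subject it to: thresholding,
# which retrieves a characteristic (bright vs. dark regions) from the image.
mask = image > 128
print(mask.sum())  # number of "bright" pixels -> 4
```

Every fancier operation (compression, segmentation, and so on) is, at bottom, arithmetic over arrays like this one.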
Why is IP so important?
Image Processing has very wide applications and impacts almost every other field of technology. IP can be used to analyse signals that we cannot see with the naked eye (such as IR spectroscopy and UV imaging). Other applications include:
* Image sharpening and restoration
* Medical field
* Remote sensing
* Transmission and encoding
* Machine/robot vision
* Color processing
* Pattern recognition
* Video processing
* Microscopic imaging

Basically, anywhere that you can work with an image (or a 2D signal) 🙂
Image processing is really awesome when you think about the stuff you can do with it! Get started on reading up on it “here” to see the very basic but awesome things that you can do with IP.

That’s all for this post! See you in the next one, where I tell you about the OpenCV library, coding for IP, and Computer Vision!
42 changes: 42 additions & 0 deletions _posts/2017-08-01-image-processing-2.md
@@ -0,0 +1,42 @@
---
layout: post
title: Image Processing Part 2
tags: image_processing
description: Introduction to coding aspect of Image Processing.
---

-- [Rohit Bhaskar](https://github.com/rohitbhaskar)

<p align="center"><img src="/assets/posts/image-processing-2/image_1.webp"></p>

This post will cover the coding aspect of Image Processing (computer science students should like this ;p): the platforms to use and how to get started.

Firstly, understand that IP and Computer Vision go hand in hand (if anything, CV is a subset of IP). The Matlab projects link in the last post should have given you an idea about the new realm computer vision opens up for you.
Now, [OpenCV (Open Source Computer Vision)](http://opencv.org/) is a library (available in both C++ and Python) of algorithms and functions that help in IP. The library has more than 2500 optimized algorithms. They can be used to detect and recognize faces, identify objects, classify human actions in videos, track camera movements, track moving objects, extract 3D models of objects, and much more!
The OpenCV library is used by many big companies (including Google and Microsoft 🙂) across the globe. Though it is written in C++, it has bindings in Python, Java and Matlab, and also has wrappers for Ruby, Perl and C#.



If you know C++, the best place to get started is the official website. See the installation procedures for Windows (Visual Studio) and Linux below.

<iframe width="480" height="320" src="https://www.youtube.com/embed/cgo0UitHfp8" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>


For Ubuntu 14.04, refer to the video below:

<iframe width="480" height="320" src="https://www.youtube.com/embed/DYTfwThePBw" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

On the official website, browse to the [tutorials page](http://docs.opencv.org/2.4/doc/tutorials/tutorials.html) and get started. A tip for learning fast:
*Step 1. Read one tutorial fully and understand all the components and syntax used.
Step 2. Implement the tutorial and refer back to the site if you have a doubt.
Step 3. After you finish the tutorials, pick up [this project](https://github.com/Param-Uttarwar/SimpleHandTracking-openCV) and try to do it yourself!*

The whole thing should take you about a week, but it will definitely be worth your time.



(You can also get started with Matlab, which is what I’ll be telling you about in the next post, when I start on the math involved 😉)




28 changes: 28 additions & 0 deletions _posts/2017-08-01-image-processing-3.md
@@ -0,0 +1,28 @@
---
layout: post
title: Image Processing Part 3
tags: image_processing computer_vision
description: Introduction to the math that is used in IP algorithms.
---

-- [Rohit Bhaskar](https://github.com/rohitbhaskar)

<p align="center"><img src="/assets/posts/image-processing-3/image_1.webp"></p>

**Math is everywhere!** Starting from the ‘paint’ software on your PC, all the way to ‘neural networks’ (I mentioned them on purpose ;p. I’ll have a detailed post on them later.)

Everything in IP is based on math too. Every function and every algorithm uses mathematical operations to achieve what it does. Even when you merge two images, an algorithm is using math to ‘split’ the images into pixels, ‘compare’ corresponding pixels of the two, and do a weighted ‘addition’! Understanding what goes on behind the scenes is just as important as knowing how to write code. (This is especially for those who thought ‘coding is all I need’.)
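That image-merging example can be written out directly. A sketch of the weighted addition in NumPy (OpenCV’s `cv2.addWeighted` does the same thing, with saturation handling built in):

```python
import numpy as np

def blend(img_a, img_b, alpha):
    """Weighted addition of two equal-sized images, pixel by pixel."""
    # The 'split' into pixels comes for free: each array element is a pixel.
    # 'Compare'/combine: weight every pixel of A by alpha and of B by (1 - alpha).
    merged = alpha * img_a.astype(float) + (1.0 - alpha) * img_b.astype(float)
    # Clip back into the valid 8-bit intensity range.
    return np.clip(merged, 0, 255).astype(np.uint8)

a = np.full((2, 2), 100, dtype=np.uint8)
b = np.full((2, 2), 200, dtype=np.uint8)
print(blend(a, b, 0.5))  # every pixel becomes 150
```

Three lines of arithmetic, and that is genuinely all the "magic" behind image blending.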
OpenCV’s website provides [reference documentation](http://docs.opencv.org/2.4/modules/refman.html) for all of its functions, which also explains how each function achieves what it does. This should get you started.

If you have Matlab installed, then install the Image Processing and Computer Vision toolboxes. Matlab has a lot of built-in functions for IP. View them in the Matlab help, or see here:

<iframe width="480" height="320" src="https://www.youtube.com/embed/w658E77PQ4s" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

Computer Vision specialists rely on in-depth mathematical knowledge to build their algorithms. Every algorithm uses math; very often the foundations are the wavelet transform and the Fourier transform.

If you’re really interested there are a lot of channels on YouTube that cover the math behind image processing.

<iframe width="480" height="320" src="https://www.youtube.com/embed/IcBzsP-fvPo" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

That’s all for the Image Processing posts. I hope they help you get started!

47 changes: 47 additions & 0 deletions _posts/2017-08-27-haar-cascades.md
@@ -0,0 +1,47 @@
---
layout: post
title: Haar Cascades Explained in 2 minutes!
tags: image_processing
description: Introduction to Haar Cascades.
---

-- [Apoorva Gokhale](https://github.com/apoorva-21)


<p align="center"><img src="/assets/posts/haar-cascades/haar_on_faces.jpg"></p>

Hello everyone! In my debut post on the SRA blog, I’ll give you a simplified intuition about Haar-like features and, subsequently, about Haar Cascades, a technique applied in Computer Vision for detecting objects in images.

Object detection is one of the most common tasks in Computer Vision. The object you want to track won’t always be delightfully coloured differently from its surroundings, lighting will vary, and so on. You may not even have a 3-channel image, just a 2-D grayscale matrix.

**We need something more robust than simple colour thresholds and template matching (which needs the object in the input image to be the same size as in the template). What if we could ‘teach’ our computer what features to look for? (yes, a machine-learning-ish tease)**

Enter Haar-like features. Chiefly of three types, these are at the crux of what we want to instruct our computer to look for in an image. This was the method adopted by Viola and Jones in their 2001 object detection framework, and it was the first nifty-yet-robust approach taken.

<p align="center"><img src="/assets/posts/haar-cascades/haar_features.jpg"></p>

We shall consider grayscale images here, since we’ve realised colour isn’t going to help, so why triple the size of our image? (rhetorical)

Now, we are going to take a 2-D matrix (imagine a window) like one of those shown above, and place it, part by part, over the image of the object that we want to detect later (imagine the ‘window’ placed over your object). Next, we sum over all the pixels of the object image that lie under the black part of the window, and separately sum over all pixels under the white part (for CNN junkies, yes, it’s like a convolutional kernel). Moving the window throughout the image, we compute these differences of sums. Using these features, the shape of the object can be decomposed into changes in pixel intensities, allowing us to detect it.


But there’s a catch. The base detection window is 24 x 24 pixels, and considering every position and scale of the feature rectangles within it adds up to over 160,000 rectangular features. That sounds quite heavy to compute, doesn’t it? To optimize this, Viola and Jones came up with what they call integral images, like so:

<p align="center"><img src="/assets/posts/haar-cascades/integral_sum.png"></p>

At a high level, the integral image lets the sum of the pixels in any rectangle be computed in constant time, which greatly reduces computation and hence speeds up detection. Moreover, the rectangular features can now be evaluated rapidly over various resolutions of the image, even faster than constructing the image pyramids used in other feature detection schemes.
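The integral-image trick is small enough to sketch in NumPy (an illustration of the idea, not OpenCV’s internal implementation; OpenCV exposes the same operation as `cv2.integral`). The sum of any rectangle comes from just four lookups:

```python
import numpy as np

img = np.arange(16, dtype=np.int64).reshape(4, 4)  # toy 4x4 "image"

# Integral image, padded with a leading row/column of zeros so that
# ii[r, c] = sum of img[:r, :c].
ii = np.zeros((5, 5), dtype=np.int64)
ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)

def rect_sum(r0, c0, r1, c1):
    """Sum of img[r0:r1, c0:c1] in O(1), via four lookups."""
    return ii[r1, c1] - ii[r0, c1] - ii[r1, c0] + ii[r0, c0]

print(rect_sum(1, 1, 3, 3))  # same as img[1:3, 1:3].sum() -> 30
```

Each black-minus-white Haar feature is then just a handful of these constant-time rectangle sums, no matter how big the rectangles are.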


Now that we can compute the 160,000 features, we need to put them into action and use them to detect objects in images! But slow down: isn’t it going to be tedious, computing 160,000 features for every image, even when the object isn’t in the image at all? **To prevent this wasted effort, a ‘cascade’ of weak classifiers** is trained: an order of evaluating and checking for features, along with the weightage or importance given to each, learned using an algorithm called AdaBoost (short for adaptive boosting). Only if a region scores significantly on one rectangular feature is it cascaded down to the next-in-line feature computation. This keeps redundant computation to a minimum. Fun fact: all 160,000 features are rarely ever evaluated unless the object is actually present in the image.


Okay, so all that is left for us to do is train the classifier by feeding it positive samples (images containing the object to be detected) and negative samples (images of the common surroundings, without the object). The weights and order of cascading get trained and are stored in serialized form (.xml files). Helper functions, such as OpenCV’s detectMultiScale, let us use the .xml files to search for and localize the object within a new input image that wasn’t in the training set.


*I hope that with this post, I was able to give you an idea of how Haar-like features and cascade classifiers work. Below are some useful links to help you implement and check out Haar Cascades.*



In case, like me, you get a kick out of seeing rectangles around faces :p :
* [OpenCV example showing Haar Cascades for face detection](http://docs.opencv.org/trunk/d7/d8b/tutorial_py_face_detection.html)
* [Robust Real-time Face Detection, by Paul Viola and Michael Jones (the 2001 paper)](http://www.vision.caltech.edu/html-files/EE148-2005-Spring/pprs/viola04ijcv.pdf)
Binary file added assets/posts/haar-cascades/haar_features.jpg
Binary file added assets/posts/haar-cascades/haar_on_faces.jpg
Binary file added assets/posts/haar-cascades/integral_sum.png
Binary file added assets/posts/image-processing-1/image_1.webp
Binary file added assets/posts/image-processing-2/image_1.webp
Binary file added assets/posts/image-processing-3/image_1.webp