The goal of our project was to build a model that classifies images into five classes:
- Indoor Selfie
- Outdoor Selfie
- Indoor Pose
- Outdoor Pose
- Photos Without Human
The project was divided into three phases:
- Data collection and Dataset preparation
- Training models
- Model evaluation and deployment
To build a robust classifier, we prepared a custom dataset of 2,500 images, balanced across the five classes with 500 images each. Images belonging to different classes were saved in separate folders, one folder per class.
Selfie images were collected from a pre-made dataset available at https://www.crcv.ucf.edu/data/Selfie/ and then split into indoor and outdoor selfies. The dataset includes both regular and mirror selfies. The other images were collected from various web sources. The dataset is available upon request.
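A one-folder-per-class layout makes it easy to check that the dataset really is balanced. The snippet below is a minimal sketch of such a check, not part of the original project code; the function names, the accepted file extensions, and the folder name `dataset` in the usage note are assumptions made for illustration.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to whatever the dataset actually contains.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def count_images_per_class(dataset_root):
    """Return {class_folder_name: number_of_image_files} for each subfolder."""
    root = Path(dataset_root)
    counts = {}
    for class_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        counts[class_dir.name] = sum(
            1
            for f in class_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
        )
    return counts


def is_balanced(counts):
    """True when every class folder holds the same number of images."""
    return len(set(counts.values())) <= 1
```

Calling `count_images_per_class("dataset")` on a hypothetical `dataset` root should report 500 images for each of the five class folders if the layout matches the description above.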
Sample images from each class (five examples per class): Indoor Selfie, Outdoor Selfie, Indoor Pose, Outdoor Pose, Without Human.
Because our dataset is fairly small, training our own CNN from scratch did not give good results, so we decided to use transfer learning with pre-trained CNNs. To achieve better results, data augmentation was applied to the dataset.
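As an illustration of transfer learning combined with data augmentation, here is a minimal sketch. It assumes TensorFlow/Keras (the text does not name the framework) and a MobileNet backbone with frozen ImageNet weights. The specific augmentations, the 224x224 input size, the 80/20 split, the optimizer, and the helper names `build_transfer_model` and `load_split` are all assumptions, not the project's actual settings.

```python
def build_transfer_model(num_classes=5, input_shape=(224, 224, 3), weights="imagenet"):
    """Frozen MobileNet backbone with augmentation layers and a new softmax head."""
    # Imported inside the function so this file can be loaded without TensorFlow.
    import tensorflow as tf

    # Pre-trained backbone without its ImageNet classifier; kept frozen.
    base = tf.keras.applications.MobileNet(
        input_shape=input_shape, include_top=False, weights=weights, pooling="avg"
    )
    base.trainable = False

    inputs = tf.keras.Input(shape=input_shape)
    # Augmentation layers are active only in training mode (assumed transforms).
    x = tf.keras.layers.RandomFlip("horizontal")(inputs)
    x = tf.keras.layers.RandomRotation(0.1)(x)
    x = tf.keras.layers.RandomZoom(0.1)(x)
    # MobileNet expects pixel values scaled from [0, 255] to [-1, 1].
    x = tf.keras.layers.Rescaling(1.0 / 127.5, offset=-1.0)(x)
    x = base(x, training=False)
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


def load_split(data_dir, subset, image_size=(224, 224)):
    """Load the "training" or "validation" split from one-folder-per-class data."""
    import tensorflow as tf

    return tf.keras.utils.image_dataset_from_directory(
        data_dir,
        validation_split=0.2,  # assumed split, not stated in the project
        subset=subset,
        seed=42,  # same seed for both subsets so they do not overlap
        image_size=image_size,
        batch_size=32,
    )
```

Training would then be along the lines of `build_transfer_model().fit(load_split("dataset", "training"), validation_data=load_split("dataset", "validation"), epochs=100)`, where `dataset` is a hypothetical root folder. Swapping the backbone for another `tf.keras.applications` model also requires swapping the pixel scaling step, since each architecture has its own preprocessing.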
Validation accuracy results achieved with different pre-trained models:
Model | Epochs | Validation accuracy (%) |
---|---|---|
CNN from scratch | 100 | 51 |
MobileNet | 100 | 91 |
Inception V3 | 100 | 84 |
Xception | 100 | 90 |
ResNet50 | 100 | 28 |
ResNet101 | 50 | 24 |
VGG16 | 100 | 69 |
With a well-chosen pre-trained model, transfer learning gives very good results in a few lines of code, even with a small training set. The choice of model matters, though: our baseline CNN trained from scratch (51%) achieved a better result than two of the pre-trained models we tried, ResNet50 (28%) and ResNet101 (24%).
The best results on our dataset were achieved with transfer learning, using the pre-trained MobileNet model.
Overall validation accuracy: 91%
Relevant metrics for each class:
Class | Precision | Recall | F1-score | AUC |
---|---|---|---|---|
Selfie indoor | 0.95 | 0.83 | 0.89 | 0.98 |
Selfie outdoor | 0.88 | 0.94 | 0.91 | 0.99 |
Pose indoor | 0.85 | 0.92 | 0.88 | 0.98 |
Pose outdoor | 0.90 | 0.90 | 0.90 | 0.99 |
Without human | 0.98 | 0.96 | 0.97 | 1.00 |
Macro avg | 0.91 | 0.91 | 0.91 | 0.99 |
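The F1-scores and macro averages in the table follow directly from the per-class precision and recall: F1 is their harmonic mean, and the macro average is the unweighted mean over the five classes. The short sketch below recomputes them from the reported values. Because those values are already rounded to two decimals, the recomputation is approximate, although here it agrees with the table after rounding. The function names are illustrative.

```python
# Per-class (precision, recall) as reported in the table above.
REPORTED = {
    "Selfie indoor": (0.95, 0.83),
    "Selfie outdoor": (0.88, 0.94),
    "Pose indoor": (0.85, 0.92),
    "Pose outdoor": (0.90, 0.90),
    "Without human": (0.98, 0.96),
}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_average(values):
    """Unweighted mean: every class counts equally."""
    values = list(values)
    return sum(values) / len(values)


f1_by_class = {name: f1_score(p, r) for name, (p, r) in REPORTED.items()}
macro_precision = macro_average(p for p, _ in REPORTED.values())
macro_recall = macro_average(r for _, r in REPORTED.values())
macro_f1 = macro_average(f1_by_class.values())

for name, f1 in f1_by_class.items():
    print(f"{name}: F1 = {f1:.2f}")
print(f"Macro avg: P = {macro_precision:.2f}, R = {macro_recall:.2f}, F1 = {macro_f1:.2f}")
```

Since the dataset is balanced, the macro average and a support-weighted average would be nearly the same here.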
AUC scores of individual classes:
Finally, the model was tested on several images using a function that takes the URL of an image and outputs the probability of the image belonging to each class. Examples are given below:
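The project's own prediction function is not reproduced here. As a rough sketch of what a URL-to-probabilities function can look like, the code below downloads an image, resizes it, and maps the model output to class names. It assumes NumPy and Pillow are installed, that `model` exposes a Keras-style `predict` method and handles pixel scaling internally, and that the order of `CLASS_NAMES` matches the label order used in training. All names are hypothetical.

```python
import io
import urllib.request

# Assumed label order; it must match the order the model was trained with.
CLASS_NAMES = [
    "Indoor Selfie",
    "Outdoor Selfie",
    "Indoor Pose",
    "Outdoor Pose",
    "Without Human",
]


def fetch_image_batch(url, size=(224, 224)):
    """Download an image and return it as a float32 batch of shape (1, H, W, 3)."""
    # Third-party imports are local so the rest of the module loads without them.
    import numpy as np
    from PIL import Image

    with urllib.request.urlopen(url, timeout=10) as response:
        image = Image.open(io.BytesIO(response.read())).convert("RGB")
    image = image.resize(size)
    return np.expand_dims(np.asarray(image, dtype="float32"), axis=0)


def classify_url(url, model, fetch=fetch_image_batch, class_names=CLASS_NAMES):
    """Return {class_name: probability} for the image at `url`."""
    batch = fetch(url)
    probabilities = model.predict(batch)[0]
    return {name: float(p) for name, p in zip(class_names, probabilities)}
```

The `fetch` parameter is there so the download step can be replaced, for example with a loader for local files or a stub in tests. The most likely class is then `max(result, key=result.get)`.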
This project was delivered as a final assignment for a Data Science course at the Brainster Data Science Academy.
Team members:
Big thanks go to:
- Project supervisor: Igor Trpevski
- Brainster Instructors: Blagoj Kostovski, Viktoria Doneva, Filip Nikolovski, Marko Karbevski, Viktor Domazetoski