PyTorch Siamese neural network based on ResNet/RegNet,
with some tweaks to work with a dataset crawled from the internet.
It compares two images (mostly photos of people) and classifies them as similar or not (the same photo or not).
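
To make the idea concrete, here is a minimal sketch of a Siamese pair classifier. It is not the code from this repository: `SiameseNet` and `tiny_backbone` are hypothetical names, and the small convolutional encoder is only a stand-in for a ResNet/RegNet trunk so that the example runs without pretrained weights. Comparing the two embeddings by absolute difference is one common choice, assumed here for illustration.

```python
import torch
import torch.nn as nn


class SiameseNet(nn.Module):
    """Scores an image pair with one shared encoder (illustrative sketch)."""

    def __init__(self, backbone: nn.Module, embed_dim: int):
        super().__init__()
        self.backbone = backbone               # same weights encode both inputs
        self.head = nn.Linear(embed_dim, 1)    # feature difference -> one logit

    def forward(self, a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
        fa = self.backbone(a)
        fb = self.backbone(b)
        # The absolute difference is symmetric, so swapping a and b
        # yields the same logit.
        return self.head(torch.abs(fa - fb)).squeeze(1)


def tiny_backbone(embed_dim: int = 32) -> nn.Module:
    """Small stand-in encoder; a ResNet/RegNet trunk would take its place."""
    return nn.Sequential(
        nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1),
        nn.ReLU(),
        nn.AdaptiveAvgPool2d(1),   # pools to 1x1, so any input size works
        nn.Flatten(),
        nn.Linear(16, embed_dim),
    )


model = SiameseNet(tiny_backbone(32), embed_dim=32)
model.eval()
with torch.no_grad():
    x1 = torch.rand(4, 3, 64, 64)   # batch of 4 random "images"
    x2 = torch.rand(4, 3, 64, 64)
    logits = model(x1, x2)
    swapped = model(x2, x1)
    probs = torch.sigmoid(logits)   # probability that each pair is "similar"
```

For training, the logits could be fed to `nn.BCEWithLogitsLoss` with a target of 1 for similar pairs and 0 for different ones.
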
Labeling was done with active learning:
- First, manually label 100 examples
- Train the model on them (overfitting is acceptable at this stage)
- Use the model to get predictions on 1000+ more examples
- Take the 100 examples with the highest error and use them to tune the model
- Repeat until the dataset is ready
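
The selection step of the loop above can be sketched as follows. This is a hedged illustration, not the project's actual script: `select_hardest` is a hypothetical helper, and "error" is assumed to mean the absolute gap between the predicted probability and a reference label (for example a weak label from the crawl, or one confirmed during review).

```python
from typing import List, Sequence


def select_hardest(
    probs: Sequence[float],
    labels: Sequence[int],
    k: int = 100,
) -> List[int]:
    """Return indices of the k examples with the largest |prob - label|."""
    errors = [abs(p - y) for p, y in zip(probs, labels)]
    # Sort indices by error, largest first, and keep the top k.
    ranked = sorted(range(len(errors)), key=lambda i: errors[i], reverse=True)
    return ranked[:k]


# Toy round: four pairs, keep the two the model got most wrong.
probs = [0.9, 0.2, 0.6, 0.1]    # predicted probability of "similar"
labels = [1, 1, 0, 0]           # reference labels
hardest = select_hardest(probs, labels, k=2)
```

If no reference labels exist for the new batch, ranking by closeness of the prediction to 0.5 is a common substitute, although the steps above do not say which variant was used.
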
These images should be considered similar, even though they differ in size, color, or background.
These images should be considered different.
In other words, the network should pay attention to the main subject of the photo and disregard the background.
The goal is to implement a network that can detect this difference.
- Add data augmentation
- Track F1 score
- Optimize model to work on CPU
- Prepare model for inference
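
For the F1 item in the list above, a dependency-free sketch of the metric for binary similar/different labels might look like this. The function name `f1_score` and the convention of returning 0.0 when there are no true positives are assumptions made for the example.

```python
from typing import Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 for the positive class (1 = similar), computed from raw counts."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # precision or recall is zero or undefined (assumed convention)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


y_true = [1, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 1]   # e.g. sigmoid outputs thresholded at 0.5
f1 = f1_score(y_true, y_pred)
```

Logging this value once per validation pass is enough to follow how the balance between missed and spurious matches changes during training.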