Dataset & Dataloader

Face detection with RetinaFace or dlib; if no face is detected, fall back to a center crop. VERY IMPORTANT: the mask's position on the face determines incorrect vs. correct
- Mask detection model (MobileNet) -> face extraction
Make face mask dataset
- We probably need to collect images of people aged 60 and over from the internet. Let's use AutoCrawler
- Facial Landmark Reference 1, Facial Landmark Reference 2
- Collect images of older people and overlay masks on them based on facial landmarks | Reference
- Get a mask & face dataset, shift the mask downwards, and use the result as the incorrect-mask dataset
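Should we draw the facial landmarks directly onto the images as dots? The model would then receive the dots' RGB values and learn them as "body parts".
- Is there a way to mark only some of the facial landmarks when the face is not fully visible?: (mask present but only the lips visible) or (mask present but the nose visible) -> incorrect
- It would be better to give the nose, eye, and chin dots different colors. They would need to be unique colors that rarely appear in the dataset, though.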
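The visibility rule in the note above can be sketched as a small function. This is only an illustration: the function name, the boolean inputs, and the "no_mask" label are hypothetical, and how the visibility flags would be obtained from partial landmarks is left open, as in the note.

```python
def mask_label(mask_present: bool, nose_visible: bool, mouth_visible: bool) -> str:
    """Map landmark visibility to a mask-wearing label (hypothetical label names)."""
    if not mask_present:
        # Assumption: a face without any mask gets its own label.
        return "no_mask"
    # Rule from the note: a mask with the nose or the lips still visible is incorrect.
    if nose_visible or mouth_visible:
        return "incorrect"
    return "correct"
```

For example, `mask_label(True, True, False)` returns "incorrect" because the nose is exposed.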
Apply different types of transformations to age/gender/mask
Add ages 58 and 59 to the "60 years old and above" class
- It seems better to bin age as (Age <= 20, Age < 58, else).
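A minimal sketch of the age binning proposed above. The thresholds come from the note; the function name and the class ids 0/1/2 are assumptions made for illustration.

```python
def age_to_class(age: int) -> int:
    """Bin an age into three classes: Age <= 20, Age < 58, else."""
    if age <= 20:
        return 0
    if age < 58:
        return 1
    # Ages 58 and 59 fall into the same class as 60 and above.
    return 2
```

With these thresholds, `age_to_class(58)` and `age_to_class(60)` both return 2.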
K-fold cross-validation
TTA (test-time augmentation)
CutMix
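For the CutMix item, here is a rough sketch of the usual box sampling and label mixing, written with the standard library only. It is not the code used in this project; the function names are hypothetical, and in practice `lam` is typically drawn from a Beta distribution.

```python
import math
import random


def cutmix_box(width, height, lam, rng=random):
    """Sample a patch whose area is roughly (1 - lam) of the image."""
    cut_ratio = math.sqrt(1.0 - lam)
    cut_w = int(width * cut_ratio)
    cut_h = int(height * cut_ratio)
    # The patch centre is sampled uniformly; the box is clipped to the image.
    cx = rng.randrange(width)
    cy = rng.randrange(height)
    x1 = max(cx - cut_w // 2, 0)
    y1 = max(cy - cut_h // 2, 0)
    x2 = min(cx + cut_w // 2, width)
    y2 = min(cy + cut_h // 2, height)
    # Recompute lam from the clipped box so the label mix matches the pixels.
    lam_adjusted = 1.0 - (x2 - x1) * (y2 - y1) / (width * height)
    return (x1, y1, x2, y2), lam_adjusted


def mix_labels(onehot_a, onehot_b, lam):
    """Blend two one-hot label vectors with weight lam on the first image."""
    return [lam * a + (1.0 - lam) * b for a, b in zip(onehot_a, onehot_b)]
```

The pixels inside the box are copied from the second image, and the training target becomes `mix_labels(label_a, label_b, lam_adjusted)`.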

Training

Yeonju: use separate layers within a single model and combine them into a single output.
Use ArcFaceLoss
Use SGD or SGDP as the optimizer. SGD outperforms Adam. -> But AdamP outperforms SGDP
Early stopping on age (10 epochs is already too many) -> 10 epochs is okay for 18-class classification
Try a multi-model setup: ViT for age, EfficientNet for mask and gender
Apply class weights for mask (mask 5: incorrect 1: correct 1) to the CrossEntropyLoss criterion. Compare with Focal Loss.

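import torch.nn as nn
# class_weights: 1-D float tensor with one weight per class; x: logits of shape (N, C); y: class indices of shape (N,)
criterion_weighted = nn.CrossEntropyLoss(weight=class_weights, reduction='mean')
loss_weighted = criterion_weighted(x, y)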
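To make the comparison with Focal Loss concrete, the sketch below computes both losses in plain Python. It is an illustration, not this project's code: the function names are hypothetical, the focal loss is the unweighted form with gamma=2.0 as an assumed default, and the weighted loss uses the same weighted-mean normalization that nn.CrossEntropyLoss applies when `weight` is set and `reduction='mean'`.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_cross_entropy(batch_logits, targets, class_weights):
    """Weighted mean of per-sample losses: sum(w_y * nll) / sum(w_y)."""
    num = 0.0
    den = 0.0
    for logits, y in zip(batch_logits, targets):
        p = softmax(logits)[y]
        num += class_weights[y] * -math.log(p)
        den += class_weights[y]
    return num / den


def focal_loss(batch_logits, targets, gamma=2.0):
    """Mean of -(1 - p_t)^gamma * log(p_t); gamma=0 reduces to plain cross-entropy."""
    losses = []
    for logits, y in zip(batch_logits, targets):
        p = softmax(logits)[y]
        losses.append(-((1.0 - p) ** gamma) * math.log(p))
    return sum(losses) / len(losses)
```

Class weights rescale the loss per class regardless of how confident the model is, while the focal term down-weights samples the model already classifies confidently, so the two can behave differently on an imbalanced mask label.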

A multilabel task seems to mean a task with two or more correct answers. Instead of picking one of 18 classes, let's define three types and predict an answer for each. Studying multilabel tasks should help.
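One way to connect the three separate predictions to the 18-class label is an encode/decode pair like the sketch below. The factorization 3 (mask) x 2 (gender) x 3 (age) and the ordering `mask * 6 + gender * 3 + age` are assumptions for illustration; they should be checked against the actual label definition of the dataset.

```python
def encode_label(mask: int, gender: int, age: int) -> int:
    """Combine three sub-labels into one class id in [0, 18) (assumed ordering)."""
    return mask * 6 + gender * 3 + age


def decode_label(label: int):
    """Split a class id in [0, 18) back into (mask, gender, age)."""
    mask, rest = divmod(label, 6)
    gender, age = divmod(rest, 3)
    return mask, gender, age
```

Each of the three heads can then be trained and evaluated on its own, with `encode_label` applied only when a single 18-class prediction is required.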

Make Inference Function
Fix ResNet code
Apply ResNet code
Point the local dataset loader at the correct path format (input/data, not input/)
Update the local dataset loader to include jpeg and png formats
Change the code and clean the dataset on the Upstage server
Check class order vs. output prediction order -> fixed with class_to_idx
Make a dev environment on Colab
Should I set shuffle=True for the test data loader? -> No.
Start making from dataloader
- Getting F1 score
- Apply augmentation (imgaug) -> replaced with Albumentations
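For the F1 item, a dependency-free sketch of a macro-averaged F1 score is shown below. Macro averaging is an assumption here, since the note does not say which averaging was used, and the function name is hypothetical.

```python
def macro_f1(y_true, y_pred, num_classes):
    """Average the per-class F1 scores, giving every class equal weight."""
    scores = []
    for c in range(num_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # A class with no true and no predicted samples contributes 0 here.
        scores.append(2 * tp / denom if denom > 0 else 0.0)
    return sum(scores) / num_classes
```

Because every class counts equally, rare classes influence this score as much as frequent ones.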
Apply ResNeXt50 example 1
Apply EfficientNet
Apply ViT
- https://github.com/lukemelas/PyTorch-Pretrained-ViT
- https://github.com/lucidrains/vit-pytorch
Label 0 and 1 as classes, not integers; it is too time-consuming to figure out which is which
I tried to make the number of samples equal across all classes in the training set (sampling), but that turned out not to be the right approach.
There was no difference between 9:1 and 8:2 dataset splits.
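On the class-order check noted earlier in this list: torchvision's ImageFolder builds class_to_idx from the sorted class folder names, so folders named with numbers are ordered as strings, not as integers. The sketch below reproduces that ordering in plain Python; the folder names "0" to "17" are a hypothetical layout.

```python
# Hypothetical class folders named "0" ... "17".
class_names = [str(i) for i in range(18)]

# String sorting puts "10" before "2", which shifts the model's output indices.
sorted_names = sorted(class_names)
class_to_idx = {name: idx for idx, name in enumerate(sorted_names)}

# Invert the mapping to turn a predicted index back into the real label.
idx_to_label = {idx: int(name) for name, idx in class_to_idx.items()}


def to_label(pred_idx: int) -> int:
    """Convert a model output index into the original integer class label."""
    return idx_to_label[pred_idx]
```

Here `to_label(2)` returns 10, which is the kind of mismatch that inverting class_to_idx resolves.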