diff --git a/task/recognition/face/README.md b/task/recognition/face/README.md
new file mode 100644
index 0000000000000..89f489941ba8e
--- /dev/null
+++ b/task/recognition/face/README.md
@@ -0,0 +1,134 @@
+# Face Recognition
+
+Face recognition is a large-scale classification task built on PLSC, and the
+goal is to implement and reproduce SOTA algorithms. PLSC can train tens of
+millions of identities with high throughput on a single server.
+
+Supported functions:
+* ArcFace
+* CosFace
+* PartialFC
+* SparseMomentum
+* FP16 training
+* DataParallel (backbone layer) + ModelParallel (FC layer) distributed training
+
+Supported backbones:
+* IResNet
+* FaceViT
+
+## Requirements
+PaddlePaddle 2.4 is required for some of the newer features. For installation
+instructions, refer to [installation.md](../../../tutorials/get_started/installation.md).
+
+## Data Preparation
+
+### Download Dataset
+
+Download the datasets from the insightface dataset pages:
+
+- [MS1MV2](https://github.com/deepinsight/insightface/tree/master/recognition/_datasets_#ms1m-arcface-85k-ids58m-images-57) (87k IDs, 5.8M images)
+- [MS1MV3](https://github.com/deepinsight/insightface/tree/master/recognition/_datasets_#ms1m-retinaface) (93k IDs, 5.2M images)
+- [Glint360K](https://github.com/deepinsight/insightface/tree/master/recognition/partial_fc#4-download) (360k IDs, 17.1M images)
+- [WebFace42M](https://github.com/deepinsight/insightface/blob/master/recognition/arcface_torch/docs/prepare_webface42m.md) (2M IDs, 42.5M images)
+
+Note:
+* MS1MV2: MS1M-ArcFace
+* MS1MV3: MS1M-RetinaFace
+* WebFace42M: cleaned WebFace260M
+
+### [Optional] Extract MXNet Dataset to Images
+```shell
+# for example, extract the MS1MV3 dataset
+python -m plsc.data.dataset.tools.mx_recordio_2_images --root_dir /path/to/ms1m-retinaface-t1/ --output_dir ./dataset/MS1M_v3/
+```
+
+### Extract LFW Style bin Dataset to Images
+```shell
+# for example, extract the agedb_30 bin to images
+python -m plsc.data.dataset.tools.lfw_style_bin_dataset_converter --bin_path ./dataset/MS1M_v3/agedb_30.bin --out_dir ./dataset/MS1M_v3/agedb_30 --flip_test
+```
+
+### Dataset Directory
+Put all data under the `./dataset/` directory; soft links are recommended, for example:
+```shell
+mkdir -p ./dataset/
+ln -s /path/to/MS1M_v3 ./dataset/MS1M_v3
+```
+
+## How to Train
+
+```bash
+# Note: when running on multiple nodes, set the
+# following environment variables and then run
+# the script on each node.
+export PADDLE_NNODES=1
+export PADDLE_MASTER="127.0.0.1:12538"
+export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
+
+python -m paddle.distributed.launch \
+    --nnodes=$PADDLE_NNODES \
+    --master=$PADDLE_MASTER \
+    --devices=$CUDA_VISIBLE_DEVICES \
+    plsc-train \
+    -c ./configs/IResNet50_MS1MV3_ArcFace_pfc10_1n8c_dp_mp_fp16o1.yaml
+```
+
+## How to Export
+
+```bash
+# In general, only the backbone needs to be exported,
+# so the export command can run on a single device.
+# FP16.level=O0 exports an FP32 model even when training used FP16;
+# Model.data_format=NCHW is required for IResNet when trained with NHWC.
+export PADDLE_NNODES=1
+export PADDLE_MASTER="127.0.0.1:12538"
+export CUDA_VISIBLE_DEVICES=0
+
+python -m paddle.distributed.launch \
+    --nnodes=$PADDLE_NNODES \
+    --master=$PADDLE_MASTER \
+    --devices=$CUDA_VISIBLE_DEVICES \
+    plsc-export \
+    -c ./configs/IResNet50_MS1MV3_ArcFace_pfc10_1n8c_dp_mp_fp16o1.yaml \
+    -o Global.pretrained_model=./output/IResNet50/latest \
+    -o FP16.level=O0 \
+    -o Model.data_format=NCHW
+```
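+
+After exporting, you can sanity-check the ONNX model with `onnxruntime`. A
+minimal sketch (the model path follows the evaluation example below; adjust
+it to your own export output):
+
+```python
+import numpy as np
+import onnxruntime
+
+# Load the exported backbone (the CPU provider is enough for a quick check).
+session = onnxruntime.InferenceSession(
+    "./output/IResNet50.onnx", providers=["CPUExecutionProvider"])
+input_name = session.get_inputs()[0].name
+
+# A dummy aligned face: NCHW, 112x112, normalized to [-1, 1]
+# (matching the NormalizeImage settings in the training config).
+img = np.random.rand(1, 3, 112, 112).astype(np.float32)
+img = (img - 0.5) / 0.5
+
+feat = session.run(None, {input_name: img})[0]
+print(feat.shape)  # expected (1, 512) for IResNet50
+```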
+
+## Evaluate on IJB-C
+```bash
+python onnx_ijbc.py \
+    --model-root ./output/IResNet50.onnx \
+    --image-path ./ijb/IJBC/ \
+    --target IJBC
+```
+
+## Model Zoo
+
+| Datasets | Backbone | Config | Devices | PFC | agedb30 | IJB-C(1E-4) | IJB-C(1E-5) | checkpoint | log |
+| :------: | :------- | ------ | ------- | --- | ------- | ----------- | :---------- | :--------- | --- |
+| MS1MV3 | Res50 | [config](./configs/IResNet50_MS1MV3_ArcFace_pfc10_1n8c_dp_mp_fp16o1.yaml) | N1C8*A100 | 1.0 | 0.9825 | 96.52 | 94.60 | [download](https://plsc.bj.bcebos.com/models/face/v2.4/IResNet50_MS1MV3_ArcFace_pfc10_1n8c_dp_mp_fp16o1.pdparams) | [download](https://plsc.bj.bcebos.com/models/face/v2.4/IResNet50_MS1MV3_ArcFace_pfc10_1n8c_dp_mp_fp16o1.log) |
+
+## Citations
+
+```
+@misc{plsc,
+  title={PLSC: An Easy-to-use and High-Performance Large Scale Classification Tool},
+  author={PLSC Contributors},
+  howpublished={\url{https://github.com/PaddlePaddle/PLSC}},
+  year={2022}
+}
+@inproceedings{deng2019arcface,
+  title={Arcface: Additive angular margin loss for deep face recognition},
+  author={Deng, Jiankang and Guo, Jia and Xue, Niannan and Zafeiriou, Stefanos},
+  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
+  pages={4690--4699},
+  year={2019}
+}
+@inproceedings{An_2022_CVPR,
+  author={An, Xiang and Deng, Jiankang and Guo, Jia and Feng, Ziyong and Zhu, XuHan and Yang, Jing and Liu, Tongliang},
+  title={Killing Two Birds With One Stone: Efficient and Robust Training of Face Recognition CNNs by Partial FC},
+  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
+  month={June},
+  year={2022},
+  pages={4042--4051}
+}
+```
diff --git a/task/recognition/face/configs/IResNet50_MS1MV3_ArcFace_pfc10_1n8c_dp_mp_fp16o1.yaml b/task/recognition/face/configs/IResNet50_MS1MV3_ArcFace_pfc10_1n8c_dp_mp_fp16o1.yaml
new file mode 100644
index 0000000000000..cd3ef84ca7890
--- /dev/null
+++ b/task/recognition/face/configs/IResNet50_MS1MV3_ArcFace_pfc10_1n8c_dp_mp_fp16o1.yaml
@@ -0,0 +1,126 @@
+# global configs
+Global:
+  task_type: recognition
+  train_epoch_func: defualt_train_one_epoch
+  eval_func: face_verification_eval
+  checkpoint: null
+  pretrained_model: null
+  output_dir: ./output/
+  device: gpu
+  save_interval: 1
+  max_num_latest_checkpoint: 0
+  eval_during_train: True
+  eval_interval: 2000
+  eval_unit: "step"
+  accum_steps: 1
+  epochs: 20
+  print_batch_step: 100
+  use_visualdl: True
+  seed: 2022
+
+# FP16 setting
+FP16:
+  level: O1
+  GradScaler:
+    init_loss_scaling: 27648.0
+
+DistributedStrategy:
+  data_parallel: True
+
+# model architecture
+Model:
+  name: IResNet50
+  num_features: 512
+  data_format: "NHWC"
+  class_num: 93431
+  pfc_config:
+    sample_ratio: 1.0
+    model_parallel: True
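+
+# Notes:
+# * pfc_config.sample_ratio is the fraction of class centers PartialFC uses
+#   in each iteration: 1.0 keeps the full softmax head, while e.g. 0.1
+#   samples 10% of the centers to save FC memory and compute.
+# * MarginLoss below uses the combined-margin convention
+#   logit = s * (cos(m1 * theta + m2) - m3), so (m1, m2, m3) = (1.0, 0.5, 0.0)
+#   is ArcFace and (1.0, 0.0, 0.35~0.4) recovers CosFace.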
+
+# loss function config for training/eval process
+Loss:
+  Train:
+    - MarginLoss:
+        m1: 1.0
+        m2: 0.5
+        m3: 0.0
+        s: 64.0
+        model_parallel: True
+        weight: 1.0
+
+LRScheduler:
+  name: Poly
+  learning_rate: 0.1
+  decay_unit: step
+  warmup_steps: 0
+
+Optimizer:
+  name: Momentum
+  momentum: 0.9
+  weight_decay: 5e-4
+  grad_clip:
+    name: ClipGradByGlobalNorm
+    clip_norm: 5.0
+    always_clip: True
+    no_clip_list: ['dist']
+
+# data loader for train and eval
+DataLoader:
+  Train:
+    dataset:
+      name: FaceIdentificationDataset
+      image_root: ./dataset/MS1M_v3/
+      cls_label_path: ./dataset/MS1M_v3/label.txt
+      transform_ops:
+        - DecodeImage:
+            to_rgb: True
+            channel_first: False
+        - RandFlipImage:
+            flip_code: 1
+        - NormalizeImage:
+            scale: 1.0/255.0
+            mean: [0.5, 0.5, 0.5]
+            std: [0.5, 0.5, 0.5]
+            order: ''
+        - ToCHWImage:
+    sampler:
+      name: DistributedBatchSampler
+      batch_size: 128
+      drop_last: False
+      shuffle: True
+    loader:
+      num_workers: 8
+      use_shared_memory: True
+
+  Eval:
+    dataset:
+      name: FaceVerificationDataset
+      image_root: ./dataset/MS1M_v3/agedb_30
+      cls_label_path: ./dataset/MS1M_v3/agedb_30/label.txt
+      transform_ops:
+        - DecodeImage:
+            to_rgb: True
+            channel_first: False
+        - NormalizeImage:
+            scale: 1.0/255.0
+            mean: [0.5, 0.5, 0.5]
+            std: [0.5, 0.5, 0.5]
+            order: ''
+        - ToCHWImage:
+    sampler:
+      name: BatchSampler
+      batch_size: 128
+      drop_last: False
+      shuffle: False
+    loader:
+      num_workers: 0
+      use_shared_memory: True
+
+Metric:
+  Eval:
+    - LFWAcc:
+        flip_test: True
+
+Export:
+  export_type: onnx
+  input_shape: [None, 3, 112, 112]
diff --git a/task/recognition/face/eval_ijbc.sh b/task/recognition/face/eval_ijbc.sh
new file mode 100644
index 0000000000000..137f994ed2d17
--- /dev/null
+++ b/task/recognition/face/eval_ijbc.sh
@@ -0,0 +1,18 @@
+# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+python onnx_ijbc.py \
+    --model-root ./output/IResNet50.onnx \
+    --image-path ./ijb/IJBC/ \
+    --target IJBC
diff --git a/task/recognition/face/export.sh b/task/recognition/face/export.sh
new file mode 100644
index 0000000000000..0b95254e0ecef
--- /dev/null
+++ b/task/recognition/face/export.sh
@@ -0,0 +1,26 @@
+# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
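+
+# Export only the backbone for inference. FP16.level=O0 exports an FP32 model
+# even when training used FP16; Model.data_format=NCHW is required for
+# IResNet when it was trained with NHWC.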
+
+export PADDLE_NNODES=1
+export PADDLE_MASTER="127.0.0.1:12538"
+export CUDA_VISIBLE_DEVICES=0
+python -m paddle.distributed.launch \
+    --nnodes=$PADDLE_NNODES \
+    --master=$PADDLE_MASTER \
+    --devices=$CUDA_VISIBLE_DEVICES \
+    plsc-export \
+    -c ./configs/IResNet50_MS1MV3_ArcFace_pfc10_1n8c_dp_mp_fp16o1.yaml \
+    -o Global.pretrained_model=output/IResNet50/latest \
+    -o FP16.level=O0 \
+    -o Model.data_format=NCHW
diff --git a/task/recognition/face/onnx_helper.py b/task/recognition/face/onnx_helper.py
new file mode 100644
index 0000000000000..a418a247042e8
--- /dev/null
+++ b/task/recognition/face/onnx_helper.py
@@ -0,0 +1,281 @@
+# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# code modified from: https://github.com/deepinsight/insightface/blob/master/recognition/arcface_torch/onnx_helper.py
+
+from __future__ import division
+import datetime
+import os
+import os.path as osp
+import glob
+import numpy as np
+import cv2
+import sys
+import onnxruntime
+import onnx
+import argparse
+from onnx import numpy_helper
+
+
+class ArcFaceORT:
+    def __init__(self, model_path, cpu=False):
+        self.model_path = model_path
+        self.model_dir = os.path.dirname(model_path)
+        # providers=None would use any available provider; with
+        # onnxruntime-gpu that is "CUDAExecutionProvider"
+        self.providers = ['CPUExecutionProvider'] if cpu else ['CUDAExecutionProvider']
+
+    # input_size is (w, h); returns an error message, or None on success
+    def check(self, track='cfat', test_img=None):
+        # default track is cfat
+        max_model_size_mb = 1024
+        max_feat_dim = 512
+        max_time_cost = 15
+        if track.startswith('ms1m'):
+            max_model_size_mb = 1024
+            max_feat_dim = 512
+            max_time_cost = 10
+        elif track.startswith('glint'):
+            max_model_size_mb = 1024
+            max_feat_dim = 1024
+            max_time_cost = 20
+        elif track.startswith('cfat'):
+            max_model_size_mb = 1024
+            max_feat_dim = 512
+            max_time_cost = 15
+        elif track.startswith('unconstrained'):
+            max_model_size_mb = 1024
+            max_feat_dim = 1024
+            max_time_cost = 30
+        else:
+            return "track not found"
+
+        if not os.path.exists(self.model_path):
+            return f"{self.model_path} does not exist"
+        if not os.path.isdir(self.model_dir):
+            return f"{self.model_dir} should be a directory"
+
+        print('use onnx-model:', self.model_path)
+        try:
+            session = onnxruntime.InferenceSession(
+                self.model_path, providers=self.providers)
+        except Exception:
+            return "load onnx failed"
+        input_cfg = session.get_inputs()[0]
+        input_shape = input_cfg.shape
+        print('input-shape:', input_shape)
+        if len(input_shape) != 4:
+            return "length of input_shape should be 4"
+        if not isinstance(input_shape[0], str):
+            # input_shape[0] should be symbolic (str) to support batch inference
+            print('reset input-shape[0] to None')
+            model = onnx.load(self.model_path)
+            model.graph.input[0].type.tensor_type.shape.dim[
+                0].dim_param = 'None'
+            new_model_path = osp.join(self.model_dir, 'zzzzrefined.onnx')
+            onnx.save(model, new_model_path)
+            self.model_path = new_model_path
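+            # Re-create the session from the rewritten model so that
+            # inference below uses the dynamic batch dimension.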
+            print('use new onnx-model:', self.model_path)
+            try:
+                session = onnxruntime.InferenceSession(
+                    self.model_path, providers=self.providers)
+            except Exception:
+                return "load onnx failed"
+            input_cfg = session.get_inputs()[0]
+            input_shape = input_cfg.shape
+            print('new-input-shape:', input_shape)
+
+        self.image_size = tuple(input_shape[2:4][::-1])
+        #print('image_size:', self.image_size)
+        input_name = input_cfg.name
+        outputs = session.get_outputs()
+        output_names = []
+        for o in outputs:
+            output_names.append(o.name)
+            #print(o.name, o.shape)
+        if len(output_names) != 1:
+            return "number of output nodes should be 1"
+        self.session = session
+        self.input_name = input_name
+        self.output_names = output_names
+        #print(self.output_names)
+        model = onnx.load(self.model_path)
+        graph = model.graph
+        if len(graph.node) < 8:
+            return "too small onnx graph"
+
+        input_size = (112, 112)
+        self.crop = None
+        if track == 'cfat':
+            crop_file = osp.join(self.model_dir, 'crop.txt')
+            if osp.exists(crop_file):
+                lines = open(crop_file, 'r').readlines()
+                if len(lines) != 6:
+                    return "crop.txt should contain 6 lines"
+                lines = [int(x) for x in lines]
+                self.crop = lines[:4]
+                input_size = tuple(lines[4:6])
+        if input_size != self.image_size:
+            return "input-size is inconsistent with onnx model input, %s vs %s" % (
+                input_size, self.image_size)
+
+        self.model_size_mb = os.path.getsize(self.model_path) / float(1024 * 1024)
+        if self.model_size_mb > max_model_size_mb:
+            return "max model size exceeded, given %.3f MB" % self.model_size_mb
+
+        input_mean = None
+        input_std = None
+        if track == 'cfat':
+            pn_file = osp.join(self.model_dir, 'pixel_norm.txt')
+            if osp.exists(pn_file):
+                lines = open(pn_file, 'r').readlines()
+                if len(lines) != 2:
+                    return "pixel_norm.txt should contain 2 lines"
+                input_mean = float(lines[0])
+                input_std = float(lines[1])
+        if input_mean is not None or input_std is not None:
+            if input_mean is None or input_std is None:
+                return "please set input_mean and input_std simultaneously"
+        else:
+            find_sub = False
+            find_mul = False
+            for nid, node in enumerate(graph.node[:8]):
+                print(nid, node.name)
+                if node.name.startswith('Sub') or node.name.startswith('_minus'):
+                    find_sub = True
+                if node.name.startswith('Mul') or node.name.startswith(
+                        '_mul') or node.name.startswith('Div'):
+                    find_mul = True
+            if find_sub and find_mul:
+                print("find sub and mul")
+                # mxnet arcface model
+                input_mean = 0.0
+                input_std = 1.0
+            else:
+                input_mean = 127.5
+                input_std = 127.5
+        self.input_mean = input_mean
+        self.input_std = input_std
+        for initn in graph.initializer:
+            weight_array = numpy_helper.to_array(initn)
+            dt = weight_array.dtype
+            if dt.itemsize < 4:
+                return 'invalid weight type - (%s:%s)' % (initn.name, dt.name)
+        assert test_img is not None
+        test_img = cv2.resize(test_img, self.image_size)
+        feat, cost = self.benchmark(test_img)
+        batch_result = self.check_batch(test_img)
+        batch_result_sum = float(np.sum(batch_result))
+        if not np.isfinite(batch_result_sum):
+            print(batch_result)
+            print(batch_result_sum)
+            return "batch result output contains NaN or Inf!"
+
+        if len(feat.shape) < 2:
+            return "the feature must be 2-D, but got shape {}".format(
+                str(feat.shape))
+
+        if feat.shape[1] > max_feat_dim:
+            return "max feat dim exceeded, given %d" % feat.shape[1]
+        self.feat_dim = feat.shape[1]
+        cost_ms = cost * 1000
+        if cost_ms > max_time_cost:
+            return "max time cost exceeded, given %.4f" % cost_ms
+        self.cost_ms = cost_ms
+        print(
+            'check stat: model-size-mb: %.4f, feat-dim: %d, time-cost-ms: %.4f, input-mean: %.3f, input-std: %.3f'
+            % (self.model_size_mb, self.feat_dim, self.cost_ms,
+               self.input_mean, self.input_std))
+        return None
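+
+    # Run 32 copies of `img` as a single batch to verify that batched
+    # inference produces finite outputs (used by check() above).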
+    def check_batch(self, img):
+        if not isinstance(img, list):
+            imgs = [img, ] * 32
+        else:
+            imgs = img
+        if self.crop is not None:
+            nimgs = []
+            for img in imgs:
+                nimg = img[self.crop[1]:self.crop[3],
+                           self.crop[0]:self.crop[2], :]
+                if nimg.shape[0] != self.image_size[1] or nimg.shape[
+                        1] != self.image_size[0]:
+                    nimg = cv2.resize(nimg, self.image_size)
+                nimgs.append(nimg)
+            imgs = nimgs
+        blob = cv2.dnn.blobFromImages(
+            images=imgs,
+            scalefactor=1.0 / self.input_std,
+            size=self.image_size,
+            mean=(self.input_mean, self.input_mean, self.input_mean),
+            swapRB=True)
+        net_out = self.session.run(self.output_names,
+                                   {self.input_name: blob})[0]
+        return net_out
+
+    def meta_info(self):
+        return {
+            'model-size-mb': self.model_size_mb,
+            'feature-dim': self.feat_dim,
+            'infer': self.cost_ms
+        }
+
+    def forward(self, imgs):
+        if not isinstance(imgs, list):
+            imgs = [imgs]
+        input_size = self.image_size
+        if self.crop is not None:
+            nimgs = []
+            for img in imgs:
+                nimg = img[self.crop[1]:self.crop[3],
+                           self.crop[0]:self.crop[2], :]
+                if nimg.shape[0] != input_size[1] or nimg.shape[
+                        1] != input_size[0]:
+                    nimg = cv2.resize(nimg, input_size)
+                nimgs.append(nimg)
+            imgs = nimgs
+        blob = cv2.dnn.blobFromImages(
+            imgs,
+            1.0 / self.input_std,
+            input_size, (self.input_mean, self.input_mean, self.input_mean),
+            swapRB=True)
+        net_out = self.session.run(self.output_names,
+                                   {self.input_name: blob})[0]
+        return net_out
+
+    def benchmark(self, img):
+        input_size = self.image_size
+        if self.crop is not None:
+            nimg = img[self.crop[1]:self.crop[3], self.crop[0]:self.crop[2], :]
+            if nimg.shape[0] != input_size[1] or nimg.shape[
+                    1] != input_size[0]:
+                nimg = cv2.resize(nimg, input_size)
+            img = nimg
+        blob = cv2.dnn.blobFromImage(
+            img,
+            1.0 / self.input_std,
+            input_size, (self.input_mean, self.input_mean, self.input_mean),
+            swapRB=True)
+        costs = []
+        for _ in range(50):
+            ta = datetime.datetime.now()
+            net_out = self.session.run(self.output_names,
+                                       {self.input_name: blob})[0]
+            tb = datetime.datetime.now()
+            cost = (tb - ta).total_seconds()
+            costs.append(cost)
+        costs = sorted(costs)
+        cost = costs[5]
+        return net_out, cost
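+
+
+if __name__ == '__main__':
+    # Minimal usage sketch: validate an exported model (path assumed to be
+    # ./output/IResNet50.onnx, as in eval_ijbc.sh) and print its meta info.
+    # Note: check() enforces a per-image latency budget, so a GPU provider
+    # (onnxruntime-gpu) is normally needed for it to pass.
+    m = ArcFaceORT(model_path='./output/IResNet50.onnx')
+    err = m.check(test_img=np.zeros((112, 112, 3), dtype=np.uint8))
+    if err is not None:
+        sys.exit(err)
+    print(m.meta_info())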
diff --git a/task/recognition/face/onnx_ijbc.py b/task/recognition/face/onnx_ijbc.py
new file mode 100644
index 0000000000000..163b10c983493
--- /dev/null
+++ b/task/recognition/face/onnx_ijbc.py
@@ -0,0 +1,313 @@
+# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# code modified from: https://github.com/deepinsight/insightface/blob/master/recognition/arcface_torch/onnx_ijbc.py
+
+import argparse
+import os
+import pickle
+import timeit
+
+import cv2
+import numpy as np
+import pandas as pd
+import prettytable
+import skimage.transform
+import paddle
+from sklearn.metrics import roc_curve
+from sklearn.preprocessing import normalize
+from onnx_helper import ArcFaceORT
+
+SRC = np.array(
+    [[30.2946, 51.6963], [65.5318, 51.5014], [48.0252, 71.7366],
+     [33.5493, 92.3655], [62.7299, 92.2041]],
+    dtype=np.float32)
+SRC[:, 0] += 8.0
+
+
+class AlignedDataSet(paddle.io.Dataset):
+    def __init__(self, root, lines, align=True):
+        self.lines = lines
+        self.root = root
+        self.align = align
+
+    def __len__(self):
+        return len(self.lines)
+
+    def __getitem__(self, idx):
+        each_line = self.lines[idx]
+        name_lmk_score = each_line.strip().split(' ')
+        name = os.path.join(self.root, name_lmk_score[0])
+        img = cv2.cvtColor(cv2.imread(name), cv2.COLOR_BGR2RGB)
+        landmark5 = np.array(
+            [float(x) for x in name_lmk_score[1:-1]],
+            dtype=np.float32).reshape((5, 2))
+        st = skimage.transform.SimilarityTransform()
+        st.estimate(landmark5, SRC)
+        img = cv2.warpAffine(
+            img, st.params[0:2, :], (112, 112), borderValue=0.0)
+        img_1 = np.expand_dims(img, 0)
+        img_2 = np.expand_dims(np.fliplr(img), 0)
+        output = np.concatenate((img_1, img_2), axis=0).astype(np.float32)
+        output = np.transpose(output, (0, 3, 1, 2))
+        return paddle.to_tensor(output)
+
+
+@paddle.no_grad()
+def extract(model_root, dataset):
+    model = ArcFaceORT(model_path=model_root)
+    test_img = np.zeros((112, 112, 3), dtype=np.uint8)
+    status = model.check(test_img=test_img)
+    if status is not None:
+        print(status)
+        exit(-1)
+    feat_mat = np.zeros(shape=(len(dataset), 2 * model.feat_dim))
+
+    def collate_fn(data):
+        return paddle.concat(data, axis=0)
+
+    data_loader = paddle.io.DataLoader(
+        dataset,
+        batch_size=128,
+        drop_last=False,
+        num_workers=4,
+        collate_fn=collate_fn)
+    num_iter = 0
+    for batch in data_loader:
+        batch = batch.numpy()
+        batch = (batch - model.input_mean) / model.input_std
+        feat = model.session.run(model.output_names,
+                                 {model.input_name: batch})[0]
+        feat = np.reshape(feat, (-1, model.feat_dim * 2))
+        feat_mat[128 * num_iter:128 * num_iter + feat.shape[0], :] = feat
+        num_iter += 1
+        if num_iter % 50 == 0:
+            print(num_iter)
+    return feat_mat
+
+
+def read_template_media_list(path):
+    ijb_meta = pd.read_csv(path, sep=' ', header=None).values
+    templates = ijb_meta[:, 1].astype(int)
+    medias = ijb_meta[:, 2].astype(int)
+    return templates, medias
+
+
+def read_template_pair_list(path):
+    pairs = pd.read_csv(path, sep=' ', header=None).values
+    t1 = pairs[:, 0].astype(int)
+    t2 = pairs[:, 1].astype(int)
+    label = pairs[:, 2].astype(int)
+    return t1, t2, label
+
+
+def read_image_feature(path):
+    with open(path, 'rb') as fid:
+        img_feats = pickle.load(fid)
+    return img_feats
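+
+
+# Aggregate per-image features into per-template features: features from the
+# same media (video) are averaged first, then media-level features are summed
+# within each template and L2-normalized.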
+def image2template_feature(img_feats=None, templates=None, medias=None):
+    unique_templates = np.unique(templates)
+    template_feats = np.zeros((len(unique_templates), img_feats.shape[1]))
+    for count_template, uqt in enumerate(unique_templates):
+        (ind_t, ) = np.where(templates == uqt)
+        face_norm_feats = img_feats[ind_t]
+        face_medias = medias[ind_t]
+        unique_medias, unique_media_counts = np.unique(
+            face_medias, return_counts=True)
+        media_norm_feats = []
+        for u, ct in zip(unique_medias, unique_media_counts):
+            (ind_m, ) = np.where(face_medias == u)
+            if ct == 1:
+                media_norm_feats += [face_norm_feats[ind_m]]
+            else:  # image features from the same video are aggregated into one feature
+                media_norm_feats += [
+                    np.mean(
+                        face_norm_feats[ind_m], axis=0, keepdims=True),
+                ]
+        media_norm_feats = np.array(media_norm_feats)
+        template_feats[count_template] = np.sum(media_norm_feats, axis=0)
+        if count_template % 2000 == 0:
+            print('Finish Calculating {} template features.'.format(
+                count_template))
+    template_norm_feats = normalize(template_feats)
+    return template_norm_feats, unique_templates
+
+
+def verification(template_norm_feats=None,
+                 unique_templates=None,
+                 p1=None,
+                 p2=None):
+    template2id = np.zeros((max(unique_templates) + 1, 1), dtype=int)
+    for count_template, uqt in enumerate(unique_templates):
+        template2id[uqt] = count_template
+    score = np.zeros((len(p1), ))
+    total_pairs = np.array(range(len(p1)))
+    batchsize = 100000
+    sublists = [
+        total_pairs[i:i + batchsize] for i in range(0, len(p1), batchsize)
+    ]
+    total_sublists = len(sublists)
+    for c, s in enumerate(sublists):
+        feat1 = template_norm_feats[template2id[p1[s]]]
+        feat2 = template_norm_feats[template2id[p2[s]]]
+        similarity_score = np.sum(feat1 * feat2, -1)
+        score[s] = similarity_score.flatten()
+        if c % 10 == 0:
+            print('Finish {}/{} pairs.'.format(c, total_sublists))
+    return score
+
+
+def verification2(template_norm_feats=None,
+                  unique_templates=None,
+                  p1=None,
+                  p2=None):
+    template2id = np.zeros((max(unique_templates) + 1, 1), dtype=int)
+    for count_template, uqt in enumerate(unique_templates):
+        template2id[uqt] = count_template
+    score = np.zeros((len(p1), ))  # save cosine distance between pairs
+    total_pairs = np.array(range(len(p1)))
+    batchsize = 100000  # small batchsize instead of all pairs in one batch due to the memory limitation
+    sublists = [
+        total_pairs[i:i + batchsize] for i in range(0, len(p1), batchsize)
+    ]
+    total_sublists = len(sublists)
+    for c, s in enumerate(sublists):
+        feat1 = template_norm_feats[template2id[p1[s]]]
+        feat2 = template_norm_feats[template2id[p2[s]]]
+        similarity_score = np.sum(feat1 * feat2, -1)
+        score[s] = similarity_score.flatten()
+        if c % 10 == 0:
+            print('Finish {}/{} pairs.'.format(c, total_sublists))
+    return score
+
+
+def main(args):
+    use_norm_score = True  # if True, TestMode(N1)
+    use_detector_score = True  # if True, TestMode(D1)
+    use_flip_test = True  # if True, TestMode(F1)
+    assert args.target == 'IJBC' or args.target == 'IJBB'
+
+    start = timeit.default_timer()
+    templates, medias = read_template_media_list(
+        os.path.join('%s/meta' % args.image_path, '%s_face_tid_mid.txt' %
+                     args.target.lower()))
+    stop = timeit.default_timer()
+    print('Time: %.2f s. ' % (stop - start))
+
+    start = timeit.default_timer()
+    p1, p2, label = read_template_pair_list(
+        os.path.join('%s/meta' % args.image_path, '%s_template_pair_label.txt'
+                     % args.target.lower()))
+    stop = timeit.default_timer()
+    print('Time: %.2f s. 
' % (stop - start)) + + start = timeit.default_timer() + img_path = '%s/loose_crop' % args.image_path + img_list_path = '%s/meta/%s_name_5pts_score.txt' % (args.image_path, + args.target.lower()) + img_list = open(img_list_path) + files = img_list.readlines() + dataset = AlignedDataSet(root=img_path, lines=files, align=True) + img_feats = extract(args.model_root, dataset) + + faceness_scores = [] + for each_line in files: + name_lmk_score = each_line.split() + faceness_scores.append(name_lmk_score[-1]) + faceness_scores = np.array(faceness_scores).astype(np.float32) + stop = timeit.default_timer() + print('Time: %.2f s. ' % (stop - start)) + print('Feature Shape: ({} , {}) .'.format(img_feats.shape[0], + img_feats.shape[1])) + start = timeit.default_timer() + + if use_flip_test: + img_input_feats = img_feats[:, 0:img_feats.shape[1] // + 2] + img_feats[:, img_feats.shape[1] // 2:] + else: + img_input_feats = img_feats[:, 0:img_feats.shape[1] // 2] + + if use_norm_score: + img_input_feats = img_input_feats + else: + img_input_feats = img_input_feats / np.sqrt( + np.sum(img_input_feats**2, -1, keepdims=True)) + + if use_detector_score: + print(img_input_feats.shape, faceness_scores.shape) + img_input_feats = img_input_feats * faceness_scores[:, np.newaxis] + else: + img_input_feats = img_input_feats + + template_norm_feats, unique_templates = image2template_feature( + img_input_feats, templates, medias) + stop = timeit.default_timer() + print('Time: %.2f s. ' % (stop - start)) + + start = timeit.default_timer() + score = verification(template_norm_feats, unique_templates, p1, p2) + stop = timeit.default_timer() + print('Time: %.2f s. ' % (stop - start)) + result_dir = args.result_dir + + save_path = os.path.join(result_dir, "{}_result".format(args.target)) + if not os.path.exists(save_path): + os.makedirs(save_path) + score_save_file = os.path.join(save_path, "{}.npy".format(args.target)) + np.save(score_save_file, score) + print(f'Save the result to {score_save_file}') + files = [score_save_file] + methods = [] + scores = [] + for file in files: + methods.append(os.path.basename(file)) + scores.append(np.load(file)) + methods = np.array(methods) + scores = dict(zip(methods, scores)) + x_labels = [10**-6, 10**-5, 10**-4, 10**-3, 10**-2, 10**-1] + tpr_fpr_table = prettytable.PrettyTable(['Methods'] + + [str(x) for x in x_labels]) + for method in methods: + fpr, tpr, _ = roc_curve(label, scores[method]) + fpr = np.flipud(fpr) + tpr = np.flipud(tpr) + tpr_fpr_row = [] + tpr_fpr_row.append("%s-%s" % (method, args.target)) + for fpr_iter in np.arange(len(x_labels)): + _, min_index = min( + list(zip(abs(fpr - x_labels[fpr_iter]), range(len(fpr))))) + tpr_fpr_row.append('%.2f' % (tpr[min_index] * 100)) + tpr_fpr_table.add_row(tpr_fpr_row) + print(tpr_fpr_table) + + +if __name__ == '__main__': + parser = argparse.ArgumentParser(description='do ijb test') + # general + parser.add_argument('--model-root', default='', help='path to load model.') + parser.add_argument( + '--image-path', + default='/train_tmp/IJB_release/IJBC', + type=str, + help='') + parser.add_argument( + '--result-dir', default='./output', help='path to save the results.') + parser.add_argument( + '--target', + default='IJBC', + type=str, + help='target, set to IJBC or IJBB') + main(parser.parse_args()) diff --git a/task/recognition/face/train.sh b/task/recognition/face/train.sh new file mode 100644 index 0000000000000..8527be9a8fb72 --- /dev/null +++ b/task/recognition/face/train.sh @@ -0,0 +1,33 @@ +# Copyright (c) 2022 PaddlePaddle 
Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +# for single card training +# CUDA_VISIBLE_DEVICES=0 +# plsc-train -c ./configs/IResNet50_MS1MV3_ArcFace_pfc10_1n8c_dp_mp_fp16o1.yaml + +# for multi-node and multi-cards training +# export PADDLE_NNODES=2 +# export PADDLE_MASTER="192.168.210.1:12538" +# export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 + +# for single-node and multi-cards training +export PADDLE_NNODES=1 +export PADDLE_MASTER="127.0.0.1:12538" +export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 +python -m paddle.distributed.launch \ + --nnodes=$PADDLE_NNODES \ + --master=$PADDLE_MASTER \ + --devices=$CUDA_VISIBLE_DEVICES \ + plsc-train \ + -c ./configs/IResNet50_MS1MV3_ArcFace_pfc10_1n8c_dp_mp_fp16o1.yaml