
Self-adversarial training - data augmentation #5117

AlexeyAB opened this issue Mar 26, 2020 · 35 comments

@AlexeyAB
Owner

AlexeyAB commented Mar 26, 2020

I added Self-adversarial training.
How to use:

[net]
adversarial_lr=1
#attention=1  # just to show attention

Note for Classifier: it seems to make training unstable at a high learning rate, so you should first train the model as usual for 50 iterations, then add adversarial_lr=0.05 and continue training.
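
That two-stage recipe would look something like this on the command line (a sketch only; data/my.data, cfg/my_classifier.cfg, and the backup weights path are placeholders, not files from this repo):

```
# stage 1: train as usual (no adversarial_lr in the cfg)
./darknet classifier train data/my.data cfg/my_classifier.cfg

# stage 2: add adversarial_lr=0.05 under [net] in the cfg, then continue from the saved weights
./darknet classifier train data/my.data cfg/my_classifier.cfg backup/my_classifier.weights
```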

Explanation: If we run the attention algorithm or an adversarial-attack algorithm, we find that the network looks at only a few small areas of the object, since it considers them to be the most important. But the network often makes mistakes: these parts of the object are not actually the most important, or do not belong to the object at all, and the network does not notice other details of the object.
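
For reference, the "attention" shown here is essentially a gradient-based saliency map; a rough sketch of that idea in PyTorch, assuming a generic differentiable detector `model` rather than the Darknet attention code:

```python
import torch

def attention_map(model, image):
    """Gradient-based saliency: how much each input pixel affects the detector output."""
    image = image.clone().requires_grad_(True)
    score = model(image).sum()          # any scalar summary of the predictions
    score.backward()
    return image.grad.abs().sum(dim=1)  # aggregate over channels -> per-pixel heat map
```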

Our goal: to make the network take into account a larger area of the object.

A way to achieve the goal: during training, on every second iteration, the network conducts an adversarial attack on itself (a sketch of the idea follows this list):

  1. In the first forward-backward pass, the network modifies the input image instead of its weights: it tries to remove from the image all the details that relate to the objects, making itself believe that there is not a single object in the image.
  2. In the second forward-backward pass, the network trains its weights on this modified image, learning that the objects are still there, even though it no longer sees them.
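
Roughly, in PyTorch-like pseudocode (this is only a sketch of the two passes, not the Darknet implementation; `model`, `detection_loss`, and the "no objects" target `no_obj_targets` are assumptions provided by the caller):

```python
import torch

def self_adversarial_iteration(model, detection_loss, image, targets, no_obj_targets,
                               adversarial_lr=0.05):
    """One SAT iteration as two forward-backward passes (a sketch, not the Darknet code)."""
    # Pass 1: optimize the image, not the weights, towards "there are no objects here".
    image = image.clone().requires_grad_(True)
    loss_no_obj = detection_loss(model(image), no_obj_targets)
    loss_no_obj.backward()
    with torch.no_grad():
        # Descend on the image so the network stops seeing its own objects
        # (assumes pixel values in [0, 1]).
        adv_image = (image - adversarial_lr * image.grad).clamp(0, 1)

    # Pass 2: train the weights on the attacked image with the original labels.
    model.zero_grad()                        # drop weight gradients left over from pass 1
    loss = detection_loss(model(adv_image.detach()), targets)
    loss.backward()                          # optimizer.step() is done by the caller
    return adv_image, loss
```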

For example: default yolov3.cfg/weights

| Adversarial attack | Attention during training | Attention during training on the adversarial-attacked image |
|---|---|---|
| train an already trained model for 500 iterations, but optimize the input image instead of the weights (weights are frozen) #5105 | [net] adversarial_lr=0.05 attention=1 - the network sees the dog/bicycle/car | [net] adversarial_lr=0.05 attention=1 (image from the first column) - the network sees a cat here, without the dog/bicycle/car |
| image | image | image |

As you can see in the edited image (adversarial attack) in the 1st and 3rd columns, the network doesn't pay attention to the dog/bicycle/car, because the network thinks there are no dog/bicycle/car, and that there is a cat instead of a dog. So the network should be trained on this augmented image to pay attention to the more obvious details, since here you can clearly see the dog/bicycle/car.


Train https://github.com/AlexeyAB/darknet/blob/master/cfg/yolov3-tiny_3l.cfg on this small dataset for 10000 iterations: #3114 (comment)

|  | default model | [net] adversarial_lr=0.05 |
|---|---|---|
| 1st try | chart_simple_1 | chart_adversarial_with_burnin |
| 2nd try | chart_simple2 | chart_adversarial_without_burnin |
@AlexeyAB
Owner Author

@WongKinYiu You can try to train some small model, e.g. yolov3-tiny-prn.cfg, with [net] adversarial_lr=0.05 and compare the mAP.

@WongKinYiu
Collaborator

@AlexeyAB OK, I will also add this to the ablation-study waiting list.

@AlexeyAB
Owner Author

AlexeyAB commented Mar 27, 2020

@WongKinYiu I improved Self-adversarial training in the latest code: 4f62a01

So use the latest code and [net] adversarial_lr=1 in the cfg-file.

chart_adversarial_new_1

AlexeyAB changed the title from "Self-adversarial training" to "Self-adversarial training - data augmentation" on Mar 27, 2020
@WongKinYiu
Collaborator

@AlexeyAB OK, I will retrain the model.

@AlexeyAB
Owner Author

@WongKinYiu Which model are you training?

@WongKinYiu
Collaborator

@AlexeyAB yolov3-tiny-prn.

@AlexeyAB
Owner Author

@WongKinYiu Sorry, one more fix: 9a23447
Please run the training again.


Is training of CSResNext/Darknet + PANet + MISH currently in progress?

@WongKinYiu
Collaborator

@AlexeyAB
OK.
512x512: 42.3/64.2/45.8 - a little bit lower than w/o mish.

@AlexeyAB
Owner Author

@WongKinYiu

> 512x512: 42.3/64.2/45.8 - a little bit lower than w/o mish.

Is it CSResNeXt-PANet or Darknet-PANet?
Is it trained using the top model from https://github.com/WongKinYiu/CrossStagePartialNetworks/blob/master/imagenet/results.md ?
Is it trained with higher subdivisions (lower mini-batch) than without mish?

@WongKinYiu
Collaborator

@AlexeyAB

| Model (all with optimal setting) | Size | AP | AP50 | AP75 |
|---|---|---|---|---|
| CSPResNeXt50-PANet-SPP | 512×512 | 42.4 | 64.4 | 45.9 |
| CSPResNeXt50-PANet-SPP (better imagenet) | 512×512 | 42.3 | 64.3 | 45.7 |
| CSPResNeXt50-PANet-SPP (better imagenet+mish) | 512×512 | 42.3 | 64.2 | 45.8 |
| CSPDarknet53-PANet-SPP (better imagenet) | 512×512 | 42.4 | 64.5 | 46.0 |
| CSPDarknet53-PANet-SPP (better imagenet+mish) | 512×512 | 43.0 | 64.9 | 46.5 |

@AlexeyAB
Owner Author

@WongKinYiu Thanks!

  • Is CSPResNeXt-50 better imagenet == CutMix + Mosaic + Label Smoothing = 78.5% / 94.8% ?

  • What is the result of CSPDarknet53-PANet-SPP without better imagenet?

  • Does it mean that CutMix + Mosaic + Label Smoothing and/or mish worsen the results of CSPResNeXt but improve the results of CSPDarknet53?

@WongKinYiu
Collaborator

@AlexeyAB

  • Yes.

  • I did not train it.

  • I think for cspresnext50 all of the models get almost the same results,
    but for cspdarknet53, yes.

@AlexeyAB
Owner Author

@WongKinYiu

So at the moment it is unclear whether such features as CBN, Dropblock and Adversarial-training improve accuracy:

  • It seems that CBN works fine for Detector, but doesn't work for Classifier.
  • The Classifier with Dropblock was trained with broken weighted-[shortcut]-layers (without constraints/burnin_update/lrelu/softmax), so the results are inconclusive.
  • Adversarial-training is in progress

  1. How long does it take to train yolov3-tiny-prn with Adversarial-training?
    I have now added a display of the remaining training time:
    image

  2. You can try to train

    • csdarknet53-omega-mi.cfg.txt = (better imagenet+mish) + weighted-[shortcut]-multi-input-softmax

    • csdarknet53-omega-mi-db.cfg.txt = (better imagenet+mish) + weighted-[shortcut]-multi-input-softmax + dropblock (since weighted-[shortcut]-multi-input-softmax works very well with csresnext50, we will see whether dropblock really works well)

  3. Then you can try to train CSPDarknet53-PANet-SPP (better imagenet+mish) + CBN + maybe Adversarial-training with the best backbone of: csdarknet53-omega.cfg.txt / csdarknet53-omega-mi.cfg.txt / csdarknet53-omega-mi-db.cfg.txt

  4. Is the training of csresnext50morelayers-spp-asff-bifpn-rfb-db.cfg going well?

@WongKinYiu
Collaborator

@AlexeyAB

  1. 85~90.

  2. OK, I will get some free GPUs after 1 week.

  3. I will design the experiments according to the ablation studies.
    By the way, since the training process in Darknet is really slow, I may develop new methods in PyTorch first and, if they work, then move them to Darknet.

  4. Currently at 30k epochs.

@AlexeyAB
Owner Author

@WongKinYiu

It's just that most PyTorch models have accuracy that is noticeably lower than in Darknet, based on these tables:


I think these are the last 2 models that we can train on Darknet before reproducing them in PyTorch:

  1. Classifier - csdarknet53-omega-mi.cfg.txt
  2. Detector - CSPDarknet53-omega-mi-PANet-SPP (better imagenet+mish) + CBN + May be Adversarial-training + without-iou_thresh

Then we can use Darknet just for low-level optimizations (XNOR models, ...) or for new recurrent layers (modified GRU/LSTM/... layers), ...


@WongKinYiu
Collaborator

@AlexeyAB

I will modify some of the ultralytics code and examine the performance.
Since training in PyTorch takes about 1 week while Darknet takes more than 1 month, I think I can check faster whether new features are suitable for our models.
For example: anchor-free methods, instance segmentation, simultaneous detection and tracking...
Then I will move the new features which perform well into Darknet.

OK, will do these experiments as soon as possible.

@AlexeyAB
Owner Author

@WongKinYiu Hi,

Why did you use leaky instead of mish for the PANet-head of CSPDarknet53-PANet-SPP (better imagenet+mish)? Mish is used for the backbone and leaky for the PANet-head: #5117 (comment)

I think Mish can get better accuracy.

When you train CSPDarknet53-PANet-SPP (better imagenet+mish) + CBN batch_normalize=2 + maybe adversarial-training adversarial_lr=1 + maybe label_smooth_eps=0.1, based on csdarknet53-omega-mi.cfg.txt, try to train with Mish activation instead of leaky for the PANet-head too. #5117 (comment)

@WongKinYiu
Collaborator

OK, thanks.

@AlexeyAB
Owner Author

@WongKinYiu Also, I added the iou_thresh_kind= parameter to the [yolo] and [Gaussian_yolo] layers.
Now you can use it without changing the source code:

[yolo]
iou_thresh_kind=giou  # by default: iou
iou_thresh=0.213
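
For reference, based on the cfg comment above, iou_thresh_kind selects which overlap metric is used for the iou_thresh comparison (plain IoU by default). A minimal sketch of GIoU for two axis-aligned boxes (x1, y1, x2, y2), using the standard formula rather than the Darknet source:

```python
def giou(a, b):
    """Generalized IoU: IoU minus the fraction of the enclosing box not covered by the union."""
    # intersection
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    # smallest enclosing box
    cx1, cy1 = min(a[0], b[0]), min(a[1], b[1])
    cx2, cy2 = max(a[2], b[2]), max(a[3], b[3])
    c_area = (cx2 - cx1) * (cy2 - cy1)
    return inter / union - (c_area - union) / c_area
```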

@AlexeyAB
Owner Author

AlexeyAB commented Apr 6, 2020

@WongKinYiu Hi, have you finished training the yolov3-tiny-prn model with [net] adversarial_lr=1, and what result did you get?

@WongKinYiu
Collaborator

WongKinYiu commented Apr 6, 2020

@AlexeyAB

Training will finish in 10 min; currently the AP50 on the val data is 30.22%.

update:
with adversarial: 29.83% val AP50.
without: 32.78% val AP50.

update:
finetuned adversarial: 30.03% val AP50.

@AlexeyAB
Owner Author

AlexeyAB commented Apr 6, 2020

@WongKinYiu Thanks.

So Self-adversarial training decreases AP50 by ~3%, at least for this small model.


Have you checked CBN again on a small model like Tiny-PRN?

@AlexeyAB
Owner Author

AlexeyAB commented Apr 6, 2020

@WongKinYiu Can you share the cfg and weights file of yolov3-tiny-prn with Self-adversarial training?

@AlexeyAB
Owner Author

AlexeyAB commented Apr 6, 2020

The model that is trained with Self-adversarial training (data augmentation) is more robust to a self-adversarial attack and requires much larger image changes than the default model:

| adversarial-trained yolov3-tiny-prn-adversarial.weights | default yolov3-tiny-prn.weights |
|---|---|
| image | image |
| Click image to enlarge image | Click image to enlarge image |

@sisrfeng

> ...and requires much larger image changes...

I can see some noise on the left cat. Could you please explain the difference between these two cats? What does the non-attacked image look like?

@AlexeyAB
Owner Author

  • Left - how much noise is required to trick a neural network that uses self-adversarial training (you can tune the hyperparameters so that it will require even more noise)
  • Right - how much noise is required to trick a neural network that uses regular training

A non-attacked image looks the same, just without any noise.

@sisrfeng

Many thanks!
Since the left cat is under "default yolov3-tiny-prn.weights", I thought it was the one you explained as RIGHT.
On the right cat, the noise is mainly in this part, right?
image
On the non-attacked image, a cat rather than a person can be detected, right?

@AlexeyAB
Owner Author

The captions for the images were mixed up; I have fixed it.

@sleepfin

sleepfin commented May 15, 2020

@AlexeyAB Did you use adversarial_lr in YOLOv4 training?
I can't find any source code or cfg related to adversarial_lr.
Can anyone help me?

@dereyly

dereyly commented May 15, 2020

How can I visualize SAT while training?
For example, by writing images with the predicted boxes to a path?

@AlexeyAB
Owner Author

  • Set in the cfg-file:

    [net]
    adversarial_lr=1.0

  • Or un-comment this line in the source code:

    //show_image(im, "adversarial data augmentation");

  • Or run training with the -show_imgs flag at the end of the training command (example below).
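
A possible invocation with that flag (a sketch; the data file, cfg, and pre-trained weights here are placeholders, not files from this thread):

```
./darknet detector train data/obj.data cfg/yolov3-tiny_3l.cfg yolov3-tiny.conv.11 -show_imgs
```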

@JasonRuan5

@AlexeyAB
What value of adversarial_lr should be set for yolov4-tiny and yolov4-custom? Is it dataset-dependent, and how does the value affect self-adversarial training? Thanks!

@yelantf

yelantf commented Jun 16, 2021

In the first forward-backward pass, will the network try to add objects when there are no objects in the image?

@AlexeyAB
Owner Author

AlexeyAB commented Jun 16, 2021

@yelantf
No, it will only try to remove objects.

So in general, it is a good idea to also try adding objects to the image randomly (with the number and size of the objects set through configurable parameters).
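
A rough sketch of that idea (a hypothetical copy-paste augmentation, not something implemented in Darknet; `object_crops` is an assumed list of pre-cut patches with their class ids):

```python
import random

def paste_random_objects(image, labels, object_crops, max_objects=3):
    """Randomly paste pre-cut object crops (with their labels) into an HxWx3 image array."""
    h, w = image.shape[:2]
    for _ in range(random.randint(1, max_objects)):
        crop, cls = random.choice(object_crops)      # (HxWx3 patch, class id)
        ch, cw = crop.shape[:2]
        if ch >= h or cw >= w:
            continue                                 # skip crops larger than the image
        y = random.randint(0, h - ch)
        x = random.randint(0, w - cw)
        image[y:y + ch, x:x + cw] = crop             # naive paste, no blending
        labels.append((cls, x, y, x + cw, y + ch))   # new box in pixel coordinates
    return image, labels
```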

@KuoEuran

KuoEuran commented Sep 14, 2021

Hi all, does training with this mechanism improve the model's accuracy, or does it have some other advantages?
I am curious about it. Thanks!
