Potential issue for continual forgetting #6
-
After executing the …
Replies: 14 comments
-
Thanks for your attention. Because we have conducted many ablation studies, the hyperparameters may not be the ones we used for our table. Which results do you want to reproduce? I can check the logs and give you some guidance.
-
Thank you for your prompt response! This is the script I ran:

```bash
export CUDA_VISIBLE_DEVICES=2
NUM_FIRST_CLS=80
PER_FORGET_CLS=$((100-$NUM_FIRST_CLS))
# GS-LoRA
for lr in 1e-2
do
    for beta in 0.15
    do
        python3 -u train/train_own_forget_cl.py -b 48 -w 0 -d casia100 -n VIT -e 100 \
            -head CosFace --outdir out_path/to/exps/CLGSLoRA/start${NUM_FIRST_CLS}forgetper${PER_FORGET_CLS}lr${lr}beta${beta} \
            --warmup-epochs 0 --lr $lr --num_workers 8 --lora_rank 8 --decay-epochs 100 \
            --vit_depth 6 --num_of_first_cls $NUM_FIRST_CLS --per_forget_cls $PER_FORGET_CLS \
            -r results/ViT-P8S8_casia100_cosface_s1-1200-150de-depth6/Backbone_VIT_Epoch_1110_Batch_82100_Time_2023-10-18-18-22_checkpoint.pth \
            --BND 110 --beta $beta --alpha 0.01 --min-lr 1e-5 --num_tasks 4 --wandb_group forget_cl_new \
            --cl_beta_list 0.15 0.15 0.15 0.15
    done
done
```

Without altering any other code, I wasn't able to achieve the results displayed in the table (all metrics came out at 74%). After scrutinizing the code, I discovered that there is no sparse-warmup code in train_own_forget_cl.py, unlike in train_own_forget.py, where it is present. Subsequently, I copied the relevant code snippets from train_own_forget.py to train_own_forget_cl.py and appended them.
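For anyone following along, here is a minimal sketch of what such a sparse-warmup schedule could look like, assuming it simply ramps the group-sparsity weight (the `--beta` term) from 0 to its target over the first few epochs. The function name, the linear ramp, and the warmup length are my assumptions, not necessarily what train_own_forget.py actually does:

```python
# Hypothetical sketch of a sparse-warmup schedule: linearly ramp the
# group-sparsity weight from 0 up to its target value over the first
# `warmup_epochs` epochs, then hold it constant. The linear ramp and the
# names used here are assumptions, not the repository's actual code.
def sparse_warmup_beta(epoch: int, target_beta: float, warmup_epochs: int) -> float:
    """Group-sparsity weight to apply at a given epoch."""
    if warmup_epochs <= 0 or epoch >= warmup_epochs:
        return target_beta
    return target_beta * (epoch + 1) / warmup_epochs


# Example usage inside a simplified training loop:
# for epoch in range(num_epochs):
#     beta = sparse_warmup_beta(epoch, target_beta=0.15, warmup_epochs=5)
#     loss = task_loss + beta * group_sparsity_loss(lora_params)
```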
-
I also have the same problem...
-
Thanks for your attention. I checked our experimental log in wandb and found that these were the parameters we used to get the results in the paper.
Here is our log.
-
Thank you for your continued responses!
-
By the way, have you finished the training process? It is normal that all the metrics (accf, accr, acco) are consistently at 74% at first. I will check the code later.
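For reference, each of these metrics is just top-1 accuracy evaluated on a different test split (e.g. the forgotten classes for accf and the retained classes for accr). A minimal sketch, where the helper and loader names are illustrative assumptions rather than the repository's own code:

```python
import torch

# Hypothetical per-split accuracy helper: accf, accr and acco would each be
# this function evaluated on a different test split (forgotten classes,
# retained classes, ...). The names here are illustrative assumptions.
@torch.no_grad()
def split_accuracy(model, loader, device="cuda"):
    """Top-1 accuracy of `model` over all batches in `loader`."""
    model.eval()
    correct, total = 0, 0
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        preds = model(images).argmax(dim=1)
        correct += (preds == labels).sum().item()
        total += labels.numel()
    return correct / max(total, 1)

# e.g. accf = split_accuracy(model, forget_loader)
#      accr = split_accuracy(model, retain_loader)
```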
-
I haven't finished yet, but currently, I'm at:
This output seems to be abnormal.
-
In fact, I'm wondering whether the code on GitHub may be different from the one you used.
-
I used the code from GitHub and got a result similar to yesterday's. If you cannot reproduce the results yet, I recommend using the two strategies or increasing the learning rate slightly.
-
Thanks for your continued replies! I implemented some code myself yesterday and got the correct result!
-
Based on your question, we plan to re-tune some hyperparameters to find a better selection and help others reproduce the results more easily.