Potential issue for continual forgetting #6
-
After executing the …
Replies: 14 comments
-
Thanks for your attention. Because we have conducted many ablation studies, the hyperparameters may not be the ones we used for our table. Which results do you want to reproduce? I can check the logs and give you some guidance.
-
Thank you for your prompt response! This is the script I ran:

```bash
export CUDA_VISIBLE_DEVICES=2
NUM_FIRST_CLS=80
PER_FORGET_CLS=$((100-$NUM_FIRST_CLS))
# GS-LoRA
for lr in 1e-2
do
    for beta in 0.15
    do
        python3 -u train/train_own_forget_cl.py -b 48 -w 0 -d casia100 -n VIT -e 100 \
            -head CosFace --outdir out_path/to/exps/CLGSLoRA/start${NUM_FIRST_CLS}forgetper${PER_FORGET_CLS}lr${lr}beta${beta} \
            --warmup-epochs 0 --lr $lr --num_workers 8 --lora_rank 8 --decay-epochs 100 \
            --vit_depth 6 --num_of_first_cls $NUM_FIRST_CLS --per_forget_cls $PER_FORGET_CLS \
            -r results/ViT-P8S8_casia100_cosface_s1-1200-150de-depth6/Backbone_VIT_Epoch_1110_Batch_82100_Time_2023-10-18-18-22_checkpoint.pth \
            --BND 110 --beta $beta --alpha 0.01 --min-lr 1e-5 --num_tasks 4 --wandb_group forget_cl_new \
            --cl_beta_list 0.15 0.15 0.15 0.15
    done
done
```

Without altering any other code, I wasn't able to achieve the results displayed in the table (all metrics came out at 74%). After scrutinizing the code, I discovered that there is no sparse-warmup code in train_own_forget_cl.py, unlike in train_own_forget.py, where it is present. Subsequently, I copied the relevant code snippets from train_own_forget.py to train_own_forget_cl.py and appended them.
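For anyone following along, here is a minimal sketch of what such a sparse-warmup schedule could look like, assuming it simply ramps the group-sparsity weight (the `--beta` term) from 0 to its target over the first few epochs. The function name, the linear ramp, and the warmup length are my assumptions, not necessarily what train_own_forget.py actually does:

```python
# Hypothetical sketch of a sparse-warmup schedule: linearly ramp the
# group-sparsity weight from 0 up to its target value over the first
# `warmup_epochs` epochs, then hold it constant. The linear ramp and the
# names used here are assumptions, not the repository's actual code.
def sparse_warmup_beta(epoch: int, target_beta: float, warmup_epochs: int) -> float:
    """Group-sparsity weight to apply at a given epoch."""
    if warmup_epochs <= 0 or epoch >= warmup_epochs:
        return target_beta
    return target_beta * (epoch + 1) / warmup_epochs


# Example usage inside a simplified training loop:
# for epoch in range(num_epochs):
#     beta = sparse_warmup_beta(epoch, target_beta=0.15, warmup_epochs=5)
#     loss = task_loss + beta * group_sparsity_loss(lora_params)
```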
-
I also have the same problem...
-
Thanks for your attention. I checked our experimental log in wandb and found that these were the parameters we used to get the results in the paper.
Here is our log.
-
Thank you for your continued responses!
-
By the way, have you finished the training process? It is normal that all the metrics (accf, accr, acco) are consistently at 74% at first. I will check the code later.
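For reference, each of these metrics is just top-1 accuracy evaluated on a different test split (e.g. the forgotten classes for accf and the retained classes for accr). A minimal sketch, where the helper and loader names are illustrative assumptions rather than the repository's own code:

```python
import torch

# Hypothetical per-split accuracy helper: accf, accr and acco would each be
# this function evaluated on a different test split (forgotten classes,
# retained classes, ...). The names here are illustrative assumptions.
@torch.no_grad()
def split_accuracy(model, loader, device="cuda"):
    """Top-1 accuracy of `model` over all batches in `loader`."""
    model.eval()
    correct, total = 0, 0
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        preds = model(images).argmax(dim=1)
        correct += (preds == labels).sum().item()
        total += labels.numel()
    return correct / max(total, 1)

# e.g. accf = split_accuracy(model, forget_loader)
#      accr = split_accuracy(model, retain_loader)
```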
-
I haven't finished yet, but currently, I'm at:
This output seems to be abnormal.
-
In fact, I'm wondering whether the code on GitHub may be different from the one you used.
-
I used the code from GitHub and got a result similar to yesterday's. If you cannot reproduce the results yet, I recommend using the two strategies or increasing the learning rate slightly.
-
Thanks for your continued replies! I implemented some code myself yesterday and got the correct result!
-
Based on your question, we plan to re-tune some hyperparameters to find a better selection and help others reproduce the results more easily.