Hi, I use the following configuration for training, but when training epoch21, the accuracy has been 60.00 to 60.01 does not rise #257

leo23ui · 2025-01-13T04:13:23Z

Hi, I use the following configuration for training, but when training epoch21, the accuracy has been 60.00 to 60.01 does not rise， Is this because I did not use 8*512 batchsize for training? could you please give me some advice， thanks!!!!

export NNODES=1
export GPUS_PER_NODE=2
export WANDB__SERVICE_WAIT=60
export CUDA_VISIBLE_DEVICES=4,5

DISTRIBUTED_ARGS="--nproc_per_node $GPUS_PER_NODE --nnodes $NNODES "
torchrun $DISTRIBUTED_ARGS src/training/main2.py
--save-frequency 1
--report-to wandb
--train-data /home/gg/gg/MQBench-main/test/model/e1/split_2tar
--dataset-type webdataset
--imagenet-val ./ImageNet
--warmup 2000
--batch-size 2048
--epochs 25
--workers 16
--model TinyCLIP-ViT-39M-16-Text-19M
--name exp_name2
--seed 0
--local-loss
--grad-checkpointing
--output ./outputs/c
--lr 0.0001
--gather-with-grad
--pretrained-image-file ViT-B-16@openai
--pretrained-text-file ViT-B-16@openai
--distillation-teacher ViT-B-32@laion2b_e16
--norm_gradient_clip 5
--train-num-samples 15000000
--logit-scale 50

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Hi, I use the following configuration for training, but when training epoch21, the accuracy has been 60.00 to 60.01 does not rise #257

Hi, I use the following configuration for training, but when training epoch21, the accuracy has been 60.00 to 60.01 does not rise #257

leo23ui commented Jan 13, 2025

Hi, I use the following configuration for training, but when training epoch21, the accuracy has been 60.00 to 60.01 does not rise #257

Hi, I use the following configuration for training, but when training epoch21, the accuracy has been 60.00 to 60.01 does not rise #257

Comments

leo23ui commented Jan 13, 2025