
After fine-tuning, the model outputs repetitive phrases #89

Open
Jackyzjz opened this issue Sep 11, 2024 · 4 comments

@Jackyzjz

Thanks for your great work.

I am trying to fine-tune the VideoLLaMA2 model with my own data. However, after fine-tuning, the model starts to repeatedly output the same content. Could you help me solve this issue?
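
As a first check, it may be worth ruling out the decoding settings, since greedy decoding with no repetition constraints can loop even when the fine-tuned weights are fine. Below is a minimal sketch using standard Hugging Face `generate()` arguments; `model`, `tokenizer`, and `inputs` are placeholders for whatever your inference script already builds, and the parameter values are illustrative assumptions, not values taken from this repo:

```python
# Decoding settings that often reduce degenerate repetition.
# `model`, `tokenizer`, and `inputs` are assumed to come from your existing
# inference script; only the generation arguments matter here.
output_ids = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,            # pure greedy decoding tends to loop more easily
    temperature=0.7,
    top_p=0.9,
    repetition_penalty=1.2,    # >1.0 penalizes recently generated tokens
    no_repeat_ngram_size=3,    # forbid exact 3-gram repeats
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

If the repetition persists even with sampling and a repetition penalty, the problem is more likely in the fine-tuned weights or the data than in decoding.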

@thisurawz1

Can you share the inference script that you used to run inference with the fine-tuned LoRA weights?
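
For reference, a common generic pattern for inference with LoRA fine-tuned weights is to load the base checkpoint and attach (or merge) the adapter with `peft`. Here is a minimal sketch assuming plain Hugging Face / peft APIs and placeholder paths; VideoLLaMA2 ships its own model-loading utilities (which I believe also restore pieces such as the projector), so treat this only as an illustration of the LoRA-merging step, not the repo's actual inference script:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_path = "/path/to/VideoLLaMA2-7B"        # placeholder: base checkpoint
lora_path = "/path/to/finetune_output_dir"   # placeholder: LoRA adapter directory

tokenizer = AutoTokenizer.from_pretrained(base_path)
base_model = AutoModelForCausalLM.from_pretrained(
    base_path, torch_dtype=torch.bfloat16, device_map="auto"
)

# Attach the LoRA adapter, then fold it into the base weights so generation
# behaves like a fully merged checkpoint.
model = PeftModel.from_pretrained(base_model, lora_path)
model = model.merge_and_unload()
model.eval()
```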

@Jackyzjz
Author

I am performing LoRA fine-tuning based on VideoLLaMA2-7B, and the script is as follows:

```bash
#!/bin/bash
export NCCL_P2P_DISABLE="1"
export NCCL_IB_DISABLE="1"

# Environment Variables
ARG_WORLD_SIZE=${1:-1}
ARG_NPROC_PER_NODE=${2:-8}
ARG_MASTER_ADDR="127.0.0.1"
ARG_MASTER_PORT=16666
ARG_RANK=0

# Multiple conditions
if [ ! -n "$WORLD_SIZE" ] || [ ! -n "$NPROC_PER_NODE" ]; then
    WORLD_SIZE=$ARG_WORLD_SIZE
    NPROC_PER_NODE=$ARG_NPROC_PER_NODE
fi
if [ ! -n "$MASTER_ADDR" ] || [ ! -n "$MASTER_PORT" ] || [ ! -n "$RANK" ]; then
    MASTER_ADDR=$ARG_MASTER_ADDR
    MASTER_PORT=$ARG_MASTER_PORT
    RANK=$ARG_RANK
fi

echo "WORLD_SIZE: $WORLD_SIZE"
echo "NPROC_PER_NODE: $NPROC_PER_NODE"

# Training Arguments
GLOBAL_BATCH_SIZE=8
LOCAL_BATCH_SIZE=1
GRADIENT_ACCUMULATION_STEPS=$[$GLOBAL_BATCH_SIZE/($WORLD_SIZE*$NPROC_PER_NODE*$LOCAL_BATCH_SIZE)]

# Log Arguments
export TRANSFORMERS_OFFLINE=1
export WANDB_PROJECT=videollama2
RUN_NAME=new_dataset_lora
DATA_DIR=datasets
OUTP_DIR=/ssd/jacky

torchrun --nnodes $WORLD_SIZE \
    --nproc_per_node $NPROC_PER_NODE \
    --master_addr=$MASTER_ADDR \
    --master_port=$MASTER_PORT \
    --node_rank $RANK \
    videollama2/train_flash_attn.py \
    --lora_enable True --lora_r 128 --lora_alpha 256 --mm_projector_lr 2e-5 \
    --deepspeed scripts/zero2.json \
    --model_type videollama2 \
    --model_path /ssd/jacky/VideoLLaMA2-7B \
    --vision_tower /ssd/jacky/clip-vit-large-patch14-336 \
    --mm_projector_type stc_connector \
    --data_path ${DATA_DIR}/videollava_sft/image_train.json \
    --data_folder ${DATA_DIR}/videollava_sft/ \
    --mm_vision_select_layer -2 \
    --num_frames 8 \
    --bf16 True \
    --tf32 True \
    --fp16 False \
    --output_dir ${OUTP_DIR}/finetune_${RUN_NAME} \
    --num_train_epochs 5 \
    --per_device_train_batch_size $LOCAL_BATCH_SIZE \
    --per_device_eval_batch_size 4 \
    --gradient_accumulation_steps $GRADIENT_ACCUMULATION_STEPS \
    --evaluation_strategy "no" \
    --save_strategy "steps" \
    --save_steps 375 \
    --save_total_limit 99 \
    --learning_rate 2e-5 \
    --weight_decay 0. \
    --warmup_ratio 0.03 \
    --lr_scheduler_type "cosine" \
    --logging_steps 1 \
    --model_max_length 2048 \
    --gradient_checkpointing True \
    --dataloader_num_workers 4 \
    --report_to tensorboard \
    --run_name $RUN_NAME
```
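
One thing worth double-checking in this config: with the defaults above (WORLD_SIZE=1, NPROC_PER_NODE=8, LOCAL_BATCH_SIZE=1), GRADIENT_ACCUMULATION_STEPS works out to 8 / (1 * 8 * 1) = 1, so the effective global batch size is 8. Five epochs at a 2e-5 learning rate over a small SFT set can overfit, and overfitting is one common cause of degenerate, repetitive outputs; it may be worth evaluating earlier checkpoints (the script saves every 375 steps) to see whether the repetition only appears late in training.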

@LiangMeng89

> Thanks for your great work.
>
> I am trying to fine-tune the VideoLLaMA2 model with my own data. However, after fine-tuning, the model starts to repeatedly output the same content. Could you help me solve this issue?

I also have this problem. Did you solve it?

@LiangMeng89

> Thanks for your great work.
>
> I am trying to fine-tune the VideoLLaMA2 model with my own data. However, after fine-tuning, the model starts to repeatedly output the same content. Could you help me solve this issue?

Hello, I'm a PhD student at ZJU. I also use VideoLLaMA2 in my own research. We have created a WeChat group to discuss VideoLLaMA2 issues and help each other; would you like to join us? Please contact me: WeChat: LiangMeng19357260600, phone: +86 19357260600, e-mail: [email protected].
