fix double bos for vision model #840

Merged
merged 2 commits into main from fix_double_bos on Jan 14, 2025
Conversation

wukaixingxp (Contributor) commented Jan 14, 2025

What does this PR do?

This PR fixes the double-BOS-token issue for vision models: the BOS token is already included in the chat template, yet the processor adds it again. This affected both inference and fine-tuning.

Fixes Issue #826

Feature/Issue validation/testing

Please describe the tests that you ran to verify your changes and summarize the relevant results. Provide instructions so they can be reproduced.
Please also list any relevant details for your test configuration.

  • Inference Test
python recipes/quickstart/inference/local_inference/multi_modal_infer.py \
    --image_path ~/work/dog.jpg \
    --prompt_text "Describe this image" \
    --model_name "meta-llama/Llama-3.2-11B-Vision-Instruct"
Loading model: meta-llama/Llama-3.2-11B-Vision-Instruct
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 5/5 [00:04<00:00,  1.08it/s]
Input Prompt:
 <|begin_of_text|><|start_header_id|>user<|end_header_id|>

<|image|>Describe this image<|eot_id|><|start_header_id|>assistant<|end_header_id|>


Generated Text: This image features a small dog, likely a puppy, standing on a skateboard in the middle of a road or street. The dog's coat is predominantly brown and white, with distinctive black markings on its back and legs. Its floppy ears and dark eyes are prominent features.

The skateboard, which the dog is standing on, is black with red wheels. In the background, a blue door is visible, although out of focus. The overall atmosphere of the image suggests that the dog is being showcased in a humorous or playful manner, possibly as part of a meme or joke.<|eot_id|>
  • Finetuning test
[~/work/llama-recipes (fix_double_bos)]$ torchrun --nnodes 1 --nproc_per_node 4 \
    recipes/quickstart/finetuning/finetuning.py --enable_fsdp --lr 1e-5 --num_epochs 3 \
    --batch_size_training 2 --model_name meta-llama/Llama-3.2-11B-Vision-Instruct \
    --dist_checkpoint_root_folder ./finetuned_model --dist_checkpoint_folder fine-tuned \
    --use_fast_kernels --dataset "custom_dataset" --custom_dataset.test_split "test" \
    --custom_dataset.file "recipes/quickstart/finetuning/datasets/ocrvqa_dataset.py" \
    --run_validation True --batching_strategy padding --use_peft --peft_method lora
W0113 17:47:42.792000 140312926217216 torch/distributed/run.py:757]
W0113 17:47:42.792000 140312926217216 torch/distributed/run.py:757] *****************************************
W0113 17:47:42.792000 140312926217216 torch/distributed/run.py:757] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
W0113 17:47:42.792000 140312926217216 torch/distributed/run.py:757] *****************************************
Clearing GPU cache for all ranks
--> Running with torch dist debug set to detail
Loading checkpoint shards: 100%|████████████| 5/5 [00:00<00:00,  7.62it/s]
Loading checkpoint shards: 100%|████████████| 5/5 [00:00<00:00,  7.47it/s]
Loading checkpoint shards: 100%|████████████| 5/5 [00:00<00:00,  6.76it/s]
Loading checkpoint shards: 100%|████████████| 5/5 [00:01<00:00,  4.91it/s]
--> Model meta-llama/Llama-3.2-11B-Vision-Instruct

--> meta-llama/Llama-3.2-11B-Vision-Instruct has 10670.220835 Million params

trainable params: 5,898,240 || all params: 10,676,119,075 || trainable%: 0.0552470420998934
bFloat16 enabled for mixed precision - using bfSixteen policy
trainable params: 5,898,240 || all params: 10,676,119,075 || trainable%: 0.0552470420998934
trainable params: 5,898,240 || all params: 10,676,119,075 || trainable%: 0.0552470420998934
trainable params: 5,898,240 || all params: 10,676,119,075 || trainable%: 0.0552470420998934
--> applying fsdp activation checkpointing...
--> applying fsdp activation checkpointing...
--> applying fsdp activation checkpointing...
--> applying fsdp activation checkpointing...
--> Training Set Length = 1800
--> Validation Set Length = 200
length of dataset_train 1800
custom_data_collator is used
--> Num of Training Set Batches loaded = 225
length of dataset_train 1800
custom_data_collator is used
--> Num of Training Set Batches loaded = 225
length of dataset_train 1800
custom_data_collator is used
--> Num of Training Set Batches loaded = 225
length of dataset_train 1800
custom_data_collator is used
--> Num of Training Set Batches loaded = 225
--> Num of Validation Set Batches loaded = 50
--> Num of Validation Set Batches loaded = 50
Starting epoch 0/3
train_config.max_train_step: 0
--> Num of Validation Set Batches loaded = 50
--> Num of Validation Set Batches loaded = 50
Starting epoch 0/3
train_config.max_train_step: 0
/home/kaiwu/miniconda3/envs/llama/lib/python3.10/site-packages/torch/cuda/memory.py:330: FutureWarning: torch.cuda.reset_max_memory_allocated now calls torch.cuda.reset_peak_memory_stats, which resets /all/ peak memory stats.
  warnings.warn(
Training Epoch: 1:   0%|                          | 0/225 [00:00<?, ?it/s]--> Num of Validation Set Batches loaded = 50
--> Num of Validation Set Batches loaded = 50
Starting epoch 0/3
train_config.max_train_step: 0
/home/kaiwu/miniconda3/envs/llama/lib/python3.10/site-packages/torch/cuda/memory.py:330: FutureWarning: torch.cuda.reset_max_memory_allocated now calls torch.cuda.reset_peak_memory_stats, which resets /all/ peak memory stats.
  warnings.warn(
Training Epoch: 1:   0%|                          | 0/225 [00:00<?, ?it/s]--> Num of Validation Set Batches loaded = 50
--> Num of Validation Set Batches loaded = 50
Starting epoch 0/3
train_config.max_train_step: 0
/home/kaiwu/miniconda3/envs/llama/lib/python3.10/site-packages/torch/cuda/memory.py:330: FutureWarning: torch.cuda.reset_max_memory_allocated now calls torch.cuda.reset_peak_memory_stats, which resets /all/ peak memory stats.
  warnings.warn(
Training Epoch: 1:   0%|                          | 0/225 [00:00<?, ?it/s]/home/kaiwu/miniconda3/envs/llama/lib/python3.10/site-packages/torch/cuda/memory.py:330: FutureWarning: torch.cuda.reset_max_memory_allocated now calls torch.cuda.reset_peak_memory_stats, which resets /all/ peak memory stats.
  warnings.warn(
Training Epoch: 1:   0%|                          | 0/225 [00:00<?, ?it/s]NCCL version 2.20.5+cuda12.4
/home/kaiwu/miniconda3/envs/llama/lib/python3.10/site-packages/torch/utils/checkpoint.py:91: UserWarning: None of the inputs have requires_grad=True. Gradients will be None
  warnings.warn(
/home/kaiwu/miniconda3/envs/llama/lib/python3.10/site-packages/torch/utils/checkpoint.py:91: UserWarning: None of the inputs have requires_grad=True. Gradients will be None
  warnings.warn(
/home/kaiwu/miniconda3/envs/llama/lib/python3.10/site-packages/torch/utils/checkpoint.py:91: UserWarning: None of the inputs have requires_grad=True. Gradients will be None
  warnings.warn(
/home/kaiwu/miniconda3/envs/llama/lib/python3.10/site-packages/torch/utils/checkpoint.py:91: UserWarning: None of the inputs have requires_grad=True. Gradients will be None
  warnings.warn(
`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`.
`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`.
`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`.
`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`.
Training Epoch: 1/3, step 13/225 completed (loss: 0.854179859161377):

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

Thanks for contributing 🎉!

@wukaixingxp wukaixingxp marked this pull request as ready for review January 14, 2025 18:32
@wukaixingxp wukaixingxp requested a review from init27 January 14, 2025 18:40
@init27 init27 merged commit 9c3964e into main Jan 14, 2025
3 of 4 checks passed
@init27 init27 deleted the fix_double_bos branch January 14, 2025 18:48