Merge pull request #3450 from BUAADreamer/mllm
Add Multimodal LLM Finetuning
Showing 13 changed files with 230 additions and 38 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,71 @@ | ||
[ | ||
{ | ||
"messages": [ | ||
{ | ||
"content": "Who are they?<image>", | ||
"role": "user" | ||
}, | ||
{ | ||
"content": "They're Kane and Gretzka from Bayern Munich.", | ||
"role": "assistant" | ||
}, | ||
{ | ||
"content": "What are they doing?", | ||
"role": "user" | ||
}, | ||
{ | ||
"content": "They are celebrating on the soccer field", | ||
"role": "assistant" | ||
} | ||
], | ||
"images": [ | ||
"images/1.jpg" | ||
] | ||
}, | ||
{ | ||
"messages": [ | ||
{ | ||
"content": "Who is he?<image>", | ||
"role": "user" | ||
}, | ||
{ | ||
"content": "He's Thomas Muller from Bayern Munich.", | ||
"role": "assistant" | ||
}, | ||
{ | ||
"content": "Why is he on the ground?", | ||
"role": "user" | ||
}, | ||
{ | ||
"content": "Because he's sliding on his knees to celebrate.", | ||
"role": "assistant" | ||
} | ||
], | ||
"images": [ | ||
"images/2.jpg" | ||
] | ||
}, | ||
{ | ||
"messages": [ | ||
{ | ||
"content": "Please describe this image<image>", | ||
"role": "user" | ||
}, | ||
{ | ||
"content": "Chinese astronaut Gui Haichao is giving a speech.", | ||
"role": "assistant" | ||
}, | ||
{ | ||
"content": "What has he accomplished?", | ||
"role": "user" | ||
}, | ||
{ | ||
"content": "He was appointed to be a payload specialist on Shenzhou 16 mission in June 2022, thus becoming the first Chinese civilian of Group 3 in space on 30 May 2023. He is responsible for the on-orbit operation of space science experimental payloads.", | ||
"role": "assistant" | ||
} | ||
], | ||
"images": [ | ||
"images/3.jpg" | ||
] | ||
} | ||
] |
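The example records above share one shape: a "messages" list of alternating user and assistant turns, plus an "images" list of paths. A minimal sketch of a sanity check for that shape follows. The helper name `check_record` is hypothetical, and both rules it enforces (strict role alternation, one `<image>` placeholder per image path) are assumptions inferred from the three examples, not documented requirements of the loader.

```python
import json

IMAGE_TOKEN = "<image>"  # placeholder string used in the example records


def check_record(record: dict) -> list[str]:
    """Return a list of problems found in one dataset record (empty if none)."""
    problems = []
    messages = record.get("messages", [])
    images = record.get("images", [])

    if not messages:
        problems.append("no messages")

    # Assumption: turns alternate user, assistant, user, ... as in the examples.
    for i, msg in enumerate(messages):
        expected = "user" if i % 2 == 0 else "assistant"
        if msg.get("role") != expected:
            problems.append(
                f"message {i}: expected role {expected!r}, got {msg.get('role')!r}"
            )

    # Assumption: one <image> placeholder per entry in "images".
    n_tokens = sum(msg.get("content", "").count(IMAGE_TOKEN) for msg in messages)
    if n_tokens != len(images):
        problems.append(
            f"{n_tokens} image placeholder(s) but {len(images)} image path(s)"
        )
    return problems


# A shortened copy of the second record from the file above.
sample = json.loads("""
[{"messages": [
    {"content": "Who is he?<image>", "role": "user"},
    {"content": "He's Thomas Muller from Bayern Munich.", "role": "assistant"}],
  "images": ["images/2.jpg"]}]
""")

for idx, rec in enumerate(sample):
    print(idx, check_record(rec) or "ok")
```

Running a check like this over the whole file before training can surface a missing image path or a stray placeholder early, rather than partway through preprocessing.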
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,32 @@ | ||
#!/bin/bash | ||
 | ||
CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \ | ||
--stage sft_mm \ | ||
--do_train \ | ||
--model_name_or_path llava-hf/llava-1.5-7b-hf \ | ||
--dataset mllm_instruct_example \ | ||
--dataset_dir data \ | ||
--template default \ | ||
--finetuning_type lora \ | ||
--lora_target all \ | ||
--output_dir saves/llava-1.5-7b/lora/sft \ | ||
--overwrite_cache \ | ||
--overwrite_output_dir \ | ||
--cutoff_len 1024 \ | ||
--preprocessing_num_workers 16 \ | ||
--per_device_train_batch_size 3 \ | ||
--per_device_eval_batch_size 1 \ | ||
--gradient_accumulation_steps 1 \ | ||
--lr_scheduler_type cosine \ | ||
--logging_steps 1 \ | ||
--warmup_steps 20 \ | ||
--save_steps 100 \ | ||
--eval_steps 100 \ | ||
--evaluation_strategy steps \ | ||
--load_best_model_at_end \ | ||
--learning_rate 5e-5 \ | ||
--num_train_epochs 100 \ | ||
--max_samples 3000 \ | ||
--val_size 0.1 \ | ||
--plot_loss \ | ||
--bf16 |
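The script above is a flat list of flags, some with values and some bare switches. As a rough sketch, the same invocation can be assembled from a Python dictionary, which makes it easier to vary a single setting between runs. The helper `build_command` is hypothetical and not part of the repository. The flag names and values are copied from the script, only a subset is shown, and the `CUDA_VISIBLE_DEVICES=0` environment setting is left out.

```python
import shlex


def build_command(entry: str, options: dict) -> list[str]:
    """Turn an options dict into an argv list for `python <entry>`.

    True becomes a bare switch (--do_train); False and None are skipped;
    anything else is emitted as `--name value`.
    """
    cmd = ["python", entry]
    for name, value in options.items():
        flag = f"--{name}"
        if value is True:
            cmd.append(flag)
        elif value is False or value is None:
            continue
        else:
            cmd.extend([flag, str(value)])
    return cmd


# Subset of the flags from the shell script above.
options = {
    "stage": "sft_mm",
    "do_train": True,
    "model_name_or_path": "llava-hf/llava-1.5-7b-hf",
    "dataset": "mllm_instruct_example",
    "dataset_dir": "data",
    "template": "default",
    "finetuning_type": "lora",
    "lora_target": "all",
    "output_dir": "saves/llava-1.5-7b/lora/sft",
    "cutoff_len": 1024,
    "per_device_train_batch_size": 3,
    "learning_rate": "5e-5",
    "bf16": True,
}

cmd = build_command("src/train_bash.py", options)
print(shlex.join(cmd))  # shell-quoted form, suitable for logging
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the repository root. The sketch only prints the command so that it has no side effects.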