Fine-tuning Script #6

Open
TechxGenus opened this issue Mar 12, 2024 · 7 comments

@TechxGenus
Congratulations to DeepSeek on the wonderful work. I wonder if there is a script for fine-tuning DeepSeek-VL. Thanks!

@RERV (Collaborator) commented Mar 12, 2024

Hi, thank you for your interest. We are currently busy iterating on DeepSeek-VL. The community has already started supporting DeepSeek-VL (#10). Have fun!
https://github.com/modelscope/swift/blob/main/docs/source/Multi-Modal/deepseek-vl%E6%9C%80%E4%BD%B3%E5%AE%9E%E8%B7%B5.md

@SinanAkkoyun (Contributor) commented Mar 12, 2024

@RERV It seems that swift does not support fine-tuning the vision encoder (at least from my quick glance over the source code; I hope I'm wrong).

Given that you train DeepSeek-VL internally somehow, could you provide training code snippets so that the community can work on a fine-tuning script for the LLM and vision encoder?

@soloice commented Mar 13, 2024

> @RERV It seems that swift does not support fine-tuning the vision encoder (at least from my quick glance over the source code; I hope I'm wrong).
>
> Given that you train DeepSeek-VL internally somehow, could you provide training code snippets so that the community can work on a fine-tuning script for the LLM and vision encoder?

Internally we train DeepSeek-VL with hai-llm (as mentioned in the paper), which is a closed-source training framework. We do hope to open-source hai-llm someday, but that is a really big project, involving our training cluster configuration/management and other internal libraries. I'm afraid we don't have any bandwidth to clean up and open-source the hai-llm core code right now.

@SinanAkkoyun (Contributor) commented Mar 13, 2024

@soloice Hi, I see, thanks. Would it be possible to release just the backprop code for the vision encoder: no framework around it, no clustering, just a starting point for the community to build on?

@soloice commented Mar 13, 2024

> @soloice Hi, I see, thanks. Would it be possible to release just the backprop code for the vision encoder: no framework around it, no clustering, just a starting point for the community to build on?

Well, I can describe how to do this briefly. Basically, you don't need to write any backprop code, because torch takes care of everything. Just build the model, then setting the requires_grad attribute on the visual encoder's parameters will work:

# Unfreeze the vision encoder so autograd computes gradients for it
for p in visual_encoder.parameters():
    p.requires_grad = True

What you really need to care about is the distributed strategy. If you are using DDP, or 3D parallelism with TP=1, the code above is all you need. If you are using 3D parallelism with TP>1, you will need to average the visual encoder's gradients across all TP ranks with an NCCL call like dist.all_reduce(p.grad, group=tensor_parallel_group) for each of its parameters, to make sure all TP ranks end up with the same gradient. A sketch of that step follows.
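
To make that concrete, here is a minimal sketch of the averaging step, assuming the visual encoder is replicated (not sharded) across TP ranks and that a tensor_parallel_group process group has already been created; the helper name is illustrative, not code from hai-llm:

import torch.distributed as dist

def average_visual_encoder_grads(visual_encoder, tensor_parallel_group):
    # Call after loss.backward() and before optimizer.step().
    # Assumes the visual encoder is replicated (not sharded) on every TP rank.
    world_size = dist.get_world_size(group=tensor_parallel_group)
    for p in visual_encoder.parameters():
        if p.grad is None:
            continue
        # Sum the gradient over all TP ranks, then divide by the group size,
        # so every rank steps with the same averaged gradient.
        dist.all_reduce(p.grad, op=dist.ReduceOp.SUM, group=tensor_parallel_group)
        p.grad.div_(world_size)

With identical gradients (and identical optimizer states) on every TP rank, the replicated encoder weights stay in sync across ranks after each step.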

@SinanAkkoyun (Contributor) commented Mar 13, 2024

@soloice Thank you very much for the information!

Given the PyTorch grad setup, how would you go about training? In our use case we need to add a bit of grounding by implementing a cursor as an output.
What would that look like at a high and low level? Would training the whole LLM + vision encoder work like that, just providing cursor tokens in the dataset and letting it do the rest? Or does one have to train the vision encoder separately? Thank you for helping out.

@SinanAkkoyun (Contributor) commented Mar 13, 2024

modelscope/ms-swift#543

Jintao-Huang implemented it!
