
CPU/QLoRA-FineTuning #9406

Open · ernleite opened this issue Nov 9, 2023 · 17 comments


ernleite commented Nov 9, 2023

Hello,
I am trying to fine-tune a Llama 2 model.
[screenshot: fine-tuning console output]

The fine-tuning process was taking a very long time, so I had to cancel it: it was using only one core on my machine (Dell R730 with 2 CPUs / 56 logical cores).
I tried `accelerate config`, but it is not working.
Any ideas?
Thanks!

@jason-dai
Contributor

Do we need to source bigdl-llm-init for QLoRA? @qiyuangong @hzjane

@hzjane
Contributor

hzjane commented Nov 10, 2023

> Do we need to source bigdl-llm-init for QLoRA? @qiyuangong @hzjane

I think it's ok, I'll add it to the readme file.

@hzjane
Contributor

hzjane commented Nov 10, 2023

> Hello, I am trying to fine-tune a Llama 2 model. [screenshot]
>
> The fine-tuning process was taking a very long time, so I had to cancel it: it was using only one core on my machine (Dell R730 with 2 CPUs / 56 logical cores). I tried accelerate config, but it is not working. Any ideas? Thanks!

Maybe you can try `source bigdl-llm-init`, or use `taskset -c 0-27` to run on more cores.
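The two suggestions above can be combined into a sketch (hedged: the script name and flags are the ones quoted elsewhere in this thread, and the core range 0-27 assumes one socket of a two-socket, 28-physical-core machine):

```shell
# Load BigDL-LLM's recommended environment settings (OpenMP thread
# count, allocator, etc.), then pin the run to cores 0-27 so threads
# stay on a single socket.
source bigdl-llm-init
taskset -c 0-27 python ./qlora_finetuning_cpu.py \
    --repo-id-or-model-path llama-2-7b-hf \
    --dataset english_quotes
```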

@ernleite
Author

ernleite commented Nov 10, 2023

Thanks for your reply.
I already did that. It works, but when it starts "converting the current model to sym_int4 format", everything disappears and only one process remains.
Is my server (R730) compatible?

In fact, the python commands have never worked for me; only llm-convert, llm-cli, etc. do.
Very strange.
Thanks

@hzjane
Contributor

hzjane commented Nov 10, 2023

@ernleite
Author

I followed the configuration from the beginning.
Thanks

@glorysdj
Contributor

glorysdj commented Nov 10, 2023

> Thanks for your reply. I already did that. It works, but when it starts "converting the current model to sym_int4 format", everything disappears and only one process remains. Is my server (R730) compatible?
>
> In fact, the python commands have never worked for me; only llm-convert, llm-cli, etc. do. Very strange. Thanks

Hi @ernleite, what do you mean by "the python commands never worked"? Have you tried `taskset -c 0-27` to use more cores? Could you please share the commands you used to run this QLoRA fine-tuning? We will try to reproduce it.

@ernleite
Author

ernleite commented Nov 10, 2023

@glorysdj
I meant that all commands like `taskset -c 0-X python ./generate.py` or `python ./qlora_finetuning_cpu.py` do not work for me.
The only commands that work (using all cores on my machine) are llm-convert and llm-cli.

My configuration:
Dell R730 with 2 CPUs
96 GB RAM
Ubuntu 22.04 LTS

I would be so happy if this could work.

Here is an unresolved issue I reported a few weeks ago: [https://github.com//issues/8936]

Thanks!

@ernleite
Author

ernleite commented Nov 11, 2023

This screenshot shows that only one core is used at a given time (100%):
[screenshot: CPU usage]

@jason-dai
Contributor

> @glorysdj I meant that all commands like taskset -c 0-X python ./generate.py or qlora_finetuning_cpu.py do not work for me. The only commands that work (using all cores on my machine) are llm-convert and llm-cli.
>
> My configuration: Dell R730 with 2 CPUs, 96 GB RAM, Ubuntu 22.04 LTS
>
> I would be so happy if this could work.
>
> Here is an unresolved issue I reported a few weeks ago: [https://github.com//issues/8936]
>
> Thanks!

@ernleite, a quick question: are you able to run bigdl-llm using these python commands on your local PC (either Windows or Linux)?

@ernleite
Author

ernleite commented Nov 11, 2023

I have a laptop running on Windows 11. Let me try. I will let you know.

@ernleite
Author

ernleite commented Nov 12, 2023

@jason-dai I used my laptop.
The CPU version works fine on Windows 11 (even though it took several hours). A good step, then!
[screenshot: fine-tuning run on Windows]

I have two GPUs in my laptop, but I was not able to use my Intel Iris Xe with 16 GB.
I have an issue with the PyTorch library.
[screenshot: PyTorch DLL error]

I tried many configurations, but the QLoRA GPU version does not work. Are we sure it works with Python 3.9?
The DLL is present but does not seem to work; I don't know why. I also installed the latest Intel GPU drivers and oneAPI.

So my questions are: does the GPU version work on Windows? And what is the Windows equivalent of `source bigdl-llm-init`?

Thanks again

@Jasonzzt
Contributor

Jasonzzt commented Nov 13, 2023

> This screenshot shows that only one core is used at a given time (100%). [screenshot]

@ernleite Do you have a GPU on your machine? I tried to reproduce the issue and found that after "converting the current model to sym_int4 format", the fine-tuning program ran on the GPU.

So you can try disabling the GPU when you fine-tune on CPU, and make sure you use the CPU version of the bigdl-llm package.

Hope this can help you.
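One way to check which variant is installed (a hedged sketch: the package extras here are assumptions based on bigdl-llm's install conventions at the time, so verify them against its README):

```shell
# Show which bigdl-llm build is currently installed.
pip show bigdl-llm

# If the GPU (xpu) build is present, reinstall the CPU build;
# "[all]" is assumed here to be the CPU extra, "[xpu]" the GPU one.
pip uninstall -y bigdl-llm
pip install --pre --upgrade bigdl-llm[all]
```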

@jason-dai
Contributor

jason-dai commented Nov 13, 2023

> So my question is: does the GPU version work on Windows?

Currently it's not supported yet

@liang1wang

On my side, the process is blocked at 0% (>3h) on an MTL RVP when running qlora_finetuning_cpu.py.
cmd: python ./qlora_finetuning_cpu.py --repo-id-or-model-path llama-2-7b-hf --dataset english_quotes
env: MTL RVP, 8(e)+6(p) cores, 96 GB RAM, Ubuntu 22.04
I have run "source bigdl-llm-init -t".
Could you also help with that? Thanks!
[screenshots: console output and CPU usage]

@hzjane
Contributor

hzjane commented Nov 27, 2023

We fixed this issue (only one core being used) last week; it is related to this PR. When the CPU does not support bf16, QLoRA automatically uses only one core. You can run `lscpu | grep bf16` to check whether your CPU supports bf16 and whether that is the cause. You can also run with the latest qlora_finetuning_cpu.py.
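A small sketch of that check (hedged: the exact flag string reported by the kernel can vary; `avx512_bf16` is the usual CPU flag name, which the `bf16` grep matches):

```shell
# Check whether this CPU advertises bf16 support; without it, the
# pre-fix QLoRA code path fell back to a single core.
if lscpu 2>/dev/null | grep -q bf16; then
    echo "bf16 supported"
else
    echo "bf16 not supported"
fi
```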

@ernleite
Author

> We fixed this issue (only one core being used) last week; it is related to this PR. When the CPU does not support bf16, QLoRA automatically uses only one core. You can run lscpu | grep bf16 to check whether your CPU supports bf16 and whether that is the cause. You can also run with the latest qlora_finetuning_cpu.py.

Wow, amazing, thanks!
I can confirm that it works much better now.
For the moment, it only works on one CPU (I have 2), but maybe that is just a misconfiguration. I am deep-diving into that now.

6 participants