
CPU/QLoRA-FineTuning #9406

Open · ernleite opened this issue Nov 9, 2023 · 17 comments


ernleite commented Nov 9, 2023

Hello,
I am trying to fine-tune a Llama 2 model.
[screenshot: fine-tuning console output]

The fine-tuning process was taking a very long time, so I had to cancel it: it was using only one core on my machine (Dell R730 with 2 CPUs / 56 logical cores).
I tried `accelerate config`, but it is not working.
Any ideas?
Thanks!

@jason-dai
Contributor

Do we need to source bigdl-llm-init for QLoRA? @qiyuangong @hzjane

@hzjane
Contributor

hzjane commented Nov 10, 2023

> Do we need to source bigdl-llm-init for QLoRA? @qiyuangong @hzjane

I think it's ok, I'll add it to the readme file.

@hzjane
Contributor

hzjane commented Nov 10, 2023

> Hello, I am trying to fine-tune a Llama 2 model. [screenshot]
>
> The fine-tuning process was taking a very long time, so I had to cancel it: it was using only one core on my machine (Dell R730 with 2 CPUs / 56 logical cores). I tried accelerate config, but it is not working. Any ideas? Thanks!

Maybe you can try `source bigdl-llm-init`, or use `taskset -c 0-27` to run on more cores.
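The two suggestions above can be combined into a sketch (hedged: the script name and flags are the ones quoted elsewhere in this thread, and the core range 0-27 assumes one socket of a two-socket, 28-physical-core machine):

```shell
# Load BigDL-LLM's recommended environment settings (OpenMP thread
# count, allocator, etc.), then pin the run to cores 0-27 so threads
# stay on a single socket.
source bigdl-llm-init
taskset -c 0-27 python ./qlora_finetuning_cpu.py \
    --repo-id-or-model-path llama-2-7b-hf \
    --dataset english_quotes
```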

@ernleite
Author

ernleite commented Nov 10, 2023

Thanks for your reply.
I already did that. It works, but when it starts "converting the current model to sym_int4 format", everything disappears and only one process remains.
Is my server (R730) compatible?

In fact, the python commands have never worked for me; only llm-convert, llm-cli, etc. do.
Very strange.
Thanks

@hzjane
Contributor

hzjane commented Nov 10, 2023

@ernleite
Author

I followed the configuration from the beginning.
Thanks

@glorysdj
Contributor

glorysdj commented Nov 10, 2023

> Thanks for your reply. I already did that. It works, but when it starts "converting the current model to sym_int4 format", everything disappears and only one process remains. Is my server (R730) compatible?
>
> In fact, the python commands have never worked for me; only llm-convert, llm-cli, etc. do. Very strange. Thanks

Hi @ernleite, what do you mean by "the python commands never worked"? Have you tried `taskset -c 0-27` to use more cores? Could you please share the commands you used to run this QLoRA fine-tuning? We will try to reproduce it.

@ernleite
Author

ernleite commented Nov 10, 2023

@glorysdj
I meant that all commands like `taskset -c 0-X python ./generate.py` or `python ./qlora_finetuning_cpu.py` do not work for me.
The only commands that work (using all cores on my machine) are llm-convert and llm-cli.

My configuration:
Dell R730 with 2 CPUs
96 GB RAM
Ubuntu 22.04 LTS

I would be so happy if this could work.

Here is an unresolved issue I reported a few weeks ago: [https://github.com//issues/8936]

Thanks!

@ernleite
Author

ernleite commented Nov 11, 2023

This screenshot shows that only one core is used at a given time (100%):
[screenshot: CPU usage]

@jason-dai
Contributor

> @glorysdj I meant that all commands like taskset -c 0-X python ./generate.py or qlora_finetuning_cpu.py do not work for me. The only commands that work (using all cores on my machine) are llm-convert and llm-cli.
>
> My configuration: Dell R730 with 2 CPUs, 96 GB RAM, Ubuntu 22.04 LTS
>
> I would be so happy if this could work.
>
> Here is an unresolved issue I reported a few weeks ago: [https://github.com//issues/8936]
>
> Thanks!

@ernleite, a quick question: are you able to run bigdl-llm using these python commands on your local PC (either Windows or Linux)?

@ernleite
Author

ernleite commented Nov 11, 2023

I have a laptop running on Windows 11. Let me try. I will let you know.

@ernleite
Author

ernleite commented Nov 12, 2023

@jason-dai I used my laptop.
The CPU version works fine on Windows 11 (even though it took several hours). A good step, then!
[screenshot: fine-tuning run on Windows]

I have two GPUs in my laptop, but I was not able to use my Intel Iris Xe with 16 GB.
I have an issue with the PyTorch library.
[screenshot: PyTorch DLL error]

I tried many configurations, but the QLoRA GPU version does not work. Are we sure it works with Python 3.9?
The DLL is present but does not seem to work; I don't know why. I also installed the latest Intel GPU drivers and oneAPI.

So my questions are: does the GPU version work on Windows? And what is the Windows equivalent of `source bigdl-llm-init`?

Thanks again

@Jasonzzt
Contributor

Jasonzzt commented Nov 13, 2023

> This screenshot shows that only one core is used at a given time (100%). [screenshot]

@ernleite Do you have a GPU on your machine? I tried to reproduce the issue and found that after "converting the current model to sym_int4 format", the fine-tuning program ran on the GPU.

So you can try disabling the GPU when you fine-tune on CPU, and make sure you use the CPU version of the bigdl-llm package.

Hope this can help you.
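One way to check which variant is installed (a hedged sketch: the package extras here are assumptions based on bigdl-llm's install conventions at the time, so verify them against its README):

```shell
# Show which bigdl-llm build is currently installed.
pip show bigdl-llm

# If the GPU (xpu) build is present, reinstall the CPU build;
# "[all]" is assumed here to be the CPU extra, "[xpu]" the GPU one.
pip uninstall -y bigdl-llm
pip install --pre --upgrade bigdl-llm[all]
```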

@jason-dai
Contributor

jason-dai commented Nov 13, 2023

> So my question is: does the GPU version work on Windows?

Currently it's not supported yet

@liang1wang

On my side, the process is blocked at 0% (>3h) on an MTL RVP when running qlora_finetuning_cpu.py.
cmd: python ./qlora_finetuning_cpu.py --repo-id-or-model-path llama-2-7b-hf --dataset english_quotes
env: MTL RVP, 8(e)+6(p) cores, 96 GB RAM, Ubuntu 22.04
I have run "source bigdl-llm-init -t".
Could you also help with that? Thanks!
[screenshots: console output and CPU usage]

@hzjane
Contributor

hzjane commented Nov 27, 2023

We fixed this issue (only one core being used) last week; it is related to this PR. When the CPU does not support bf16, QLoRA automatically uses only one core. You can run `lscpu | grep bf16` to check whether your CPU supports bf16 and whether that is the cause. You can also run with the latest qlora_finetuning_cpu.py.
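A small sketch of that check (hedged: the exact flag string reported by the kernel can vary; `avx512_bf16` is the usual CPU flag name, which the `bf16` grep matches):

```shell
# Check whether this CPU advertises bf16 support; without it, the
# pre-fix QLoRA code path fell back to a single core.
if lscpu 2>/dev/null | grep -q bf16; then
    echo "bf16 supported"
else
    echo "bf16 not supported"
fi
```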

@ernleite
Author

> We fixed this issue (only one core being used) last week; it is related to this PR. When the CPU does not support bf16, QLoRA automatically uses only one core. You can run lscpu | grep bf16 to check whether your CPU supports bf16 and whether that is the cause. You can also run with the latest qlora_finetuning_cpu.py.

Wow, amazing, thanks!
I can confirm that it works much better now.
For the moment, it only works on one CPU (I have 2), but maybe that is just a misconfiguration. I am deep-diving into that now.

6 participants