Fix LoRA Fuse/Unfuse in Hybrid Engine #3563

sxjscience · 2023-05-18T00:48:08Z

Implements fuse_lora and unfuse_lora in InferenceContainers. This should fix #3543 ([BUG] RuntimeError: The size of tensor a (6144) must match the size of tensor b (8192) at non-singleton dimension 0).
Adds set_q_k_v in GPTNeo.
Uses bfloat16 in inference if it is enabled.
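
For readers unfamiliar with the operation, here is a minimal sketch of what fusing and unfusing a LoRA adapter means for a single linear weight. It is not the DeepSpeed implementation: the standalone functions, the `lora_a`/`lora_b` names, the matrix layout, and the `scaling` parameter are assumptions chosen for illustration, and plain Python lists stand in for tensors.

```python
# Illustrative sketch only; names and layout are assumptions, not DeepSpeed's API.

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add_scaled(w, delta, scale):
    """Return w + scale * delta, element-wise."""
    return [[wi + scale * di for wi, di in zip(wr, dr)] for wr, dr in zip(w, delta)]


def fuse_lora(weight, lora_b, lora_a, scaling):
    """Fold the low-rank update into the base weight: W + scaling * (B @ A)."""
    return add_scaled(weight, matmul(lora_b, lora_a), scaling)


def unfuse_lora(weight, lora_b, lora_a, scaling):
    """Undo fuse_lora by subtracting the same low-rank update."""
    return add_scaled(weight, matmul(lora_b, lora_a), -scaling)


weight = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]  # base weight, shape (out=2, in=3)
lora_b = [[1.0], [2.0]]                      # shape (out=2, rank=1)
lora_a = [[0.5, 0.0, -0.5]]                  # shape (rank=1, in=3)
scaling = 2.0

fused = fuse_lora(weight, lora_b, lora_a, scaling)
restored = unfuse_lora(fused, lora_b, lora_a, scaling)
```

The low-rank product must have the same shape as the weight it is added to; a shape mismatch at this step produces the kind of tensor-size error quoted above.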

sxjscience · 2023-05-18T22:01:51Z

@microsoft-github-policy-service agree

sxjscience · 2023-05-29T08:26:48Z

@awan-10 @jeffra the unit-test has passed. Would you be able to review the PR?

cmikeh2

This looks great! Thank you!

awan-10 · 2023-06-30T20:55:29Z

@sxjscience - Thank you so much for this detailed PR and investigation! Sorry for the delay in reviewing!

awan-10 · 2023-06-30T22:51:08Z

@sxjscience - I have put this PR to auto-merge when the tests pass but I see that the latest round of tests has one failing nv-inference test. Can you please take a look and resolve the issue?

sxjscience · 2023-06-30T22:57:52Z

Will take a look later today.

deepspeed/runtime/engine.py

sxjscience · 2023-07-03T22:57:35Z

@awan-10 Would you have time to take a look at the latest change? CI has been fixed.

sxjscience added 3 commits May 16, 2023 00:40

fix lora fuse unfuse in hybrid_engine

458acd9

fix name

53c378d

fix typo

be50556

sxjscience requested review from jeffra, tjruwase, RezaYazdaniAminabadi, mrwyattii, awan-10, cmikeh2 and arashb as code owners May 18, 2023 00:48

sxjscience and others added 7 commits May 17, 2023 18:27

Merge remote-tracking branch 'upstream/master' into fix_lora_hybrid_engine

c11fdf4

remove empty lines

75205db

Update gptj.py

8399fe1

add lora test-case + fix gptneo implementation

4184227

Merge branch 'master' into fix_lora_hybrid_engine

dd661da

try to fix format

c2fdb17

try to accelerate testcase by reducing max length

93a1131

awan-10 and others added 4 commits May 23, 2023 10:25

Merge branch 'master' into fix_lora_hybrid_engine

4bb0ccb

Merge branch 'master' into fix_lora_hybrid_engine

d343ed4

reduce test runtime

23fbb0c

Fix bloom / gpt-neox and add test for bloom

7fc00fe

sxjscience and others added 2 commits May 30, 2023 19:46

Merge branch 'master' into fix_lora_hybrid_engine

559625b

Merge branch 'master' into fix_lora_hybrid_engine

c041c62

cmikeh2 approved these changes Jun 30, 2023

awan-10 enabled auto-merge (squash) June 30, 2023 20:54

Merge branch 'master' into fix_lora_hybrid_engine

0c676a3

awan-10 added the merge-queue label Jun 30, 2023

fix CI + fix issue in engine

955483c

auto-merge was automatically disabled July 2, 2023 04:29
Head branch was pushed to by a user without write access

sxjscience commented Jul 2, 2023

deepspeed/runtime/engine.py

Merge branch 'master' into fix_lora_hybrid_engine

3a52c76

Merge branch 'master' into fix_lora_hybrid_engine

7a4cd76

tjruwase approved these changes Jul 5, 2023

tjruwase merged commit d81dfda into deepspeedai:master Jul 5, 2023

This was referenced Jul 5, 2023

Update test_he_lora.py (fix typo) #3881

Closed

"RuntimeError: The size of tensor a (5120) must match the size of tensor b (20480) at non-singleton dimension 0" in step3 deepspeedai/DeepSpeedExamples#622

Open
