Fork main #1523
Closed
Feature #182
Because I need to use baichuan2-13B with more than one LoRA adapter at the same time, I tried to implement this feature myself. It works well for my use case now, and the feature was requested in #182. Comments are welcome, and I'll do my best to address them.
Add Features
I use peft to implement the multi-LoRA adapters. Because we want to serve more than one LoRA adapter, we cannot merge the LoRA weights into the base model, so there is extra computation that increases latency. If you only need a single LoRA adapter, simply do not enable this feature. I am still working on a more efficient implementation of multiple LoRA adapters within a single batch.
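To make the latency point concrete, here is a minimal sketch in plain PyTorch of the extra computation an unmerged LoRA adapter adds to a linear layer. The class name LinearWithLoRA, the rank/alpha defaults, and the wrapper structure are illustrative assumptions, not the actual implementation in this PR.

```python
import torch
import torch.nn as nn


class LinearWithLoRA(nn.Module):
    """Illustrative only: a linear layer with an unmerged LoRA adapter."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base                                               # frozen base weight W
        self.lora_a = nn.Linear(base.in_features, rank, bias=False)    # A
        self.lora_b = nn.Linear(rank, base.out_features, bias=False)   # B
        self.scaling = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Base path: x @ W^T -- the only matmul needed once weights are merged.
        out = self.base(x)
        # Extra LoRA path: (x @ A^T) @ B^T, scaled. Keeping the adapter
        # unmerged means paying for these two small matmuls on every forward.
        return out + self.lora_b(self.lora_a(x)) * self.scaling
```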
Changed files
requirements.txt
Add peft for LoRA adapters
tests/kernels/test_blora.py
Test scripts for multi-LoRA computation
tests/kernels/test_normhead.py
Test scripts for the NormHead layer used in baichuan2-13B (a rough sketch of NormHead follows this list)
vllm/engine/arg_utils.py
Add two args used to initialize LoRA adapters when loading the model
vllm/engine/async_llm_engine.py
Check that the LoRA config args are valid.
vllm/engine/llm_engine.py
Check that the LoRA config args are valid, and pass the LoRA config to the workers.
vllm/entrypoints/llm.py
Add LoRA config parameters and pass them to llm_engine
vllm/model_executor/lora_utils.py
Create LoRA adapters and replace the target modules in the base model (see the module-replacement sketch after this list)
vllm/model_executor/model_loader.py
Support baichuan2 and add LoRA adapters when initializing the model.
vllm/model_executor/models/__init__.py
Support baichuan2
vllm/model_executor/models/baichuan.py
Support baichuan2 and schedule the LoRA information after each iteration according to the metadata.
Implement the method to load LoRA weights in parallel.
vllm/model_executor/parallel_utils/layers.py
Implement the LoRA module with ColumnParallelLinear and RowParallelLinear
vllm/sampling_params.py
Add a lora_id parameter to specify the LoRA adapter to use for each prompt (see the usage sketch after this list).
vllm/worker/worker.py
Pass the LoRA config when initializing the model
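For context on the NormHead tests, this is a minimal sketch of how I understand Baichuan2's NormHead output layer, which L2-normalizes the lm_head weight before the final projection; the class below is illustrative, not the code added in this PR.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class NormHead(nn.Module):
    """Sketch of Baichuan2's output head: the lm_head weight rows are
    L2-normalized before projecting hidden states onto the vocabulary."""

    def __init__(self, hidden_size: int, vocab_size: int):
        super().__init__()
        self.weight = nn.Parameter(torch.empty(vocab_size, hidden_size))
        nn.init.normal_(self.weight, std=0.02)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        norm_weight = F.normalize(self.weight)       # normalize each vocab row
        return F.linear(hidden_states, norm_weight)  # logits over the vocabulary
```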
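And here is a rough sketch of the module-replacement idea behind vllm/model_executor/lora_utils.py. The helper name replace_with_lora, the target-module suffixes, and the LinearWithLoRA wrapper (from the sketch near the top) are assumptions for illustration; the real code wraps vLLM's ColumnParallelLinear and RowParallelLinear layers rather than plain nn.Linear.

```python
import torch.nn as nn


def replace_with_lora(model: nn.Module,
                      target_suffixes=("W_pack", "o_proj"),  # assumed target modules
                      rank: int = 8, alpha: int = 16) -> nn.Module:
    """Hypothetical helper: swap matching nn.Linear children for LoRA wrappers."""
    # Snapshot the module tree first so freshly inserted wrappers are not
    # traversed (and wrapped) again.
    for _, module in list(model.named_modules()):
        for child_name, child in list(module.named_children()):
            if isinstance(child, nn.Linear) and child_name.endswith(target_suffixes):
                # LinearWithLoRA is the illustrative wrapper defined earlier.
                setattr(module, child_name, LinearWithLoRA(child, rank, alpha))
    return model
```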
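Finally, a hedged end-to-end usage sketch. The lora_id field on SamplingParams comes from this PR; the engine argument name lora_paths, the adapter paths, and the id-to-adapter numbering are placeholders, since the exact names of the two new args are not listed above.

```python
from vllm import LLM, SamplingParams

# `lora_paths` is a placeholder for the new LoRA engine args added in
# arg_utils.py; the real argument names may differ.
llm = LLM(
    model="baichuan-inc/Baichuan2-13B-Chat",
    lora_paths=["/path/to/adapter_a", "/path/to/adapter_b"],
)

# Each prompt can pick its own adapter via lora_id on SamplingParams.
# The mapping of ids to adapters shown here is assumed.
for prompt, lora_id in [("Hello from adapter A", 1),
                        ("Hello from adapter B", 2)]:
    outputs = llm.generate([prompt], SamplingParams(max_tokens=64, lora_id=lora_id))
    print(outputs[0].outputs[0].text)
```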
Thanks and looking forward to your comments!