Commit

update llama3 template.

shibing624 committed Apr 24, 2024
1 parent 726bd2a commit 4ee9b77
Showing 3 changed files with 42 additions and 24 deletions.
38 changes: 20 additions & 18 deletions README.md
@@ -30,6 +30,7 @@ Supervised Finetuning, RLHF(Reward Modeling and Reinforcement Learning) and DPO(
- The DPO method comes from the paper [Direct Preference Optimization: Your Language Model is Secretly a Reward Model](https://arxiv.org/pdf/2305.18290.pdf)
- The ORPO method comes from the paper [ORPO: Monolithic Preference Optimization without Reference Model](https://arxiv.org/abs/2403.07691)
## 🔥 News
[2024/04/24] v2.0: Added support for the **Meta Llama 3** model family.

[2024/04/17] v1.9: Added support for **[ORPO](https://arxiv.org/abs/2403.07691)**; see `run_orpo.sh` for usage. Details in [Release-v1.9](https://github.com/shibing624/MedicalGPT/releases/tag/1.9.0)

@@ -146,25 +147,26 @@ Training Stage:

#### Supported Models

| Model Name | Model Size | Template |
|----------------------------------------------------------------------|-----------------------------|----------|
| [BLOOMZ](https://huggingface.co/bigscience/bloomz) | 560M/1.1B/1.7B/3B/7.1B/176B | vicuna |
| [LLaMA](https://github.com/facebookresearch/llama) | 7B/13B/33B/65B | alpaca |
| [LLaMA2](https://huggingface.co/meta-llama) | 7B/13B/70B | llama2 |
| [LLaMA3](https://huggingface.co/meta-llama) | 8B/70B | llama3 |
| [Mistral](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1) | 7B/8x7B | mistral |
| [Baichuan](https://github.com/baichuan-inc/baichuan-13B) | 7B/13B | baichuan |
| [Baichuan2](https://github.com/baichuan-inc/Baichuan2) | 7B/13B | baichuan2 |
| [InternLM](https://github.com/InternLM/InternLM) | 7B | intern |
| [Qwen](https://github.com/QwenLM/Qwen) | 1.8B/7B/14B/72B | chatml |
| [Qwen1.5](https://github.com/QwenLM/Qwen1.5) | 0.5B/1.8B/4B/14B/72B | qwen |
| [XVERSE](https://github.com/xverse-ai/XVERSE-13B) | 13B | xverse |
| [ChatGLM](https://github.com/THUDM/ChatGLM-6B) | 6B | chatglm |
| [ChatGLM2](https://github.com/THUDM/ChatGLM2-6B) | 6B | chatglm2 |
| [ChatGLM3](https://github.com/THUDM/ChatGLM3) | 6B | chatglm3 |
| [Yi](https://github.com/01-ai/Yi) | 6B/34B | yi |
| [DeepSeek](https://github.com/deepseek-ai/DeepSeek-LLM) | 7B/16B/67B | deepseek |
| [Orion](https://github.com/OrionStarAI/Orion) | 14B | orion |
| [Cohere](https://huggingface.co/CohereForAI/c4ai-command-r-plus) | 104B | cohere |
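A minimal sketch of how the model-to-template pairing in the table above can be looked up programmatically. The `MODEL_TEMPLATES` dict and `template_for` helper are illustrative stand-ins, not the repo's actual API:

```python
# Map each model family to its chat template name, mirroring the table above.
# This dict and helper are illustrative only, not part of the repo's code.
MODEL_TEMPLATES = {
    "bloomz": "vicuna",
    "llama": "alpaca",
    "llama2": "llama2",
    "llama3": "llama3",
    "mistral": "mistral",
    "baichuan": "baichuan",
    "baichuan2": "baichuan2",
    "internlm": "intern",
    "qwen": "chatml",
    "qwen1.5": "qwen",
    "xverse": "xverse",
    "chatglm": "chatglm",
    "chatglm2": "chatglm2",
    "chatglm3": "chatglm3",
    "yi": "yi",
    "deepseek": "deepseek",
    "orion": "orion",
    "cohere": "cohere",
}


def template_for(model_name: str) -> str:
    """Return the template name for a model family (case-insensitive)."""
    key = model_name.lower()
    if key not in MODEL_TEMPLATES:
        raise KeyError(f"no template registered for {model_name!r}")
    return MODEL_TEMPLATES[key]


print(template_for("LLaMA3"))  # llama3
```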

## 💻 Inference
After training completes, we load the trained model and verify the quality of the text it generates.
13 changes: 7 additions & 6 deletions README_EN.md
@@ -124,6 +124,7 @@ sh run_ppo.sh
| [BLOOMZ](https://huggingface.co/bigscience/bloomz) | 560M/1.1B/1.7B/3B/7.1B/176B | vicuna |
| [LLaMA](https://github.com/facebookresearch/llama) | 7B/13B/33B/65B | alpaca |
| [LLaMA2](https://huggingface.co/meta-llama) | 7B/13B/70B | llama2 |
| [LLaMA3](https://huggingface.co/meta-llama) | 8B/70B | llama3 |
| [Mistral](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1) | 7B/8x7B | mistral |
| [Baichuan](https://github.com/baichuan-inc/baichuan-13B) | 7B/13B | baichuan |
| [Baichuan2](https://github.com/baichuan-inc/Baichuan2) | 7B/13B | baichuan2 |
@@ -151,12 +152,12 @@ pip install -r requirements.txt --upgrade

### Hardware Requirement (VRAM)

| Train Method | Bits | 7B | 13B | 30B | 65B | 8x7B |
|--------------| ---- | ----- | ----- | ----- | ------ | ------ |
| Full | 16 | 160GB | 320GB | 600GB | 1200GB | 900GB |
| LoRA | 16 | 16GB | 32GB | 80GB | 160GB | 120GB |
| QLoRA | 8 | 10GB | 16GB | 40GB | 80GB | 80GB |
| QLoRA | 4 | 6GB | 12GB | 24GB | 48GB | 32GB |
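The table above can be read as a lookup keyed by training method, precision, and model size. The sketch below mirrors it as a dict; the numbers are the table's rough guidelines, and actual usage varies with sequence length, batch size, and gradient checkpointing:

```python
# Rough VRAM requirements in GB, transcribed from the table above.
# Purely a convenience lookup; real usage depends on batch size,
# sequence length, and other training settings.
VRAM_GB = {
    ("full", 16): {"7B": 160, "13B": 320, "30B": 600, "65B": 1200, "8x7B": 900},
    ("lora", 16): {"7B": 16, "13B": 32, "30B": 80, "65B": 160, "8x7B": 120},
    ("qlora", 8): {"7B": 10, "13B": 16, "30B": 40, "65B": 80, "8x7B": 80},
    ("qlora", 4): {"7B": 6, "13B": 12, "30B": 24, "65B": 48, "8x7B": 32},
}


def required_vram(method: str, bits: int, size: str) -> int:
    """Look up the approximate VRAM (GB) for a method/precision/model size."""
    return VRAM_GB[(method.lower(), bits)][size]


print(required_vram("QLoRA", 4, "7B"))  # 6
```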

## 🔥 Inference
After training completes, we load the trained model and verify the quality of the text it generates.
15 changes: 15 additions & 0 deletions supervised_finetuning.py
@@ -552,6 +552,21 @@ def register_conv_template(template: Conversation):
)
)

"""llama3 template
source: https://github.com/ymcui/Chinese-LLaMA-Alpaca-2
Supports: https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct
"""
register_conv_template(
Conversation(
name="llama3",
system_prompt="",
messages=[],
roles=("user", "assistant"),
prompt="<|start_header_id|>user<|end_header_id|>\n\n{query}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n",
sep="<|eot_id|>",
)
)
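To see what the `prompt` and `sep` fields of the llama3 template produce, here is a minimal sketch of how they could assemble a multi-turn conversation string. The `render` helper is a hypothetical stand-in for the repo's prompt-building logic, not its actual implementation:

```python
# Illustrative only: combine the llama3 template's `prompt` and `sep`
# fields into the string fed to the tokenizer. `render` is a hypothetical
# stand-in for the repo's Conversation prompt assembly.
PROMPT = (
    "<|start_header_id|>user<|end_header_id|>\n\n{query}<|eot_id|>"
    "<|start_header_id|>assistant<|end_header_id|>\n\n"
)
SEP = "<|eot_id|>"


def render(history, query):
    """history: list of (user, assistant) turns; query: the new user message."""
    out = ""
    for user_msg, assistant_msg in history:
        # Each completed turn: user prompt, assistant reply, then the separator.
        out += PROMPT.format(query=user_msg) + assistant_msg + SEP
    # The new query ends with the assistant header, cueing the model to answer.
    out += PROMPT.format(query=query)
    return out


text = render([("Hi", "Hello!")], "How are you?")
print(text.count("<|eot_id|>"))  # 3
```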

"""llama2-zh template
source: https://github.com/ymcui/Chinese-LLaMA-Alpaca-2
Supports: https://huggingface.co/ziqingyang/chinese-alpaca-2-7b
