diff --git a/README.md b/README.md
index f78771b..da7c38a 100644
--- a/README.md
+++ b/README.md
@@ -250,6 +250,12 @@ CUDA_VISIBLE_DEVICES=0,1 torchrun --nproc_per_node 2 inference_multigpu_demo.py
+
+#### Related Projects
+- [shibing624/ChatPDF](https://github.com/shibing624/ChatPDF): retrieval-based knowledge QA (RAG) with a local LLM
+- [shibing624/chatgpt-webui](https://github.com/shibing624/chatgpt-webui): a simple, easy-to-use web UI for LLM chat and retrieval-based knowledge QA (RAG)
+
+
 ## 📚 Dataset
 ### Medical Datasets
@@ -296,21 +302,11 @@ CUDA_VISIBLE_DEVICES=0,1 torchrun --nproc_per_node 2 inference_multigpu_demo.py
-## ⚠️ Limitations, Restrictions of Use, and Disclaimer
-
-The SFT model trained on the current data and base model still has the following problems:
-
-1. It may produce answers that contradict the facts when given factual instructions.
-
-2. It cannot reliably identify harmful instructions and may therefore produce harmful content.
-
-3. Its abilities in scenarios involving reasoning, code, and multi-turn dialogue still need improvement.
-
-Given these model limitations, we require developers to use the open-source model weights and any derivatives generated by this project for research purposes only, not for commercial use or any other purpose that could harm society.
+## ⚠️ LICENSE
 This project may only be used for research purposes; the project developers assume no liability for any harm or loss caused by the use of this project (including but not limited to data, models, and code). For details, see the [Disclaimer](https://github.com/shibing624/MedicalGPT/blob/main/DISCLAIMER).
-The project code is licensed under [The Apache License 2.0](/LICENSE); the code is free for commercial use, while the model weights and data may only be used for research purposes. Please include a link to MedicalGPT and the license agreement in your product description.
+The MedicalGPT project code is licensed under [The Apache License 2.0](/LICENSE); the code is free for commercial use, while the model weights and data may only be used for research purposes. Please include a link to MedicalGPT and the license agreement in your product description.
 ## 😇 Citation
diff --git a/README_EN.md b/README_EN.md
index 85b2524..789c68e 100644
--- a/README_EN.md
+++ b/README_EN.md
@@ -58,7 +58,6 @@ Based on the llama-7b model, use medical encyclopedia data to continue pre-train
 ```shell
-cd scripts
 sh run_pt.sh
 ```
@@ -70,7 +69,6 @@ Based on the llama-7b-pt model, the llama-7b-sft model is obtained by using medi
 Supervised fine-tuning of the base llama-7b-pt model to create llama-7b-sft
 ```shell
-cd scripts
 sh run_sft.sh
 ```
@@ -93,7 +91,6 @@ Based on the llama-7b-sft model, the reward preference model is trained using me
 Reward modeling using dialog pairs from the reward dataset and the llama-7b-sft model to create llama-7b-reward:
 ```shell
-cd scripts
 sh run_rm.sh
 ```
 [Training Detail wiki](https://github.com/shibing624/MedicalGPT/wiki/Training-Details)
@@ -113,8 +110,7 @@
 This process is roughly divided into three steps: Reinforcement Learning fine-tuning of llama-7b-sft with the llama-7b-reward reward model to create llama-7b-rl
 ```shell
-cd scripts
-sh run_rl.sh
+sh run_ppo.sh
 ```
 [Training Detail wiki](https://github.com/shibing624/MedicalGPT/wiki/Training-Details)
@@ -122,7 +118,7 @@ sh run_rl.sh
 After training is complete, we load the trained model to verify the quality of its generated text.
 ```shell
-python scripts/inference.py \
+python inference.py \
     --base_model path_to_llama_hf_dir \
     --lora_model path_to_lora \
     --with_prompt \
@@ -165,14 +161,6 @@ Parameter Description:
 - Guanaco dataset with 690,000 Chinese instructions (500,000 Belle + 190,000 Guanaco): [Chinese-Vicuna/guanaco_belle_merge_v1.0](https://huggingface.co/datasets/Chinese-Vicuna/guanaco_belle_merge_v1.0)
 - 220,000-sample Chinese medical dialogue dataset (HuatuoGPT project): [FreedomIntelligence/HuatuoGPT-sft-data-v1](https://huggingface.co/datasets/FreedomIntelligence/HuatuoGPT-sft-data-v1)
-## ✅ Todo
-
-1. [ ] Add a multi-turn dialogue data fine-tuning method
-2. [x] Add reward model fine-tuning
-3. [x] Add RL fine-tuning
-4. [x] Add a medical reward dataset
-5. [x] Add llama int8/int4 training
-6. [ ] Add all training and prediction demos in Colab
 ## ☎️ Contact
 - Issue (suggestion)
@@ -182,16 +170,8 @@ Parameter Description:
-## ⚠️ Limitations, Restrictions of Use and Disclaimer
-
-The SFT model trained on the current data and base model still has the following problems:
-
-1. Wrong answers that contradict the facts may be generated for factual instructions.
-2. Harmful instructions cannot be reliably identified, which may result in harmful output.
-3. The model's abilities still need improvement in scenarios involving reasoning, code, and multi-turn dialogue.
+## ⚠️ LICENSE
-Based on the limitations of the above models, we require developers to use our open-source model weights and any derivatives generated by this project for research purposes only, not for commercial use or any other purpose that could harm society.
-This project can only be used for research purposes, and the project developers are not responsible for any harm or loss caused by the use of this project (including but not limited to data, models, and code). For details, please refer to the [Disclaimer](https://github.com/shibing624/MedicalGPT/blob/main/DISCLAIMER).
 The license agreement for the project code is [The Apache License 2.0](/LICENSE); the code is free for commercial use, and the model weights and data can only be used for research purposes. Please attach MedicalGPT's link and license agreement in the product description.
 ## 😇 Citation
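For orientation, the stage scripts referenced throughout the diff (run_pt.sh, run_sft.sh, run_rm.sh, and run_ppo.sh, which replaces run_rl.sh) are now invoked from the repository root rather than a scripts/ directory. The sketch below is a hypothetical POSIX-shell wrapper over those four stages; the wrapper itself, its `DRY_RUN` flag, and the `pipeline`/`run_one` names are illustrative and not part of MedicalGPT.

```shell
#!/bin/sh
# Hypothetical wrapper around the four MedicalGPT training stages.
# Only the stage script names (run_pt.sh, run_sft.sh, run_rm.sh,
# run_ppo.sh) come from the README; everything else is illustrative.
# DRY_RUN=1 (the default) only echoes the stage, so the wrapper can be
# exercised without the stage scripts being present.

run_one() {
    echo "stage: $1"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        sh "$1"   # run from the repository root, per the updated README
    fi
}

pipeline() {
    stage="${1:-all}"
    case "$stage" in
        pt|sft|rm|ppo)
            run_one "run_${stage}.sh"
            ;;
        all)
            for s in run_pt.sh run_sft.sh run_rm.sh run_ppo.sh; do
                run_one "$s"
            done
            ;;
        *)
            echo "usage: pipeline [pt|sft|rm|ppo|all]" >&2
            return 1
            ;;
    esac
}

pipeline "$@"
```

For example, `sh pipeline.sh sft` echoes only the SFT stage, while setting `DRY_RUN=0` would actually execute the stage scripts, assuming they exist in the current directory.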