Merge pull request #370 from ZhuangXialie/main
Add a collection of Chinese datasets in the format supported by this project
shibing624 authored Apr 29, 2024
2 parents ee61d40 + 9aa010f commit ec77bea
Showing 1 changed file with 1 addition and 0 deletions.
1 change: 1 addition & 0 deletions README.md
@@ -250,6 +250,7 @@ CUDA_VISIBLE_DEVICES=0,1 torchrun --nproc_per_node 2 inference_multigpu_demo.py
- 800K Chinese ChatGPT multi-turn dialogue dataset: [BelleGroup/multiturn_chat_0.8M](https://huggingface.co/datasets/BelleGroup/multiturn_chat_0.8M)
- 1.16M Chinese ChatGPT multi-turn dialogue dataset: [fnlp/moss-002-sft-data](https://huggingface.co/datasets/fnlp/moss-002-sft-data)
- 38K Chinese ShareGPT multi-turn dialogue dataset: [FreedomIntelligence/ShareGPT-CN](https://huggingface.co/datasets/FreedomIntelligence/ShareGPT-CN)
- Collection of Chinese fine-tuning datasets: [zhuangxialie/Llama3-Chinese-Dataset](https://modelscope.cn/datasets/zhuangxialie/Llama3-Chinese-Dataset/dataPeview) [format supported by this project]

#### Preference datasets
- 20K Chinese-English preference dataset: [shibing624/DPO-En-Zh-20k-Preference](https://huggingface.co/datasets/shibing624/DPO-En-Zh-20k-Preference) [format supported by this project]