
[DOC] Add modelscope example #1578

Merged: 8 commits, Jun 14, 2024
Changes from 5 commits
24 changes: 24 additions & 0 deletions examples/modelscope/README.md
@@ -0,0 +1,24 @@
# ModelScope with ITREX

Intel® Extension for Transformers (ITREX) supports almost all LLMs in PyTorch format from ModelScope, such as Phi, Qwen, ChatGLM, Baichuan, Gemma, etc.
Contributor: Please use the official branding "Intel® Extension for Transformers" rather than "Intel extension for transformers".


## Usage Example

ITREX provides a script that demonstrates loading models from ModelScope. Run it with the following command:
```bash
numactl -m 0 -C 0-55 python run_modelscope_example.py --model_path=qwen/Qwen-7B --prompt=你好
```

Contributor: Should this be `numactl -l -C xx-xx`? What if the user runs across sockets? Please add a note explaining why `numactl` is necessary (to improve performance) and how to bind core IDs.

Contributor: Change

```bash
numactl -m 0 -C 0-55 python run_modelscope_example.py --model_path=qwen/Qwen-7B --prompt=你好
```

to

```bash
OMP_NUM_THREADS=<threads> numactl -m <node> -C <cores> python run_modelscope_example.py --model <MODEL_NAME_OR_PATH> --prompt=你好
```
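
As the reviewers suggest, a note on `numactl` helps readers adapt the binding to their own machine; a minimal sketch (the 56-thread, cores 0-55 figures assume a 56-core socket and are illustrative only):

```bash
# Memory accesses are fastest when they stay on one NUMA node, so binding
# both memory (-m) and cores (-C) to a single socket improves inference
# throughput. Inspect your topology first:
lscpu | grep -i numa

# Then bind to node 0; adjust the core range and thread count to one socket
# of your machine (the values below are assumptions for a 56-core socket).
OMP_NUM_THREADS=56 numactl -m 0 -C 0-55 python run_modelscope_example.py --model_path=qwen/Qwen-7B --prompt=你好
```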

## Supported and Validated Models
We have validated the majority of existing models using modelscope==1.13.1:
Contributor: Please add a requirements.txt.

* [qwen/Qwen-7B](https://www.modelscope.cn/models/qwen/Qwen-7B/summary)
* [ZhipuAI/ChatGLM-6B](https://www.modelscope.cn/models/ZhipuAI/ChatGLM-6B/summary)
* [ZhipuAI/chatglm2-6b](https://www.modelscope.cn/models/ZhipuAI/chatglm2-6b/summary)
* [ZhipuAI/chatglm3-6b](https://www.modelscope.cn/models/ZhipuAI/chatglm3-6b/summary)
* [baichuan-inc/Baichuan2-7B-Chat](https://www.modelscope.cn/models/baichuan-inc/Baichuan2-7B-Chat/summary)
* [baichuan-inc/Baichuan2-13B-Chat](https://www.modelscope.cn/models/baichuan-inc/Baichuan2-13B-Chat/summary)
* [LLM-Research/Phi-3-mini-4k-instruct](https://www.modelscope.cn/models/LLM-Research/Phi-3-mini-4k-instruct/summary)
* [LLM-Research/Phi-3-mini-128k-instruct](https://www.modelscope.cn/models/LLM-Research/Phi-3-mini-128k-instruct/summary)
* [AI-ModelScope/gemma-2b](https://www.modelscope.cn/models/AI-ModelScope/gemma-2b/summary)
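
Per the reviewer's request above, a minimal sketch of the dependencies, derived from the script's imports and the validated modelscope version (exact pins beyond modelscope are an assumption):

```bash
# Assumed dependency set: modelscope==1.13.1 matches the validated version
# above; the remaining packages follow from the script's imports.
pip install intel-extension-for-transformers modelscope==1.13.1 transformers torch
```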

If you encounter any problems, please let us know.
30 changes: 30 additions & 0 deletions examples/modelscope/run_modelscope_example.py
@@ -0,0 +1,30 @@
from transformers import TextStreamer
from modelscope import AutoTokenizer
from intel_extension_for_transformers.transformers import AutoModelForCausalLM
from typing import List, Optional
import argparse

def main(args_in: Optional[List[str]] = None) -> None:
    parser = argparse.ArgumentParser()
    parser.add_argument("--model_path", type=str, help="Model name: String", required=True, default="qwen/Qwen-7B")
    parser.add_argument(
        "-p",
        "--prompt",
        type=str,
        help="Prompt to start generation with: String (default: empty)",
        default="你好,你可以做点什么?",  # "Hello, what can you do?"
    )
    parser.add_argument("--benchmark", action="store_true")
    parser.add_argument("--use_neural_speed", action="store_true")
    args = parser.parse_args(args_in)
    print(args)
    model_name = args.model_path  # ModelScope model_id or local model path
    prompt = args.prompt
    # model_hub="modelscope" routes the download to ModelScope instead of
    # Hugging Face; load_in_4bit=True applies 4-bit weight-only quantization.
    model = AutoModelForCausalLM.from_pretrained(model_name, load_in_4bit=True, model_hub="modelscope")
    tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
    inputs = tokenizer(prompt, return_tensors="pt").input_ids
    # Stream tokens to stdout as they are generated.
    streamer = TextStreamer(tokenizer)
    outputs = model.generate(inputs, streamer=streamer, max_new_tokens=300)

if __name__ == "__main__":
    main()
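
For a quick functional check without NUMA binding, the script can also be run directly (weights download from ModelScope on first use; the model id is from the validated list above):

```bash
python run_modelscope_example.py --model_path=qwen/Qwen-7B --prompt="你好"
```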