
ITREX needs modification for the new llama3 prompt format #1507

Open

redhairerINTEL opened this issue Apr 23, 2024 · 3 comments
@redhairerINTEL

New prompt format for llama3
https://llama.meta.com/docs/model-cards-and-prompt-formats/meta-llama-3/
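
For reference, the format documented at that link wraps each turn in header tokens and terminates it with <|eot_id|> (placeholders shown in braces):

<|begin_of_text|><|start_header_id|>system<|end_header_id|>

{{ system_prompt }}<|eot_id|><|start_header_id|>user<|end_header_id|>

{{ user_message }}<|eot_id|><|start_header_id|>assistant<|end_header_id|>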

@kta-intel
Contributor

@kevinintel

@a32543254
Contributor

a32543254 commented Apr 25, 2024

Here is sample code if you want to use the llama3 template:
all you need is to apply the chat template when building input_ids.

from transformers import AutoTokenizer, TextStreamer
from intel_extension_for_transformers.transformers import AutoModelForCausalLM

model_name = "meta-llama/Meta-Llama-3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
streamer = TextStreamer(tokenizer)
# load_in_4bit=True applies ITREX weight-only 4-bit quantization at load time
model = AutoModelForCausalLM.from_pretrained(model_name, load_in_4bit=True)

messages = [
    {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
    {"role": "user", "content": "Who are you?"},
]

# apply_chat_template renders the llama3 special-token format
# (<|start_header_id|> ... <|eot_id|>) defined in the model's tokenizer config
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(input_ids, streamer=streamer)

We will also add this to the docs soon.
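
Editor's note, not part of the original comment: llama3 ends assistant turns with <|eot_id|> rather than the tokenizer's default eos token, so if generation runs past the end of the reply, passing both terminators may help. This is a sketch following Meta's reference example and assumes ITREX's generate accepts the same eos_token_id keyword as transformers' generate:

# <|eot_id|> is llama3's end-of-turn token; stop on either it or eos
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>"),
]
outputs = model.generate(input_ids, streamer=streamer, eos_token_id=terminators)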

@N3RDIUM

N3RDIUM commented Apr 30, 2024

(quoting @a32543254's sample code above in full)

This gives me AssertionError: Fail to convert pytorch model
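
Editor's note: the assertion message suggests the failure happens while converting the model for the 4-bit runtime, not in the template step. One way to confirm the template side works on its own (a tokenizer-only sketch, not a fix for the conversion error) is to render the prompt as text:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
messages = [
    {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
    {"role": "user", "content": "Who are you?"},
]
# tokenize=False returns the rendered llama3 prompt string so you can
# inspect the <|start_header_id|>/<|eot_id|> structure directly
print(tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False))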
