Using the pretrained model SpeechGPT-7B-ma given in the text to judge whether an utterance is complete gives an incorrect answer #59

Evoluange · 2024-11-21T12:54:07Z

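I am using the pretrained model SpeechGPT-7B-ma given in the text to judge whether an utterance is complete. I took the tokens provided in the Train SpeechGPT section, removed the final end marker, and fed the rest to the model for inference. The output is a sequence of tokens, but it is not the end-of-speech token "". The code is as follows: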
import sys

import torch
import transformers
from transformers import AutoConfig, LlamaForCausalLM, LlamaTokenizer, GenerationConfig

device = "cuda"

model_dir = "SpeechGPT/speechgpt/7B-ma"
model = LlamaForCausalLM.from_pretrained(
    model_dir,
    load_in_8bit=False,
    torch_dtype=torch.float16,
    device_map="auto",
)
model.half()
model.eval()
# torch.compile requires PyTorch 2.x and is skipped on Windows
if torch.__version__ >= "2" and sys.platform != "win32":
    model = torch.compile(model)

tokenizer = LlamaTokenizer.from_pretrained(model_dir)
tokenizer.pad_token_id = 0
tokenizer.padding_side = "left"

# Speech unit tokens with the trailing end marker removed
audio_tokens ="<189><247><922><991><821><258><485><974><284><466><969><523><196><202><881><331><822><853><432><32><742><98><519><26><204><280><576><384><879><901><555><944><366><641><124><362><734><156><824><462><761><907><430><81><597><716><205><521><470><821><677><355><483><641><124><243><290><978><82><620><915><470><821><576><384><466><398><212><455><931><579><969><778><45><914><445><469><576><803><6><803><791><377><506><835><67><940><613><417><755><237><224><452><121><736>"#“”

model_inputs = tokenizer([audio_tokens], return_tensors="pt").to(device)
print("model_inputs.input_ids:", model_inputs.input_ids)
print("len(model_inputs.input_ids):",len(model_inputs.input_ids))
print("len(model_inputs.input_ids[0]):",len(model_inputs.input_ids[0]))

generation_config = GenerationConfig(
    temperature=0.7,
    top_p=0.8,
    top_k=50,
    do_sample=True,
    max_new_tokens=20,
    min_new_tokens=10,
)

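generated_ids = model.generate(
    input_ids=model_inputs.input_ids,
    generation_config=generation_config,
    # return_dict_in_generate=True,
    # output_scores only has an effect together with return_dict_in_generate=True
    output_scores=True,
    # This keyword overrides the max_new_tokens=20 set in generation_config
    max_new_tokens=10,
)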

# Keep only the newly generated tokens by slicing off the prompt
generated_ids = [
    output_ids[len(input_ids):]
    for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print("response:", response)
Output:

In theory, the first token of the model's response should be the end-of-speech marker. Could someone explain why it is not?
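
One way to narrow this down might be to inspect the next-token distribution directly instead of sampling. With `do_sample=True` and `temperature=0.7`, `generate` can pick a token other than the most likely one, and `skip_special_tokens=True` drops special tokens from the decoded text; if the end marker happens to be the configured EOS token, `min_new_tokens=10` would also suppress it for the first steps. The sketch below is a minimal illustration under stated assumptions: `next_token_report`, `toy_logits`, and `hypothetical_end_id` are made-up names, and the toy logits stand in for something like `model(**model_inputs).logits[0, -1].tolist()`, with the real id obtained via `tokenizer.convert_tokens_to_ids` on whatever end marker the vocabulary defines.

```python
import math


def next_token_report(logits, token_id, top_n=5):
    """Report probability and rank of token_id under softmax(logits).

    `logits` is a plain list of floats for the next position.
    This is a hypothetical helper, not part of SpeechGPT or transformers.
    """
    # Numerically stable softmax
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Token ids sorted from most to least likely
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    rank = order.index(token_id) + 1  # 1 means most likely
    return {"prob": probs[token_id], "rank": rank, "top": order[:top_n]}


# Toy values (assumption): in practice these would come from the model's
# last-position logits and the tokenizer's id for the end marker.
toy_logits = [0.1, 2.0, -1.0, 3.0, 0.5]
hypothetical_end_id = 3

report = next_token_report(toy_logits, hypothetical_end_id)
print(report)
```

If the end marker has rank 1 here but still does not appear in the response, the sampling or decoding settings would be the first thing to check; if its rank is low, the input format fed to the model is the more likely cause.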
