Misc. bug: Problems with official jinja templates (Gemma 2, Llama 3.2, Qwen 2.5) #11866
Thanks for pointing that out. I'm having the same error as well. I didn't use a jinja template until llama.cpp supported tool calling, so I didn't notice until I switched to tool calling. Right now I'm trying to locate which commit introduced the bug.
Hey @MoonRide303, @henryclw, thanks for reporting this! Are you both experiencing this on Windows? Could you try fetching the template with scripts/get_chat_template.py? (These templates seem to work on my Mac - maybe a line-ending issue, or bad unescaping of the JSON string if they were edited manually?)
My finding:

But if you use llama.cpp with:

```sh
curl http://localhost:8080/v1/chat/completions -d '{
  "model": "gpt-3.5-turbo",
  "tools": [ ],
  "messages": [
    {
      "role": "user",
      "content": "Print a hello world message with python."
    }
  ]
}'
```

I'm not sure if the jinja template option must come with the tools option, and what the expected behavior and usage are. Hope this finding is helpful. If you need any help, please feel free to reply.
More detailed logs:

Without tools:

```sh
curl http://localhost:8080/v1/chat/completions -d '{
  "model": "gpt-3.5-turbo",
  "messages": [
    {
      "role": "user",
      "content": "Where is Vancouver?"
    }
  ]
}'
```

Logs for API call without tools

With tools:

```sh
curl http://localhost:8080/v1/chat/completions -d '{
  "model": "gpt-3.5-turbo",
  "messages": [
    {
      "role": "user",
      "content": "Where is Vancouver?"
    }
  ],
  "tools": []
}'
```

Logs for API call with tools
I think the problem might be that without tools and without a grammar, but with a jinja template, the grammar is not set correctly?
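To make that with/without-tools comparison easy to re-run, here is a hedged Python equivalent of the two curl calls above; it assumes the `requests` package and the same local endpoint, and is just a repro sketch, not part of the original reports:

```python
import requests

URL = "http://localhost:8080/v1/chat/completions"
MESSAGES = [{"role": "user", "content": "Where is Vancouver?"}]

# Send the same request with and without the "tools" field to compare behavior.
for extra in ({}, {"tools": []}):
    payload = {"model": "gpt-3.5-turbo", "messages": MESSAGES, **extra}
    r = requests.post(URL, json=payload)
    print("with tools:", "tools" in payload, "status:", r.status_code)
    print(r.text[:300])  # response body (or error) from llama-server
```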
Thank you for the quick reply. I just compiled your fix branch locally and it solved the problem.
@ochafik This script doesn't work for me:
I just directly copied the content of the chat_template field from the tokenizer_config.json files I've linked in the first post - attaching it here (as .txt, since GitHub blocks .jinja).
@henryclw That content is JSON-escaped / not valid Jinja; to use it you can paste the chat_template string into a JavaScript console and wrap it with a JSON.parse(...) call, or read the raw config in Python:

```python
from huggingface_hub import hf_hub_download

with open(hf_hub_download(repo_id=model_id, filename="tokenizer_config.json"), "r", encoding="utf-8") as f:
    config_str = f.read()
```
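Spelling out the rest of that approach - a minimal sketch, assuming the huggingface_hub package is installed; the model_id value is just an example, not one from this thread. The point is to let the JSON parser do the unescaping rather than copying the raw field by hand:

```python
import json

from huggingface_hub import hf_hub_download

model_id = "Qwen/Qwen2.5-7B-Instruct"  # example repo, an assumption for illustration

# json.load() turns the escaped \n, \", \\ back into real characters.
path = hf_hub_download(repo_id=model_id, filename="tokenizer_config.json")
with open(path, "r", encoding="utf-8") as f:
    config = json.load(f)

template = config["chat_template"]  # valid Jinja source, not the escaped form
print(template[:200])
```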
@ochafik It was me who attached those files. And... you're absolutely right, it was JSON escaping causing all the trouble here. I've made a simpler, working version of the script for acquiring chat templates (as an alternative to the broken scripts/get_chat_template.py - maybe it should be added to the repo scripts?), and with proper JSON decoding it seems the official templates are working now (attaching corrected versions of those). get_hf_template.py.txt Could you add some kind of error when a template is not valid Jinja? It would make it easier to avoid that kind of mistake in the future.
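The attached get_hf_template.py isn't reproduced in this thread; a hedged sketch of the same idea (download, JSON-decode, write a .jinja file) might look like the following, where the output filename convention and CLI handling are my own assumptions:

```python
#!/usr/bin/env python3
"""Fetch a model's chat template and save it as valid Jinja.

Sketch of the approach described above, not the attached script.
Usage: python get_hf_template.py <repo_id>
"""
import json
import sys

from huggingface_hub import hf_hub_download


def main() -> None:
    repo_id = sys.argv[1]  # e.g. "Qwen/Qwen2.5-7B-Instruct"
    path = hf_hub_download(repo_id=repo_id, filename="tokenizer_config.json")
    with open(path, "r", encoding="utf-8") as f:
        template = json.load(f)["chat_template"]  # JSON-decoded -> valid Jinja

    out = repo_id.split("/")[-1] + ".jinja"
    with open(out, "w", encoding="utf-8") as f:
        f.write(template)
    print(f"wrote {out}")


if __name__ == "__main__":
    main()
```

The resulting file can then be passed to llama-server together with --jinja (e.g. via --chat-template-file, in builds that support it).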
@MoonRide303 it should already print quite a lengthy error message (if you scroll right you'll see the `^` marking the offending column):

```
common_chat_templates_from_model: failed to parse chat template: Expected value expression at row 1, column 269:
{%- if tools %}\n    {{- '<|im_start|>system\\n' }}\n    {%- if messages[0]['role'] == 'system' %}\n        {{- messages[0]['content'] }}\n    {%- else %}\n        {{- 'You are Qwen, created by Alibaba Cloud. You are a helpful assistant.' }}\n    {%- endif %}\n    {{- \"\\n\\n# Tools\\n\\nYou may call one or more functions to assist with the user query.\\n\\nYou are provided with function signatures within <tools></tools> XML tags:\\n<tools>\" }}\n    {%- for tool in tools %}\n        {{- \"\\n\" }}\n        {{- tool | tojson }}\n    {%- endfor %}\n    {{- \"\\n</tools>\\n\\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\\n<tool_call>\\n{\\\"name\\\": <function-name>, \\\"arguments\\\": <args-json-object>}\\n</tool_call><|im_end|>\\n\" }}\n{%- else %}\n    {%- if messages[0]['role'] == 'system' %}\n        {{- '<|im_start|>system\\n' + messages[0]['content'] + '<|im_end|>\\n' }}\n    {%- else %}\n        {{- '<|im_start|>system\\nYou are Qwen, created by Alibaba Cloud. You are a helpful assistant.<|im_end|>\\n' }}\n    {%- endif %}\n{%- endif %}\n{%- for message in messages %}\n    {%- if (message.role == \"user\") or (message.role == \"system\" and not loop.first) or (message.role == \"assistant\" and not message.tool_calls) %}\n        {{- '<|im_start|>' + message.role + '\\n' + message.content + '<|im_end|>' + '\\n' }}\n    {%- elif message.role == \"assistant\" %}\n        {{- '<|im_start|>' + message.role }}\n        {%- if message.content %}\n            {{- '\\n' + message.content }}\n        {%- endif %}\n        {%- for tool_call in message.tool_calls %}\n            {%- if tool_call.function is defined %}\n                {%- set tool_call = tool_call.function %}\n            {%- endif %}\n            {{- '\\n<tool_call>\\n{\"name\": \"' }}\n            {{- tool_call.name }}\n            {{- '\", \"arguments\": ' }}\n            {{- tool_call.arguments | tojson }}\n            {{- '}\\n</tool_call>' }}\n        {%- endfor %}\n        {{- '<|im_end|>\\n' }}\n    {%- elif message.role == \"tool\" %}\n        {%- if (loop.index0 == 0) or (messages[loop.index0 - 1].role != \"tool\") %}\n            {{- '<|im_start|>user' }}\n        {%- endif %}\n        {{- '\\n<tool_response>\\n' }}\n        {{- message.content }}\n        {{- '\\n</tool_response>' }}\n        {%- if loop.last or (messages[loop.index0 + 1].role != \"tool\") %}\n            {{- '<|im_end|>\\n' }}\n        {%- endif %}\n    {%- endif %}\n{%- endfor %}\n{%- if add_generation_prompt %}\n    {{- '<|im_start|>assistant\\n' }}\n{%- endif %}\n
^
```
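As an aside, a similar parse error can be reproduced locally by running a template file through the Python jinja2 package - a hedged sketch (llama.cpp uses its own minja engine, not jinja2, but a JSON-escaped file fails to parse in both):

```python
import sys

from jinja2 import Environment, TemplateSyntaxError

env = Environment()
try:
    with open(sys.argv[1], encoding="utf-8") as f:
        env.parse(f.read())  # parse only; no rendering needed to catch syntax errors
    print("template parses OK")
except TemplateSyntaxError as e:
    print(f"syntax error at line {e.lineno}: {e.message}")
```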
@ochafik When I try to launch it with that earlier (broken) llama3.2.jinja, it just silently quits after printing device info:
Same output from both my local build and the official binaries (llama-b4734-bin-win-cuda-cu12.4-x64.zip).
Name and Version
```
llama-cli --version
version: 4713 (a4f011e)
built with MSVC 19.42.34436.0 for x64
```
Operating systems
Windows
Which llama.cpp modules do you know to be affected?
llama-server
Command line
Problem description & steps to reproduce
Extracting official chat templates from the chat_template field in tokenizer_config.json (Gemma 2, Llama 3.2, Qwen 2.5), storing them in files, and then trying to use them with llama-server results in errors:

```
parse: error parsing grammar: expecting name at
```

after each message.

@ochafik Could you look into this? It would be nice to have the jinja implementation fully working with official templates, at least for major models.
First Bad Commit
No response
Relevant log output