
Bug: --chat-template seems to be broken now, no way to truly chat from the llama-cli #8053

Closed
Deputation opened this issue Jun 21, 2024 · 3 comments · Fixed by #8068
Labels: bug-unconfirmed, low severity (used to report low severity bugs in llama.cpp, e.g. cosmetic issues, non-critical UI glitches)

Comments

@Deputation

What happened?

As per discussions:

#7837
#8009

It seems to be impossible to chat with llama3 8b properly. I have not tested this on 70b models, but even in the server UI the model just starts making notes to itself and outputs garbage / training data about how it should converse instead of actually conversing. Has something happened to the --chat-template chatml parameter? Even when the CLI is set to output special tokens, I do not see the ChatML tokens coming out.
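
For context, this is the general ChatML layout that --chat-template chatml is expected to wrap messages in, and whose special tokens the report says are not appearing (the system/user text below is purely illustrative, not output from this run):

```
<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
Hello!<|im_end|>
<|im_start|>assistant
```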

Name and Version

version: 3158 (5239925)

What operating system are you seeing the problem on?

Linux

Relevant log output

No response

@Deputation added the bug-unconfirmed and low severity labels on Jun 21, 2024

ericonr commented Jun 21, 2024

I'm configuring the prompt as suggested by this comment #6747 (comment), and it's worked pretty well.

dspasyuk (Contributor) commented Jun 21, 2024

@Deputation I agree, I see the same issue on my end with llama3-instruct 8b. I have been told to use the "right" prompt style, but even with the Llama 3 prompt style it gives an OK response maybe once and then just produces random garbage in llama-cli. The same issue occurs with llama-server. Version b3080 works fine; after that, no luck. Also, when the context size is exceeded over multiple questions, the model just stops generating altogether. I am still waiting on @ggerganov and the team for this issue. (Attachment: Screencast from 2024-06-21 11:10:00 AM.webm)

dspasyuk (Contributor) commented Jun 22, 2024

@Deputation try running llama-cli like this:

../llama.cpp/llama-cli --model models/meta-llama-3-8b-instruct_q5_k_s.gguf --n-gpu-layers 35 -cnv  --interactive-first  --simple-io  -b 512 -n -1 --ctx_size 0 --temp 0.3 --top_k 10 --multiline-input  --repeat_penalty 1.12 -t 6 -r "\n>" --log-disable  -p 'Role and Purpose: You are Alice, a large language model. Your purpose is to assist users by providing information, answering questions, and engaging in meaningful conversations based on the data you were trained on.
Behavior and Tone:  Be informative, engaging, and respectful. Maintain a neutral and unbiased tone. Ensure that responses are clear and concise. Capabilities: Use your training data to provide accurate and relevant information. Explain complex concepts in an easy-to-understand manner. Provide sources when referencing specific information or data.
Output Formatting: Use this formatting for code: ```language\n```' 

And then chat like so:

<|im_start|>user
Answer the following questions:

  1. The day before two days after the day before tomorrow is Saturday. What day is it today?
  2. What is the square root of 169?
  3. Solve the equation 3y = 6y + 11 and find y.
  4. There are two ducks in front of a duck, two ducks behind a duck, and a duck in the middle. How many ducks are there?
  5. How many days does it take to travel from New York City to London by plane, assuming non-stop flights and average speeds?
  6. What are the products of the chemical reaction between salicylic acid and acetic anhydride?
  7. If five cats can catch five mice in five minutes, how long will it take one cat to catch one mouse?
  8. Create a JS program that prints the first 100 Fibonacci numbers. <|im_end|>
    <|im_start|>assistant

I do not know why this works, but it does; setting --chat-template to llama3 and using the correct reverse prompt for it does not work as expected in my hands. :(
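
For reference, a rough sketch of the kind of invocation described above, using the built-in llama3 template instead of a hand-written system prompt (the reverse-prompt string and the other flag values here are assumptions carried over from the earlier command, not a confirmed working setup):

```sh
# Sketch only: model path, GPU layer count, and the -r reverse prompt are assumptions.
../llama.cpp/llama-cli \
    --model models/meta-llama-3-8b-instruct_q5_k_s.gguf \
    --n-gpu-layers 35 -cnv --interactive-first \
    --chat-template llama3 \
    -r '<|eot_id|>'
```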

You can try this out with the llama.cui test branch: https://github.com/dspasyuk/llama.cui/tree/test
