
[Potential Bug] Only BOS token when the user only provides a system message for LLAMA-3 #15

Closed
yangalan123 opened this issue May 28, 2024 · 15 comments
Labels: use case discussion (Discussion on use cases)

Comments

@yangalan123

Hi,

Thanks for maintaining this great repo. I recently found a potential bug: if the user provides only a system message, applying the LLAMA-3-Instruct template returns only <|begin_of_text|> and completely ignores the provided system message. I suspect the problem is in {% set loop_messages = messages[1:] %} when len(messages) == 1.

Here is a minimal example to reproduce:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
# Load the repo's template; the replaces collapse the pretty-printed
# indentation and newlines into a single-line Jinja template
chat_template_path = "chat_templates/llama-3-instruct.jinja"
chat_template = open(chat_template_path).read()
chat_template = chat_template.replace('    ', '').replace('\n', '')
tokenizer.chat_template = chat_template
# Apply the template to message lists that contain only a system message
prompts = ["Please help me classify:", "Hello!"]
processed_prompts = [
    tokenizer.apply_chat_template([{"role": "system", "content": p}],
                                  tokenize=False, add_generation_prompt=True)
    for p in prompts
]
print(processed_prompts)

The result is:

['<|begin_of_text|>', '<|begin_of_text|>']
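
For reference, the relevant part of the template is structured roughly like this (a paraphrase based on the line quoted above and the observed behavior, not the verbatim llama-3-instruct.jinja): the first message is split off as the system message and only the remainder is looped over, so a system-only list leaves nothing to render.

{% if messages[0]['role'] == 'system' %}
    {% set system_message = messages[0]['content'] %}
    {% set loop_messages = messages[1:] %}{# empty when len(messages) == 1 #}
{% else %}
    {% set loop_messages = messages %}
{% endif %}
{# the system message is only emitted inside the loop over loop_messages,
   so when loop_messages is empty nothing beyond bos_token is rendered #}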

@chujiezheng
Owner

chujiezheng commented May 28, 2024 via email

@yangalan123
Author

It is more of an analysis experiment: we may want to analyze the controlling power of system prompts, and whether system prompts are necessary at all. We may also want to investigate whether user inquiries can be incorporated as part of the system prompt, which could be especially helpful when initializing agents.

But even setting aside the specific applications, I still feel this is problematic: why do we get only the bos_token when we input only the system message? When setting up "GPTs" in the GPT Store, we basically provide only a system prompt.

@chujiezheng
Owner

I see. I think in most cases the user input is still necessary, while the system message can be optional. I will add checks to ensure the chat messages contain at least one user input.
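
(For illustration, such a check could look like the following in Jinja; raise_exception is the helper that Hugging Face's apply_chat_template exposes to templates, while the exact message text here is hypothetical.)

{% if messages | selectattr('role', 'equalto', 'user') | list | length == 0 %}
    {{ raise_exception('chat messages must contain at least one user input') }}
{% endif %}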

@yangalan123
Author

yangalan123 commented May 28, 2024

I think that, at a minimum, when there is only a system message, the output should still end with [EOS]. While I agree my situation is perhaps not "most cases", I do want to note that 1) there is a recent trend of people evaluating system prompts, e.g., [1], [2]; and 2) whenever an unexpected input occurs, at least some default behavior should be implemented. I do not understand the logic that formatting the system prompt depends on the user input (ideally, they should be independent). It seems odd, and I do not believe many LLMs intend that.

Just a side question: have you considered collaborating with Hugging Face to incorporate your templates natively into their tokenizers? It is somewhat inconvenient that, every time, I need to read the template from this repo and set it on the tokenizer manually.

@chujiezheng
Owner

chujiezheng commented May 28, 2024

For 1), I think reference [1] does not adopt the official chat templates, and reference [2] always includes a user question.

For 2), I think it is not "the system prompt depending on the user input" but "prepending the system prompt before the user input". In other words, the system prompt becomes nonsensical if there is no user input.

In fact, I generally follow the logic of llama-2-chat, which loops over the non-system messages.

For the side question, I will take a look and see whether it is possible to collaborate with the HF team.

@yangalan123
Author

I think there is a misunderstanding: I cite those papers to show that I am not the only one looking at system prompts. They may conduct their experiments in one way or another, but as people grow more interested in system prompts, I think my use case is actually not that rare.

I am confused by the comment that "the system prompt becomes nonsensical if there is no user input" -- why is that? Suppose I write "You are a helpful assistant" as a system prompt (a very commonly used one): it is valid and imposes alignment/principle constraints on follow-up generations even without user input. It already changes the model's output distribution, and such a prompt is agnostic to what the user would write.

I just do not understand why it is hard to implement some logic like the following (perhaps because I am new to Jinja):

if len(messages)==1 then....
else [default]
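
A minimal sketch of such a branch (illustrative only; the header tokens follow Llama-3's documented format, but this is not necessarily how the repo would implement it):

{% if messages | length == 1 and messages[0]['role'] == 'system' %}
    {{ bos_token + '<|start_header_id|>system<|end_header_id|>\n\n' + messages[0]['content'] + '<|eot_id|>' }}
{% else %}
    {# default logic: emit the system message, then loop over the remaining messages #}
{% endif %}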

@chujiezheng
Owner

I am implementing this logic. I am just not sure in which cases the messages would look like

[
    {"role": "system", "content": "..."}
]

but not

[
    {"role": "user", "content": "..."}
]

or

[
    {"role": "system", "content": "..."},
    {"role": "user", "content": "..."}
]

@yangalan123
Author

Great, it seems we are now on the same page. Thanks for implementing this logic. Let's discuss the implementation further:

My case is actually:

[
    {"role": "system", "content": "..."}
]

But I think the case for:

[
    {"role": "user", "content": "..."}
]

is also not that rare: many people, especially those outside NLP, do not know the concept of a "system prompt". So I would say both situations can happen.

@chujiezheng
Owner

chujiezheng commented May 28, 2024

Yes. Currently, the templates already support

[
    {"role": "user", "content": "..."}
]

Again, my question is still when

[
    {"role": "system", "content": "..."}
]

will happen. If there is no user input, I think we may not need a "chat" template (as there is no chat).

@chujiezheng
Owner

chujiezheng commented May 28, 2024

In other words, I think user input is necessary when you apply the chat template, regardless of whether you prepend a system message. This is the premise of the current implementation.

@yangalan123
Author

yangalan123 commented May 28, 2024

I have already discussed the potential research applications of system-only messages (I intentionally distinguish "messages" from "prompts" here: the "prompt" is the formatted "message"). So let's talk more about the chat setting:

Suppose we want to build a streaming chatbot, and we have a long list of regulations (as in Constitutional AI) that the model must conform to, which perhaps should not be baked into the model's weights for safety/legal/privacy reasons. Following one currently popular deployment practice, we should tokenize this system-only message early and precompute the corresponding KV cache to save deployment costs. In that case, we cannot assume any user inquiry, and we still need a properly formatted prompt from system-only messages: we feed this prompt to the model and then compute the KV cache.
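
A minimal sketch of this pattern with transformers (assuming the template now formats system-only messages; the model name is reused from above and the system content is a placeholder):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Meta-Llama-3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)

# Format the system-only message list before any user turn arrives
system_prompt = tokenizer.apply_chat_template(
    [{"role": "system", "content": "Long list of regulations ..."}],
    tokenize=False, add_generation_prompt=False)

# The template already emits bos_token, so skip the tokenizer's special tokens
system_ids = tokenizer(system_prompt, return_tensors="pt",
                       add_special_tokens=False).input_ids

# Precompute the KV cache for the shared system prefix once
with torch.no_grad():
    past_key_values = model(system_ids, use_cache=True).past_key_values
# Later, each user turn is tokenized, appended, and decoded starting from this cache.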

@chujiezheng
Owner

I understand the KV-cache case. That is an excellent point; thank you for sharing!

I just pushed a new implementation for simplification and unification. You may give it a try. But for the KV-cache case, a one-by-one implementation is still needed for each model.

And once again, I do not recommend passing

[
    {"role": "system", "content": "..."}
]

to the model. There may be unexpected outputs if no user input is provided.

@yangalan123
Author

Great work! Thanks for your contribution! I think it would be better if you added a gentle caution for users that unexpected outputs can happen in some cases :-)

I would be happy to create a PR (though Jinja is not within my expertise, so I hesitate to do that, sorry), or you can just write it. I actually think our discussion here has a lot of value for both research and application purposes.

@chujiezheng
Owner

Thanks for your advice. I will add this comment :)

chujiezheng added the "use case discussion" label May 28, 2024
@yangalan123
Author

Thanks!
