Add a CLI option in main.cpp in order to stop generation at newline token #6441
Comments
llama.cpp/examples/main/README.md Line 104 in f87f7b8
I think that @Jeximo has the best solution, but an alternate approach would be to utilize grammars.
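For example, something along these lines (a rough sketch; the model path and prompt are placeholders, and the grammar uses the GBNF syntax accepted by `--grammar`):

```bash
# Constrain sampling so that only non-newline characters can be produced.
./main -m models/your-model.gguf -p "Write a tagline for a coffee shop:" \
  --grammar 'root ::= [^\n]+'
```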
This should force the LLM to generate anything except for newline characters, keeping everything on a single line.
Thanks @Jeximo, but sadly that does not do exactly what I need: a single non-empty line of output. Thanks @HanClinto, but `--grammar '[^\n]+'` is not the desired feature either: I still want to allow the LLM to sample ('softmax') a newline character; I just want main to stop generating when the LLM emits a newline, not to continue.
For this, yes -- you can use grammars to require at least one non-newline character before generating a newline. This worked for me (a sketch is shown after this comment).
This grammar requires at least one non-newline character, followed by one or more of any other character. The reason why I used [...] Hopefully that helps you visualize what the chosen grammar is doing. I realize this isn't as clean as your suggested solution -- it's always a tough balance between too much clutter in the command-line options and making it easier to do common tasks. I think if enough other people found this to be a common use case, then the project might want to support it as a first-class command-line option.
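As a rough illustration of the approach described above (not the exact grammar from that comment; the model path and prompt are placeholders), a grammar that allows exactly one non-empty line could look like this:

```bash
# Require at least one non-newline character, then allow a single newline.
# Once the grammar is fully matched, the sampler should only be able to pick
# an end-of-generation token, so main stops after one non-empty line.
./main -m models/your-model.gguf -p "Write a tagline for a coffee shop:" \
  --grammar 'root ::= [^\n]+ "\n"'
```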
This issue was closed because it has been inactive for 14 days since being marked as stale.
Prerequisites
Please answer the following questions for yourself before submitting an issue.
Feature Description
Just add a CLI option in main.cpp to stop generation at the newline token.
Motivation
For some users like us, the goal is not to chat or have a discussion with an LLM, but simply to get a single line of output.
Possible Implementation
In main.cpp, for example (a sketch follows):
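A minimal sketch of the idea, assuming a hypothetical `stop_at_newline` flag added to the parameters and a helper that detokenizes the sampled token (the names are illustrative, not the actual llama.cpp code):

```cpp
#include <string>

// Hypothetical check: return true when the proposed --stop-at-newline flag
// is set and the newly sampled token's text contains a newline.
// `piece` is the detokenized text of the token that was just sampled
// (e.g. obtained via llama_token_to_piece).
static bool should_stop_at_newline(bool stop_at_newline, const std::string & piece) {
    return stop_at_newline && piece.find('\n') != std::string::npos;
}

// In main.cpp's generation loop (illustrative only):
//
//     const std::string piece = llama_token_to_piece(ctx, id);
//     if (should_stop_at_newline(params.stop_at_newline, piece)) {
//         break;  // stop as soon as the model emits a newline
//     }
```

Checking the detokenized text rather than a single newline token id would also cover tokenizers where the newline is merged into a larger token.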