
Add a CLI option in main.cpp in order to stop generation at newline token #6441

Closed
WilliamTambellini opened this issue Apr 2, 2024 · 5 comments
Labels
enhancement New feature or request stale

Comments

@WilliamTambellini
Contributor

Prerequisites

Please answer the following questions for yourself before submitting an issue.

  • [x] I am running the latest code. Development is very rapid so there are no tagged versions as of now.
  • [x] I carefully followed the README.md.
  • [x] I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • [x] I reviewed the Discussions, and have a new bug or useful enhancement to share.

Feature Description

Add a CLI option to main.cpp to stop generation at the newline token.

Motivation

For some users like us, the goal is not to chat or have a discussion with an LLM, but simply to get a single line of output.

Possible Implementation

In main.cpp, for example:

        if (params.stop_at_nl && output_tokens.size() > 0 && !embd.empty() && embd.back() == llama_token_nl(model)) {
            break; // only generate 1 single output line
        }
@WilliamTambellini WilliamTambellini added the enhancement New feature or request label Apr 2, 2024
@Jeximo
Contributor

Jeximo commented Apr 3, 2024

-e -r "\n" on the CLI should stop generation at a newline.

-e, --escape ... process prompt escape sequences (\n, \r, \t, \', \", \\)

- `-r PROMPT, --reverse-prompt PROMPT`: Specify one or multiple reverse prompts to pause text generation and switch to interactive mode. For example, `-r "User:"` can be used to jump back into the conversation whenever it's the user's turn to speak. This helps create a more interactive and conversational experience. However, the reverse prompt doesn't work when it ends with a space.

@HanClinto
Collaborator

I think that @Jeximo has the best solution, but an alternate approach would be to utilize grammars.

--grammar '[^\n]+'

This should force the LLM to generate anything except for newline characters, keeping everything on a single line.

@WilliamTambellini
Contributor Author

Thanks @Jeximo, but sadly that does not do exactly what I need: a single, non-empty line of output. -e -r "\n" does stop at the newline character but does not force the output to be non-empty. That's why I added "output_tokens.size() > 0" in my implementation. Now, perhaps -e -r "\n" could do the job if there is a way to force main to generate a non-empty output. Is there?

Thanks @HanClinto, but "--grammar '[^\n]+'" is not the desired feature: I still want to allow the LLM to 'softmax' a newline character, but I want main to stop generating when the LLM predicts a newline, not to continue.

@HanClinto
Collaborator

HanClinto commented Apr 3, 2024

Now, perhaps "-e -r \n" could do the job if there is a way to force main to generate a non empty output. Is there ?

For this, yes -- you can use grammars to require at least one non-newline character before generating a newline.

This worked for me:

./main -m ./models/llama-2-7b.Q4_0.gguf -e -r "\n" --grammar "root ::= [^\n][^\x00]*" -p "My favorite flavor is "

This grammar requires at least one non-newline character, followed by zero or more of any other character. The reason why I used [^\x00] instead of . is that it seems this symbol might not be implemented in the grammar parser -- I intend to look into this; it might simply be my own issue. The intended grammar (if I could get "." to work) would be:
root ::= [^\n].*

Hopefully that helps you visualize what the chosen grammar is doing.

I realize this isn't as clean as your suggested solution -- it's always a tough balance of too much clutter in the command-line options, vs. making it easier to do common tasks. I think if enough other people found this to be a common use-case, then the project might want to support it as a first-class command-line option.


This issue was closed because it has been inactive for 14 days since being marked as stale.
