
Never exit the main loop in interactive mode. #297

Closed · wants to merge 6 commits

Conversation

tjohnman (Contributor)

If the end-of-stream token is found while in interactive mode, ask for user input instead of exiting the main loop.

In case of running out of token budget, reset it and ask for user input.

With these changes, embd can end up empty and cause a crash on the next iteration of the loop, so we check its size as well.
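
Roughly, the handling in the main.cpp generation loop ends up looking like this (a sketch rather than the exact diff; embd, remaining_tokens, is_interacting and params are the existing main-loop variables, and token id 2 is the end-of-text token):

    if (params.interactive) {
        // End-of-text: print a marker and hand control back to the user
        // instead of breaking out of the main loop.
        if (embd.size() && embd.back() == 2) {   // embd.size() guards the empty-vector case
            fprintf(stderr, " [end of text]\n");
            is_interacting = true;
        }
        // Token budget exhausted: refill it and ask for user input instead of exiting.
        if (remaining_tokens <= 0) {
            remaining_tokens = params.n_predict;
            is_interacting = true;
        }
    } else if (embd.size() && embd.back() == 2) {
        // Non-interactive mode keeps the original behavior and stops generating.
        fprintf(stderr, " [end of text]\n");
        break;
    }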

Johnman added 6 commits March 19, 2023 17:10
If the end of stream token mark is found, when in interactive mode, ask
for user input instead of exiting the main loop.

In case of running out of token budget, reset it and ask for user input.

With these changes, embd can end up empty and cause a crash in the next
iteration of the loop, so we check for its size as well.
rabidcopy (Contributor) commented Mar 19, 2023

So if I'm understanding this right, this PR handles both the end-of-text token AND the exit that occurs when you run out of tokens/context? Where exactly does that leave the "memory" afterwards? Does it reset and behave as if you had just started with your initial prompt again, with no user input/response history? Sorry if this is obvious. Edit: Very useful nonetheless. Seems to work. Beats having to reload the model and have the initial prompt parsed again.

tjohnman (Contributor, Author)

> So if I'm understanding this right, this PR handles both the end-of-text token AND the exit that occurs when you run out of tokens/context? Where exactly does that leave the "memory" afterwards? Does it reset and behave as if you had just started with your initial prompt again, with no user input/response history? Sorry if this is obvious. Edit: Very useful nonetheless. Seems to work. Beats having to reload the model and have the initial prompt parsed again.

It shouldn’t affect the memory. It just resets the counter that holds how many tokens it can generate before reaching the maximum specified in the parameters.
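
Concretely, the only thing touched when the budget runs out is the counter itself (a sketch using the main.cpp variable names):

    remaining_tokens = params.n_predict; // refill the generation budget
    // last_n_tokens, n_past and the rest of the context are left as they are,
    // so the conversation history is preserved.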

However, I made a mess with the commits. Perhaps this pull request should be rejected and redone properly. I don’t know if the mess can be fixed manually now.

I have little experience with pull requests, sorry.

tjohnman (Contributor, Author)

Yep. I messed up and included changes from other stuff I was working on.

tjohnman closed this Mar 19, 2023
rabidcopy (Contributor) commented Mar 19, 2023

This is a bit sloppy and hacked together, and the embd.back() = 13; last_n_tokens.back() = 13; part may be unnecessary, or even harmful. (For reference: token id 2 is the end-of-text token and 13 is the newline token in the LLaMA vocab.) Mostly just a personal tweak to ignore end-of-text tokens and running out of tokens while continuing generation. Posting for myself in the future.

        if (params.interactive) {
            if (embd.size() && embd.back() == 2) {
                fprintf(stderr, " [end of text]\n");
//                is_interacting = true;
//                embd.back() = 13;
//                last_n_tokens.back() = 13;
            }
            if (remaining_tokens == 0) {
                fprintf(stderr, " [0 tokens remaining]\n");
                remaining_tokens = params.n_predict;
//                is_interacting = true;
//                embd.back() = 13;
//                last_n_tokens.back() = 13;
            }
        } else {
            // end of text token
            if (embd.size() && embd.back() == 2) {
                fprintf(stderr, " [end of text]\n");
                break;
            }
        }

tjohnman (Contributor, Author)

Made a proper pull request with just the necessary changes.
#298

tjohnman deleted the eternal-interactive-mode branch March 20, 2023 15:36
Deadsg pushed a commit to Deadsg/llama.cpp that referenced this pull request Dec 19, 2023