
LLM is shut down when context is full, instead of clearing context to 0 or to n_keep to keep running #25

Open
MonstaCado opened this issue Dec 29, 2024 · 0 comments

Hello :)

The LLM is shut down when the context is full, instead of clearing the context back to 0 or to n_keep so it can keep running.

When the context gets full, the LLM is shut down. This can be seen by checking the VRAM used by the system (it stops using any VRAM at all) and by checking "gd_llama.is_running()" and "gd_llama.is_waiting_input()", which both return false.

I tried multiple combinations of the following parameters, and none of them seems to help:
gd_llama.n_keep, gd_llama.instruct, gd_llama.interactive, gd_llama.context_size, gd_llama.n_predict.

The existence of the n_keep parameter suggests that keeping the LLM running when the context gets full is meant to be supported, so I suppose this is a bug.
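For reference, the behavior I would expect is something like the context shifting done in llama.cpp's interactive example: when the buffer fills, the first n_keep tokens (e.g. the system prompt) are preserved, roughly half of the remaining tokens are discarded, and generation continues in the freed space. A minimal sketch of that idea (illustrative pseudocode, not the gd_llama implementation):

```python
def shift_context(tokens, context_size, n_keep):
    """Illustrative sketch of llama.cpp-style context shifting.

    When the token buffer reaches context_size, keep the first
    n_keep tokens, drop about half of the rest, and return the
    compacted buffer so generation can continue.
    """
    if len(tokens) < context_size:
        return tokens  # still room; nothing to do

    n_left = len(tokens) - n_keep
    n_discard = n_left // 2  # drop the oldest half of the shiftable region
    return tokens[:n_keep] + tokens[n_keep + n_discard:]


# Example: an 8-token buffer with n_keep=2 is compacted to 5 tokens,
# preserving the first 2 and the most recent 3.
compacted = shift_context(list(range(8)), context_size=8, n_keep=2)
```

If gd_llama did something along these lines internally instead of stopping, the model could run indefinitely.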

Thank you very much.
