LLM is shut down when the context is full, instead of clearing the context to 0 or to n_keep to keep it running.
When the context gets full, the LLM is shut down. This can be seen by checking the VRAM used by the system (it stops using any VRAM at all), and by checking "gd_llama.is_running()" and "gd_llama.is_waiting_input()", both of which return false.
I tried multiple combinations of the following parameters, and it doesn't seem to work with any of them:
gd_llama.n_keep, gd_llama.instruct, gd_llama.interactive, gd_llama.context_size, gd_llama.n_predict.
The existence of the n_keep parameter suggests that keeping the LLM running when the context gets full is a feature that should already be implemented, so I suppose this is a bug.
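For reference, the behavior I would expect from n_keep is what llama.cpp's context shifting does: when the context fills up, keep the first n_keep tokens and discard roughly half of the rest, then continue generating instead of stopping. A minimal sketch of that idea (function names are my own illustration, not the plugin's API):

```python
def shift_context(tokens, n_ctx, n_keep):
    """Sketch of llama.cpp-style context shifting.

    When the token list reaches the context limit n_ctx, keep the
    first n_keep tokens (e.g. the system prompt), discard the first
    half of the remaining tokens, and keep the most recent half so
    generation can continue.
    """
    if len(tokens) < n_ctx:
        return tokens  # still room: nothing to shift
    n_left = len(tokens) - n_keep      # tokens eligible for discarding
    n_discard = n_left // 2            # drop the older half of them
    return tokens[:n_keep] + tokens[n_keep + n_discard:]
```

Instead of this, gd_llama simply stops when the limit is reached.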
Thank you very much.
Hello :)