
epic: add cache_prompt to Engines Settings #1570

Open
1 of 2 tasks
Van-QA opened this issue Aug 9, 2024 · 4 comments
Labels: type: epic A major feature or initiative

Comments


Van-QA commented Aug 9, 2024

Jan does not support setting cache_prompt in the HTTP request JSON it sends to llama.cpp, resulting in slower processing times for long contexts (8000+ tokens).

Describe the solution

  • Jan should support setting the cache_prompt parameter in the HTTP request JSON to enable faster processing times with llama.cpp.

  • For comparison, Anthropic's prompt caching similarly makes longer chats with a lot of context substantially cheaper:
    https://www.anthropic.com/news/prompt-caching

What is the motivation / use case for changing the behavior?

Currently, cache_prompt defaults to off in llama.cpp, leading to significant delays on long contexts. Manually enabling cache_prompt improves performance.
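
For context, manually enabling the parameter looks roughly like this against a locally running llama.cpp server. This is a minimal sketch, not Jan's implementation: the address and port are the llama.cpp server defaults, and the prompt text is a placeholder.

```python
import requests

# Sketch: POST to the llama.cpp server's /completion endpoint with
# cache_prompt enabled, so the server reuses the KV cache for the shared
# prompt prefix across requests instead of re-evaluating it each time.
resp = requests.post(
    "http://localhost:8080/completion",  # default llama.cpp server address
    json={
        "prompt": "<long shared context here>\n\nUser: summarize the document.",
        "n_predict": 128,
        "cache_prompt": True,  # the parameter this issue asks Jan to expose
    },
    timeout=300,
)
resp.raise_for_status()
print(resp.json()["content"])
```
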

@Van-QA Van-QA added the type: feature request A new feature label Aug 9, 2024
@Van-QA Van-QA changed the title feat: Engines Settings pages - e.g. llama.cpp, etc epic: Engines Settings pages - e.g. llama.cpp, etc Aug 9, 2024
@Van-QA Van-QA added type: epic A major feature or initiative and removed type: feature request A new feature labels Aug 9, 2024
@louis-menlo

@imtuyethan - For the specs.

@freelerobot freelerobot changed the title epic: Engines Settings pages - e.g. llama.cpp, etc epic: add cache_prompt to Engines Settings Sep 5, 2024
@freelerobot

related: janhq/jan#3140

@imtuyethan

Latest related request: janhq/jan#3715

@dan-menlo

I am transferring this to Cortex as part of our llama.cpp integration settings.

@github-project-automation github-project-automation bot moved this to Investigating in Menlo Oct 29, 2024
@dan-menlo dan-menlo transferred this issue from janhq/jan Oct 29, 2024