epic: add cache_prompt to Engines Settings #1570
Labels: type: epic (A major feature or initiative)
Comments
@imtuyethan - For the specs.
This was referenced Aug 28, 2024
related: janhq/jan#3140
Latest related request: janhq/jan#3715
I am transferring this to Cortex, as part of our llama.cpp integration settings.
Describe the problem
Jan does not support setting cache_prompt in the HTTP request JSON for llama.cpp, resulting in slower processing times for long contexts (8000+ tokens).
Describe the solution
Jan should support setting the cache_prompt parameter in the HTTP request JSON to enable faster processing times with llama.cpp.
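A minimal sketch of what this could look like against a locally running llama.cpp server; the host, port, prompt, and n_predict values are assumptions for illustration, and cache_prompt is the field this request asks Jan to expose:

```python
import requests

# Sketch only: assumes a llama.cpp server on localhost:8080 exposing its
# standard /completion endpoint. Prompt and n_predict are placeholders.
url = "http://127.0.0.1:8080/completion"

payload = {
    "prompt": "Summarize the following long document: ...",
    "n_predict": 256,
    # cache_prompt asks the server to keep the evaluated prompt in the KV
    # cache, so a follow-up request sharing the same prefix can skip
    # re-processing those tokens.
    "cache_prompt": True,
}

response = requests.post(url, json=payload, timeout=600)
response.raise_for_status()
print(response.json()["content"])
```

In Jan/Cortex this would presumably be set when building the request to the llama.cpp engine, ideally surfaced as a toggle in the engine settings rather than hard-coded.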
For comparison, Anthropic's prompt caching makes long chats with a lot of context substantially cheaper:
https://www.anthropic.com/news/prompt-caching
What is the motivation / use case for changing the behavior?
Currently, the default setting for cache_prompt is off in llama.cpp, leading to significant delays. Manually enabling cache_prompt improves performance.