rfc: anthropic cache usage #25684
Conversation
One more thing I'm facing right now is PromptTemplate - I can't find a way to use it with the Anthropic cache because it wipes out the 'cache_control' attribute from the text block.
Here it just extracts the text and throws everything else away. Are there any ideas / workarounds?
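For context, the block format that gets stripped looks roughly like this (a minimal sketch of an Anthropic-style cached content block; not code from this thread):

```python
from langchain_core.messages import SystemMessage

# A system message whose content is a list of structured blocks rather
# than a plain string. Rendering through a plain PromptTemplate collapses
# this to a single string, which is how cache_control gets lost.
cached_system = SystemMessage(
    content=[
        {
            "type": "text",
            "text": "You are a helpful assistant. <long static context here>",
            # This attribute is what templating strips away.
            "cache_control": {"type": "ephemeral"},
        }
    ]
)
```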
thinking of adding support for something like this, any thoughts? #25674
lgtm
Hi @mrdrprofuroboros, could you share your approach with caching?
Oh, sorry, I missed your message.

```python
from copy import deepcopy
from typing import Any, Dict, List

from langchain_core.messages import SystemMessage
from langchain_core.prompt_values import ChatPromptValue


def to_cached(content: str | List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Mark the last content block with Anthropic's cache_control attribute."""
    if isinstance(content, str):
        return [{
            "type": "text",
            "text": content,
            "cache_control": {"type": "ephemeral"},
        }]
    if isinstance(content[-1], str):
        # Promote a trailing plain string to a structured text block.
        content[-1] = {
            "type": "text",
            "text": content[-1],
            "cache_control": {"type": "ephemeral"},
        }
    else:
        content[-1]["cache_control"] = {"type": "ephemeral"}
    return content


def add_cache_control(compiled_chat: ChatPromptValue) -> ChatPromptValue:
    """
    Anthropic supports at most 4 blocks with cache_control, so we set
    - 1 on tools
    - 1 on the first and 1 on the last system block
    - 1 on the most recent message whose index is a multiple of 5
    """
    system_messages = []
    other_messages = []
    for message in compiled_chat.messages:
        if isinstance(message, SystemMessage):
            if not system_messages:
                # Cache the first system block.
                system_messages.append(SystemMessage(to_cached(message.content)))
            elif isinstance(message.content, list):
                # Merge any further system messages into the first one.
                system_messages[0].content.extend(message.content)
            else:
                system_messages[0].content.append(message.content)
        else:
            other_messages.append(deepcopy(message))
    if system_messages:
        # Cache the last system block as well.
        system_messages[0].content = to_cached(system_messages[0].content)
    messages = system_messages + other_messages
    # Never set cache_control on the last message since it's constantly changing.
    last_cached = max(0, (len(messages) - 2) // 5 * 5)
    messages[last_cached].content = to_cached(messages[last_cached].content)
    return ChatPromptValue(messages=messages)
```

### and then

```python
from langchain_anthropic import ChatAnthropic
from langchain_core.runnables import RunnableLambda

if isinstance(model, ChatAnthropic):
    model = RunnableLambda(add_cache_control) | model
```
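To tie this back to the PromptTemplate question above, one hypothetical way to wire it up end to end (the prompt contents and model name here are placeholders, not from the thread):

```python
from langchain_anthropic import ChatAnthropic
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnableLambda

prompt = ChatPromptTemplate.from_messages(
    [("system", "Long static instructions go here."), ("human", "{question}")]
)
model = ChatAnthropic(model="claude-3-5-sonnet-20240620")

# Rendering the template produces plain text blocks, so cache_control
# is (re)applied afterwards, just before the model call.
chain = prompt | RunnableLambda(add_cache_control) | model
```

Depending on the library version at the time, the prompt-caching beta header may also have been required on the Anthropic side.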
Hey @baskaryan, can we merge this branch? We can explicitly warn that this data is not standardized and one could use it at their own risk of deprecation.
closing in favor of #27087
will have a standardized format out in #27087! should land and release later today
That’s great news, thank you!
out in langchain-anthropic
Sorry, 0.2.3! Had to patch so that usage_metadata['input_tokens'] is the sum of all input tokens, including cache read and cache creation tokens.
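With that release, the cache counts show up in the standardized usage metadata, roughly like this (a sketch assuming langchain-anthropic >= 0.2.3; field names per langchain-core's UsageMetadata):

```python
from langchain_anthropic import ChatAnthropic

llm = ChatAnthropic(model="claude-3-5-sonnet-20240620")
response = llm.invoke("Hello!")

usage = response.usage_metadata
# input_tokens is the sum of regular, cache-read, and cache-creation input tokens.
print(usage["input_tokens"], usage["output_tokens"], usage["total_tokens"])
# Cache-specific counts live under input_token_details, when present:
print(usage.get("input_token_details"))  # e.g. {"cache_read": 0, "cache_creation": 1024}
```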
Alternative to #25644
What do we want to do with anthropic cache token counts and UsageMetadata?
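For reference, Anthropic's Messages API reports cache tokens in separate usage fields; the sketch below shows one plausible mapping into the standardized UsageMetadata shape (illustrative only - the field names come from Anthropic's usage object and langchain-core's UsageMetadata, not from this PR's diff):

```python
# Raw Anthropic usage, as returned by the Messages API:
raw = {
    "input_tokens": 10,
    "output_tokens": 50,
    "cache_creation_input_tokens": 1024,
    "cache_read_input_tokens": 0,
}

# Standardized UsageMetadata folds the cache counts into input_tokens
# and keeps the breakdown under input_token_details:
all_input = (
    raw["input_tokens"]
    + raw["cache_creation_input_tokens"]
    + raw["cache_read_input_tokens"]
)
usage_metadata = {
    "input_tokens": all_input,
    "output_tokens": raw["output_tokens"],
    "total_tokens": all_input + raw["output_tokens"],
    "input_token_details": {
        "cache_creation": raw["cache_creation_input_tokens"],
        "cache_read": raw["cache_read_input_tokens"],
    },
}
```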