
core+partners/anthropic: Anthropic prompt caching #25644

Conversation

mrdrprofuroboros

Description: Adds support for Anthropic prompt caching; see #25625.
Issue: #25625
Dependencies: bump anthropic>=0.34.0

  • Just found that it fails a test; I'll fix it and add a usage example to the notebook. For now, here's an example:
from langchain_anthropic import ChatAnthropic
from langchain_core.messages import HumanMessage, SystemMessage

model = ChatAnthropic(model="claude-3-5-sonnet-20240620", beta=True)
model.invoke([
    SystemMessage([{
        "type": "text",
        "text": "foo" * 1000,
        # mark this block (and the prefix before it) for prompt caching
        "cache_control": {"type": "ephemeral"}
    }]),
    HumanMessage("hi!"),
])
AIMessage(content='Hello! How can I assist you today?', response_metadata={'id': 'msg_01XnYziv7oZtaRw23d45ivSi', 'model': 'claude-3-5-sonnet-20240620', 'stop_reason': 'end_turn', 'stop_sequence': None, 'usage': {'cache_creation_input_tokens': 1500, 'cache_read_input_tokens': 0, 'input_tokens': 9, 'output_tokens': 12}}, id='run-31a51a4f-bb0c-4b24-9c75-d7f7e5f99a89-0', usage_metadata={'input_tokens': 9, 'output_tokens': 12, 'total_tokens': 21, 'cache_creation_input_tokens': 1500, 'cache_read_input_tokens': 0})

@efriis efriis added the partner label Aug 21, 2024
@efriis efriis self-assigned this Aug 21, 2024

@dosubot dosubot bot added size:M This PR changes 30-99 lines, ignoring generated files. Ɑ: core Related to langchain-core 🔌: anthropic Primarily related to Anthropic integrations 🤖:improvement Medium size change to existing code to handle new use-cases labels Aug 21, 2024
@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. and removed size:M This PR changes 30-99 lines, ignoring generated files. labels Aug 22, 2024
@mrdrprofuroboros
Author

@efriis oof, main is moving ahead fast. My changes passed tests/lint before I updated the branch with "Update branch". Would you mind taking a look and helping me figure out the next steps, or how I can improve this PR to get it merged?

Collaborator

@baskaryan baskaryan left a comment


Thanks for the contribution! I'm very much in favor of including cache token usage in ChatAnthropic outputs, but I think we'll want to make sure we do it in a future-proof, generalizable way.
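As a purely illustrative sketch of what a more generalizable shape could look like (hypothetical; not what this PR or the review proposes), the cache counters could be nested under a provider-agnostic details mapping instead of being added as top-level provider-specific fields:

from typing_extensions import NotRequired, TypedDict

class InputTokenDetails(TypedDict, total=False):
    # hypothetical, provider-agnostic cache counters
    cache_creation: int  # tokens written to the prompt cache
    cache_read: int  # tokens served from the prompt cache

class UsageMetadata(TypedDict):
    input_tokens: int
    output_tokens: int
    total_tokens: int
    # optional details bucket; providers without caching simply omit it
    input_token_details: NotRequired[InputTokenDetails]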

@@ -51,6 +51,10 @@ class UsageMetadata(TypedDict):
    """Count of output (or completion) tokens."""
    total_tokens: int
    """Total token count."""
    cache_creation_input_tokens: NotRequired[int]
Collaborator


I don't think we want to add this to core until at least one or two other providers support a similar feature.

@property
def _messages_client(self) -> Messages:
    if self.beta:
        return self._client.beta.prompt_caching.messages  # type: ignore[attr-defined]
Collaborator


This feels more specific than a plain "beta" flag indicates. Are we going to update the client to beta.{x}.messages every time there's a new beta feature?

Also, is cache usage not returned if you use the regular client with the beta headers?

Author


Oh nice, it actually works. Here's an example:

from langchain_anthropic import ChatAnthropic
from langchain_core.messages import HumanMessage, SystemMessage

# regular (non-beta) client; prompt caching is enabled via the beta header
model = ChatAnthropic(
    model="claude-3-opus-20240229",
    temperature=0,
    extra_headers={"anthropic-beta": "prompt-caching-2024-07-31"}
)

chat = [
    SystemMessage([{
        "type": "text",
        "text": "foo" * 1000,
        "cache_control": {"type": "ephemeral"},
    }]),
    HumanMessage("Hi"),
]

model.invoke(chat)

returning

AIMessage(content='Hello! How can I assist you today?', response_metadata={'id': 'msg_01EuihUPN9JrbzZXuZd6oEu8', 'model': 'claude-3-opus-20240229', 'stop_reason': 'end_turn', 'stop_sequence': None, 'usage': {'input_tokens': 8, 'output_tokens': 12, 'cache_creation_input_tokens': 1500, 'cache_read_input_tokens': 0}}, id='run-a13ecd02-d669-4028-b8a2-56e5113d2417-0', usage_metadata={'input_tokens': 8, 'output_tokens': 12, 'total_tokens': 20})
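For reference, a minimal way to read the cache counters from that response, based on the output above (with this approach they show up under response_metadata["usage"] rather than usage_metadata):

response = model.invoke(chat)

usage = response.response_metadata["usage"]
print(usage["cache_creation_input_tokens"])  # 1500 on this first, cache-writing call
print(usage["cache_read_input_tokens"])      # 0 here; >0 once the prefix is served from the cache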


Is there a way to include variables in the system prompt while still using the "cache_control": {"type": "ephemeral"} parameter?
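A minimal sketch of one way to do this (assuming the model instance from the example above; the variable user_name is hypothetical): keep the large static text in its own cache-controlled block and append the variable text as a separate block after it, so the cached prefix stays identical across calls:

from langchain_core.messages import HumanMessage, SystemMessage

static_context = "foo" * 1000  # large, unchanging text worth caching
user_name = "Alice"            # hypothetical per-request variable

system = SystemMessage([
    {
        "type": "text",
        "text": static_context,
        # the cache covers the prefix up to and including this block
        "cache_control": {"type": "ephemeral"},
    },
    {
        # variable text goes after the cached block, so the cached prefix never changes
        "type": "text",
        "text": f"The user's name is {user_name}.",
    },
])

model.invoke([system, HumanMessage("Hi")])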

@baskaryan baskaryan removed the Ɑ: core Related to langchain-core label Aug 23, 2024
@efriis efriis assigned baskaryan and unassigned efriis Aug 24, 2024
@mrdrprofuroboros mrdrprofuroboros deleted the anthropic-prompt-cache branch October 14, 2024 17:11
Labels
🔌: anthropic Primarily related to Anthropic integrations 🤖:improvement Medium size change to existing code to handle new use-cases partner size:L This PR changes 100-499 lines, ignoring generated files.
4 participants