
feat(anthropic): improve prompt caching and type safety #317

Merged
merged 1 commit into from
Dec 10, 2024

Conversation

ErikBjare
Owner

@ErikBjare ErikBjare commented Dec 10, 2024

Costs were looking higher than expected; I realized that the prompt caching implementation is pretty sub-par and doesn't cache longer contexts when they appear in the conversation (only system messages are cached).

  • Add better type hints for Anthropic API types
  • Optimize prompt caching with ephemeral cache control
  • Extract message preparation into dedicated function
  • Improve documentation

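The caching improvement described above can be sketched roughly as follows. This is an illustrative sketch, not the actual gptme code: the helper name `mark_cache_points` is hypothetical, and the message shapes follow the Anthropic Messages API, where content blocks may carry a `cache_control` field of type `ephemeral`.

```python
# Hypothetical sketch of the caching strategy this PR describes:
# mark the final text block of the final message with ephemeral
# cache_control, so long conversation contexts get cached rather
# than only the system prompt.


def mark_cache_points(messages_dicts: list[dict]) -> list[dict]:
    """Add ephemeral cache control to the last text part of the last message."""
    for i, msg in enumerate(messages_dicts):
        content = msg.get("content")
        if not isinstance(content, list):
            continue
        for part in content:
            if isinstance(part, dict):
                # Only cache at the final text block of the final message
                if part.get("type") == "text" and i == len(messages_dicts) - 1:
                    part["cache_control"] = {"type": "ephemeral"}
    return messages_dicts


messages = [
    {"role": "user", "content": [{"type": "text", "text": "long context ..."}]},
    {"role": "assistant", "content": [{"type": "text", "text": "ok"}]},
    {"role": "user", "content": [{"type": "text", "text": "follow-up question"}]},
]
marked = mark_cache_points(messages)
```

Because the cache point sits on the newest message, each subsequent API call can reuse the cached prefix of the whole conversation, which is what brings per-request cost down.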
Contributor

@ellipsis-dev ellipsis-dev bot left a comment


👍 Looks good to me! Reviewed everything up to 9c77d71 in 1 minute and 2 seconds

More details
  • Looked at 220 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 6 drafted comments based on config settings.
1. gptme/llm/llm_anthropic.py:19
  • Draft comment:
    The import 'import anthropic.types' is unused and can be removed to clean up the code.
  • Reason this comment was not posted:
    Confidence changes required: 10%
    The import statement for 'anthropic.types' is not used in the file. It should be removed to clean up the code.
2. gptme/llm/llm_anthropic.py:20
  • Draft comment:
    The import 'import anthropic.types.beta.prompt_caching' is unused and can be removed to clean up the code.
  • Reason this comment was not posted:
    Confidence changes required: 10%
    The import statement for 'anthropic.types.beta.prompt_caching' is not used in the file. It should be removed to clean up the code.
3. gptme/llm/llm_anthropic.py:21
  • Draft comment:
    The import 'from anthropic import Anthropic' is unused and can be removed to clean up the code.
  • Reason this comment was not posted:
    Confidence changes required: 10%
    The import statement for 'anthropic' is not used in the file. It should be removed to clean up the code.
4. gptme/llm/llm_anthropic.py:202
  • Draft comment:
    The docstring mentions cache control logic that was removed. Update the docstring to reflect the current functionality.
  • Reason this comment was not posted:
    Confidence changes required: 20%
    The function '_transform_system_messages' has a detailed docstring, but it doesn't mention the cache control logic that was removed. The docstring should be updated to reflect the current functionality.
5. gptme/llm/llm_anthropic.py:208
  • Draft comment:
    The docstring mentions cache control logic that was removed. Update the docstring to reflect the current functionality.
  • Reason this comment was not posted:
    Confidence changes required: 20%
    The function '_transform_system_messages' has a detailed docstring, but it doesn't mention the cache control logic that was removed. The docstring should be updated to reflect the current functionality.
6. gptme/llm/llm_anthropic.py:212
  • Draft comment:
    The docstring mentions cache control logic that was removed. Update the docstring to reflect the current functionality.
  • Reason this comment was not posted:
    Confidence changes required: 20%
    The function '_transform_system_messages' has a detailed docstring, but it doesn't mention the cache control logic that was removed. The docstring should be updated to reflect the current functionality.

Workflow ID: wflow_5z9GMKNg9BR8tfAH



@codecov-commenter

codecov-commenter commented Dec 10, 2024

Codecov Report

Attention: Patch coverage is 95.65217% with 1 line in your changes missing coverage. Please review.

Project coverage is 73.44%. Comparing base (c0eb21f) to head (9c77d71).

✅ All tests successful. No failed tests found.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| gptme/llm/llm_anthropic.py | 95.65% | 1 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #317      +/-   ##
==========================================
- Coverage   73.56%   73.44%   -0.13%     
==========================================
  Files          68       68              
  Lines        4975     4978       +3     
==========================================
- Hits         3660     3656       -4     
- Misses       1315     1322       +7     
| Flag | Coverage | Δ |
| --- | --- | --- |
| anthropic/claude-3-haiku-20240307 | 71.63% <95.65%> | -0.03% ⬇️ |
| openai/gpt-4o-mini | 70.28% <17.39%> | -0.19% ⬇️ |



for part in raw_content:
    if isinstance(part, dict):
        if part.get("type") == "text" and i == len(messages_dicts) - 1:
Owner Author


Might want to review this later to cache at the optimal locations, especially considering RAG and "live" messages. Should probably also add a cache step at the system prompt.
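The suggested cache step at the system prompt could look roughly like this. A minimal sketch, assuming the Anthropic Messages API shape where the `system` parameter may be a list of content blocks; the helper name `system_with_cache` is illustrative and not from the gptme codebase.

```python
# Hypothetical helper: wrap the system prompt in a content block
# marked with ephemeral cache_control, so the (typically long and
# stable) system prompt is cached across requests.


def system_with_cache(system_prompt: str) -> list[dict]:
    return [
        {
            "type": "text",
            "text": system_prompt,
            "cache_control": {"type": "ephemeral"},
        }
    ]


blocks = system_with_cache("You are gptme, a terminal assistant.")
```

Combined with a cache point on the latest message, this gives two stable cache prefixes: the system prompt and the conversation so far.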

@ErikBjare ErikBjare merged commit a2f06df into master Dec 10, 2024
7 checks passed