feat(anthropic): improve prompt caching and type safety #317
Conversation
- Add better type hints for Anthropic API types
- Optimize prompt caching with ephemeral cache control
- Extract message preparation into dedicated function
- Improve documentation
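For background: Anthropic's prompt caching works by marking individual content blocks with a `cache_control` field of type `"ephemeral"`; everything up to and including the marked block becomes a cacheable prefix that later requests can reuse. Below is a minimal sketch against the current `anthropic` SDK; the PR itself imports the beta prompt-caching types (see the review notes below), so its exact calls differ, and the model id and prompt text here are placeholders:

```python
import anthropic

client = anthropic.Anthropic()

# Marking the system prompt as an ephemeral cache breakpoint lets
# follow-up requests reuse the cached prefix instead of re-processing it.
response = client.messages.create(
    model="claude-3-5-sonnet-20241022",  # placeholder model id
    max_tokens=1024,
    system=[
        {
            "type": "text",
            "text": "Long, stable system prompt...",  # the reusable prefix
            "cache_control": {"type": "ephemeral"},
        }
    ],
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.usage)  # cache creation/read token counts show up here
```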
👍 Looks good to me! Reviewed everything up to 9c77d71 in 1 minute and 2 seconds
More details
- Looked at 220 lines of code in 1 file
- Skipped 0 files when reviewing
- Skipped posting 6 drafted comments based on config settings
1. gptme/llm/llm_anthropic.py:19
   - Draft comment: The import `import anthropic.types` is unused and can be removed to clean up the code.
   - Reason this comment was not posted: Confidence changes required: 10%. The import statement for `anthropic.types` is not used in the file and should be removed.
2. gptme/llm/llm_anthropic.py:20
   - Draft comment: The import `import anthropic.types.beta.prompt_caching` is unused and can be removed to clean up the code.
   - Reason this comment was not posted: Confidence changes required: 10%. The import statement for `anthropic.types.beta.prompt_caching` is not used in the file and should be removed.
3. gptme/llm/llm_anthropic.py:21
   - Draft comment: The import `from anthropic import Anthropic` is unused and can be removed to clean up the code.
   - Reason this comment was not posted: Confidence changes required: 10%. The import statement for `anthropic` is not used in the file and should be removed.
4. gptme/llm/llm_anthropic.py:202
   - Draft comment: The docstring mentions cache control logic that was removed. Update the docstring to reflect the current functionality.
   - Reason this comment was not posted: Confidence changes required: 20%. The function `_transform_system_messages` has a detailed docstring, but it still describes cache control logic that was removed; it should be updated to reflect the current functionality.
5. gptme/llm/llm_anthropic.py:208
   - Draft comment: The docstring mentions cache control logic that was removed. Update the docstring to reflect the current functionality.
   - Reason this comment was not posted: Confidence changes required: 20%. The function `_transform_system_messages` has a detailed docstring, but it still describes cache control logic that was removed; it should be updated to reflect the current functionality.
6. gptme/llm/llm_anthropic.py:212
   - Draft comment: The docstring mentions cache control logic that was removed. Update the docstring to reflect the current functionality.
   - Reason this comment was not posted: Confidence changes required: 20%. The function `_transform_system_messages` has a detailed docstring, but it still describes cache control logic that was removed; it should be updated to reflect the current functionality.
Workflow ID: wflow_5z9GMKNg9BR8tfAH
You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.
Codecov Report
Attention: Patch coverage is
✅ All tests successful. No failed tests found.
Additional details and impacted files

@@           Coverage Diff            @@
##           master     #317     +/-  ##
==========================================
- Coverage   73.56%   73.44%   -0.13%
==========================================
  Files          68       68
  Lines        4975     4978       +3
==========================================
- Hits         3660     3656       -4
- Misses       1315     1322       +7
Flags with carried forward coverage won't be shown. ☔ View full report in Codecov by Sentry.
for part in raw_content:
    if isinstance(part, dict):
        if part.get("type") == "text" and i == len(messages_dicts) - 1:
Might want to review this later to cache at the optimal locations, especially considering RAG and "live" messages. Should probably also add a cache step at the system prompt.
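A sketch of what caching at both locations could look like; `apply_cache_breakpoints` is a hypothetical helper, not the PR's code. Anthropic permits up to four cache breakpoints per request, so marking both the system prompt and the conversation tail is valid:

```python
def apply_cache_breakpoints(system: list[dict], messages: list[dict]) -> None:
    """Mark the system prompt and the conversation tail as cache breakpoints.

    Hypothetical helper illustrating the suggestion above; mutates its
    arguments in place before they are sent to the API.
    """
    ephemeral = {"type": "ephemeral"}

    # Breakpoint 1: the stable system prompt.
    if system:
        system[-1]["cache_control"] = ephemeral

    # Breakpoint 2: the last content block of the last message, so the
    # whole conversation prefix can be reused on the next turn.
    for message in reversed(messages):
        content = message.get("content")
        if isinstance(content, list) and content:
            content[-1]["cache_control"] = ephemeral
            break
```

Since the breakpoint moves forward with each turn, the cached prefix grows with the conversation, which is what makes long contexts cheap to resend.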
Costs were looking a bit expensive; realized that the prompt caching implementation is pretty sub-par and doesn't cache longer contexts when they appear in the conversation (only system messages are cached).