feat(anthropic): improve prompt caching and type safety #317
Conversation
- Add better type hints for Anthropic API types
- Optimize prompt caching with ephemeral cache control
- Extract message preparation into dedicated function
- Improve documentation
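For background: Anthropic's prompt caching works by marking individual content blocks with a `cache_control` field of type `"ephemeral"`; everything up to and including the marked block becomes a cacheable prefix that later requests can reuse. Below is a minimal sketch against the current `anthropic` SDK; the PR itself imports the beta prompt-caching types (see the review notes below), so its exact calls differ, and the model id and prompt text here are placeholders:

```python
import anthropic

client = anthropic.Anthropic()

# Marking the system prompt as an ephemeral cache breakpoint lets
# follow-up requests reuse the cached prefix instead of re-processing it.
response = client.messages.create(
    model="claude-3-5-sonnet-20241022",  # placeholder model id
    max_tokens=1024,
    system=[
        {
            "type": "text",
            "text": "Long, stable system prompt...",  # the reusable prefix
            "cache_control": {"type": "ephemeral"},
        }
    ],
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.usage)  # cache creation/read token counts show up here
```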
👍 Looks good to me! Reviewed everything up to 9c77d71 in 1 minute and 2 seconds
More details
- Looked at 220 lines of code in 1 file
- Skipped 0 files when reviewing
- Skipped posting 6 drafted comments based on config settings
1. gptme/llm/llm_anthropic.py:19
   - Draft comment: The import `import anthropic.types` is unused and can be removed to clean up the code.
   - Reason this comment was not posted: Confidence changes required: 10%. The import statement for `anthropic.types` is not used in the file and should be removed.
2. gptme/llm/llm_anthropic.py:20
   - Draft comment: The import `import anthropic.types.beta.prompt_caching` is unused and can be removed to clean up the code.
   - Reason this comment was not posted: Confidence changes required: 10%. The import statement for `anthropic.types.beta.prompt_caching` is not used in the file and should be removed.
3. gptme/llm/llm_anthropic.py:21
   - Draft comment: The import `from anthropic import Anthropic` is unused and can be removed to clean up the code.
   - Reason this comment was not posted: Confidence changes required: 10%. The import statement for `anthropic` is not used in the file and should be removed.
4. gptme/llm/llm_anthropic.py:202
   - Draft comment: The docstring mentions cache control logic that was removed. Update the docstring to reflect the current functionality.
   - Reason this comment was not posted: Confidence changes required: 20%. The function `_transform_system_messages` has a detailed docstring, but it still describes cache control logic that was removed; it should be updated to reflect the current functionality.
5. gptme/llm/llm_anthropic.py:208
   - Draft comment: The docstring mentions cache control logic that was removed. Update the docstring to reflect the current functionality.
   - Reason this comment was not posted: Confidence changes required: 20%. The function `_transform_system_messages` has a detailed docstring, but it still describes cache control logic that was removed; it should be updated to reflect the current functionality.
6. gptme/llm/llm_anthropic.py:212
   - Draft comment: The docstring mentions cache control logic that was removed. Update the docstring to reflect the current functionality.
   - Reason this comment was not posted: Confidence changes required: 20%. The function `_transform_system_messages` has a detailed docstring, but it still describes cache control logic that was removed; it should be updated to reflect the current functionality.
Workflow ID: wflow_5z9GMKNg9BR8tfAH
You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.
Codecov Report
Attention: Patch coverage is
✅ All tests successful. No failed tests found.
Additional details and impacted files

@@           Coverage Diff            @@
##           master     #317     +/-  ##
==========================================
- Coverage   73.56%   73.44%   -0.13%
==========================================
  Files          68       68
  Lines        4975     4978       +3
==========================================
- Hits         3660     3656       -4
- Misses       1315     1322       +7
Flags with carried forward coverage won't be shown. ☔ View full report in Codecov by Sentry.
for part in raw_content:
    if isinstance(part, dict):
        if part.get("type") == "text" and i == len(messages_dicts) - 1:
Might want to review this later to cache at the optimal locations, especially considering RAG and "live" messages. Should probably also add a cache step at the system prompt.
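A sketch of what caching at both locations could look like; `apply_cache_breakpoints` is a hypothetical helper, not the PR's code. Anthropic permits up to four cache breakpoints per request, so marking both the system prompt and the conversation tail is valid:

```python
def apply_cache_breakpoints(system: list[dict], messages: list[dict]) -> None:
    """Mark the system prompt and the conversation tail as cache breakpoints.

    Hypothetical helper illustrating the suggestion above; mutates its
    arguments in place before they are sent to the API.
    """
    ephemeral = {"type": "ephemeral"}

    # Breakpoint 1: the stable system prompt.
    if system:
        system[-1]["cache_control"] = ephemeral

    # Breakpoint 2: the last content block of the last message, so the
    # whole conversation prefix can be reused on the next turn.
    for message in reversed(messages):
        content = message.get("content")
        if isinstance(content, list) and content:
            content[-1]["cache_control"] = ephemeral
            break
```

Since the breakpoint moves forward with each turn, the cached prefix grows with the conversation, which is what makes long contexts cheap to resend.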
Costs were looking a bit expensive; realized that the prompt caching implementation is pretty sub-par and doesn't cache longer contexts when they appear in the conversation (only system messages are cached).