
Community: gather token usage info in BedrockChat during generation #19127

Merged

Conversation

@dmenini (Contributor) commented Mar 15, 2024

This PR calculates token usage for prompts and completions directly in the generation method of BedrockChat. The token usage details are then returned together with the generations, so that downstream tasks can access them easily.

This makes it possible to define a callback for token tracking and cost calculation, similar to what is done for OpenAI (see OpenAICallbackHandler). I plan on adding a BedrockCallbackHandler later.
Keeping track of tokens in a callback is already possible today, but it requires passing the llm explicitly, as done here: https://how.wtf/how-to-count-amazon-bedrock-anthropic-tokens-with-langchain.html. However, I find the approach of this PR cleaner.

Thanks for your reviews. FYI @baskaryan, @hwchase17
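
For context, a minimal sketch (not part of this PR) of what such a Bedrock token-tracking callback could look like, assuming the generation result exposes llm_output["usage"] with prompt_tokens and completion_tokens as described later in this thread; the class name is purely illustrative:

    from typing import Any, Dict

    from langchain_core.callbacks import BaseCallbackHandler
    from langchain_core.outputs import LLMResult


    class BedrockTokenUsageHandler(BaseCallbackHandler):
        """Illustrative handler that accumulates token counts from llm_output."""

        def __init__(self) -> None:
            self.prompt_tokens = 0
            self.completion_tokens = 0

        def on_llm_end(self, response: LLMResult, **kwargs: Any) -> None:
            # llm_output is populated by the chat model; "usage" is the field added by this PR.
            usage: Dict[str, int] = (response.llm_output or {}).get("usage", {})
            self.prompt_tokens += usage.get("prompt_tokens", 0)
            self.completion_tokens += usage.get("completion_tokens", 0)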


@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. Ɑ: models Related to LLMs or chat model modules 🔌: anthropic Primarily related to Anthropic integrations 🤖:improvement Medium size change to existing code to handle new use-cases labels Mar 15, 2024
@dmenini dmenini changed the title Improvement (community): gather token usage info in BedrockChat during generation Community: gather token usage info in BedrockChat during generation Mar 15, 2024
@esoler-sage (Contributor) commented:
Hey! For Bedrock you may want to use the token counts returned in the "x-amzn-bedrock-*" response headers, such as:

  • x-amzn-bedrock-output-token-count
  • x-amzn-bedrock-input-token-count

instead of relying on the Anthropic-only tokenizer 🤔?
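
As an aside, a minimal sketch of where these headers come from, assuming a plain boto3 bedrock-runtime client (the model ID and request body below are illustrative):

    import json

    import boto3

    client = boto3.client("bedrock-runtime", region_name="us-east-1")
    response = client.invoke_model(
        modelId="anthropic.claude-v2",  # illustrative model ID
        body=json.dumps(
            {"prompt": "\n\nHuman: Hello\n\nAssistant:", "max_tokens_to_sample": 50}
        ),
    )

    # botocore exposes the raw HTTP headers on every response.
    headers = response["ResponseMetadata"]["HTTPHeaders"]
    input_tokens = int(headers.get("x-amzn-bedrock-input-token-count", 0))
    output_tokens = int(headers.get("x-amzn-bedrock-output-token-count", 0))
    print(input_tokens, output_tokens)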

@dmenini (Contributor, Author) commented Mar 20, 2024

> Hey! For Bedrock you may want to use the token counts returned in the "x-amzn-bedrock-*" response headers, such as:
>
>   • x-amzn-bedrock-output-token-count
>   • x-amzn-bedrock-input-token-count
>
> instead of relying on the Anthropic-only tokenizer 🤔?

Thanks for this comment! I didn't know about those headers. It definitely makes sense, and I'll rework my implementation accordingly.

@dosubot dosubot bot added size:M This PR changes 30-99 lines, ignoring generated files. and removed size:L This PR changes 100-499 lines, ignoring generated files. labels Mar 21, 2024
@dmenini (Contributor, Author) commented Mar 21, 2024

The PR is updated! Token counters are now read from the response headers for simple (non-streaming) generation: the resulting LLMOutput has the fields model_id and usage, which contain the token counters.

NB: in the streaming case, usage will be an empty dict, since extracting the tokens is more complicated. The headers are not available there, and one would have to extract the token counters from the body according to each model provider's structure. Will do it at a later time :)
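
Concretely, a downstream caller could then read the counters roughly like this (a sketch, assuming the llm_output field names described above; the model ID is illustrative):

    from langchain_community.chat_models import BedrockChat
    from langchain_core.messages import HumanMessage

    chat = BedrockChat(model_id="anthropic.claude-v2")  # illustrative model ID
    result = chat.generate([[HumanMessage(content="Hello!")]])

    # llm_output carries the model id and the token counters read from the headers.
    print(result.llm_output["model_id"])
    print(result.llm_output["usage"])  # e.g. {"prompt_tokens": ..., "completion_tokens": ...}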

@baskaryan baskaryan added 🔌: aws Primarily related to Amazon Web Services (AWS) integrations and removed 🔌: anthropic Primarily related to Anthropic integrations labels Mar 26, 2024
@baskaryan (Collaborator) commented: cc @3coins

Review thread on the following snippet from the diff:

    "prompt_tokens": int(
        headers.get("x-amzn-bedrock-input-token-count", 0)
    ),
    "completion_tokens": int(
        headers.get("x-amzn-bedrock-output-token-count", 0)
    ),


This is great! I was looking to achieve the same thing to enable cost monitoring on the Anthropic Bedrock models. Should we also add total_tokens (sum of prompt_tokens + completion_tokens), to keep it compatible with the OpenAI model?

dmenini (Contributor, Author) replied:

Thanks for the comment! I added the total count.
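
The resulting usage entry then looks roughly like this (values illustrative):

    usage = {
        "prompt_tokens": 12,
        "completion_tokens": 5,
        "total_tokens": 17,  # prompt_tokens + completion_tokens, for compatibility with OpenAI
    }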

@pratik60 commented Mar 28, 2024

Thanks! You have linting errors, fyi.

Hopefully one of the project maintainers has a chance to look at this, and get it merged soon!

@dosubot dosubot bot added the lgtm PR looks good. Use to confirm that a PR is ready for merging. label Mar 28, 2024
@baskaryan baskaryan enabled auto-merge (squash) March 28, 2024 18:51
@baskaryan baskaryan merged commit f704232 into langchain-ai:master Mar 28, 2024
59 checks passed
gkorland pushed a commit to FalkorDB/langchain that referenced this pull request Mar 30, 2024
…ation (langchain-ai#19127)

@Sukitly (Contributor) commented Apr 1, 2024
Excellent work!
I'm curious about the difference between reporting "usage" in the response body versus in the headers. According to the Bedrock API documentation, "usage" is included in the response body as follows:

{
    "id": string,
    "model": string,
    "type": "message",
    "role": "assistant",
    "content": [
        {
            "type": "text",
            "text": string
        }
    ],
    "stop_reason": string,
    "stop_sequence": string,
    "usage": {
        "input_tokens": integer,
        "output_tokens": integer
    }
}

Do you guys have any insights on this? @dmenini @esoler-sage @pratik60
Thanks!

@esoler-sage (Contributor) commented Apr 1, 2024

I think usage in the headers applies to synchronous calls, while an async call's last chunk contains usage in the body 🤔

Update: Also, the docs you are pointing to use the new Messages API from Anthropic, which is only available on (some) Anthropic models.
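
For the body-based case, extracting the counters from a parsed (non-streaming) Messages API response like the one quoted above would look roughly like this (the response dict is illustrative):

    # Illustrative parsed response body in the Anthropic Messages API shape shown above.
    body = {
        "type": "message",
        "content": [{"type": "text", "text": "Hello!"}],
        "usage": {"input_tokens": 12, "output_tokens": 5},
    }

    usage = body.get("usage", {})
    prompt_tokens = usage.get("input_tokens", 0)
    completion_tokens = usage.get("output_tokens", 0)
    total_tokens = prompt_tokens + completion_tokens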

hinthornw pushed a commit that referenced this pull request Apr 26, 2024
…ation (#19127)

Labels
  • 🔌: aws Primarily related to Amazon Web Services (AWS) integrations
  • 🤖:improvement Medium size change to existing code to handle new use-cases
  • lgtm PR looks good. Use to confirm that a PR is ready for merging.
  • Ɑ: models Related to LLMs or chat model modules
  • size:M This PR changes 30-99 lines, ignoring generated files.
5 participants