Community: gather token usage info in BedrockChat during generation #19127
Conversation
Hey! For Bedrock you may want to use the token counts returned in the "x-amzn-bedrock-*" headers, like:
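(Not the snippet originally posted, but a rough sketch of reading those headers via a plain boto3 call; the `ResponseMetadata`/`HTTPHeaders` layout is the standard botocore shape, and the model id / prompt format are illustrative only.)

```python
# Hedged sketch: read the Bedrock token-count headers from an invoke_model call.
import json
import boto3

client = boto3.client("bedrock-runtime")  # assumes AWS credentials are configured

response = client.invoke_model(
    modelId="anthropic.claude-v2",  # illustrative model id
    body=json.dumps(
        {"prompt": "\n\nHuman: Hello!\n\nAssistant:", "max_tokens_to_sample": 64}
    ),
)

headers = response["ResponseMetadata"]["HTTPHeaders"]
prompt_tokens = int(headers.get("x-amzn-bedrock-input-token-count", 0))
completion_tokens = int(headers.get("x-amzn-bedrock-output-token-count", 0))
print(prompt_tokens, completion_tokens)
```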
Thanks for this comment! I didn't know about those headers. It definitely makes sense, and I'll rework my implementation accordingly.
The PR is updated! Token counters are now read from the headers in the case of simple generation, and the resulting LLMOutput will have the token-count fields. NB: in case of streaming, …
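A hedged sketch of how the counters could then be consumed downstream, assuming the field names discussed in this thread end up in `llm_output`; the exact layout may differ from the merged code:

```python
# Hedged sketch: consume the token counters attached to the generation result.
from langchain_community.chat_models import BedrockChat
from langchain_core.messages import HumanMessage

chat = BedrockChat(model_id="anthropic.claude-v2")  # assumes AWS credentials
result = chat.generate([[HumanMessage(content="Hello!")]])
print(result.llm_output)  # expected to carry the header-based token counts
```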
cc @3coins
),
"completion_tokens": int(
    headers.get("x-amzn-bedrock-output-token-count", 0)
),
This is great! I was looking to achieve the same thing to enable cost monitoring on the Anthropic Bedrock models. Should we also add total_tokens (sum of prompt_tokens + completion_tokens), to keep it compatible with the OpenAI model?
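For illustration, a minimal helper along those lines (hypothetical, not from the PR):

```python
def usage_from_headers(headers: dict) -> dict:
    """Hypothetical helper: build an OpenAI-style usage dict from Bedrock headers."""
    prompt_tokens = int(headers.get("x-amzn-bedrock-input-token-count", 0))
    completion_tokens = int(headers.get("x-amzn-bedrock-output-token-count", 0))
    return {
        "prompt_tokens": prompt_tokens,
        "completion_tokens": completion_tokens,
        "total_tokens": prompt_tokens + completion_tokens,
    }
```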
Thanks for the comment! I added the total_count
Thanks! You have linting errors, FYI.
Hopefully one of the project maintainers has a chance to look at this and get it merged soon!
Excellent work!
{
"id": string,
"model": string,
"type": "message",
"role": "assistant",
"content": [
{
"type": "text",
"text": string
}
],
"stop_reason": string,
"stop_sequence": string,
"usage": {
"input_tokens": integer,
"output_tokens": integer
}
}
Do you guys have any insights on this? @dmenini @esoler-sage @pratik60
I think the usage in the headers is for sync calls, and an async call's last chunk contains the usage in the body 🤔 Update: also, the docs you were pointing to use the new Messages API from Anthropic, which is only available on (some) Anthropic models.
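If the usage does arrive in the body (as in the Messages API response shape above), reading it would look roughly like this sketch; the `raw_body` literal is a stand-in for the decoded response or final stream chunk:

```python
# Hedged sketch: read the counters from the body instead of the headers when a
# "usage" object like the JSON above is present.
import json

# Stand-in for the decoded response body / final stream chunk.
raw_body = '{"usage": {"input_tokens": 12, "output_tokens": 34}}'

body = json.loads(raw_body)
input_tokens = body["usage"]["input_tokens"]
output_tokens = body["usage"]["output_tokens"]
total_tokens = input_tokens + output_tokens
```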
This PR allows token usage for prompts and completions to be calculated directly in the generation method of BedrockChat. The token usage details are then returned together with the generations, so that other downstream tasks can access them easily.
This allows defining a callback for token tracking and cost calculation, similarly to what happens with OpenAI (see OpenAICallbackHandler). I plan on adding a BedrockCallbackHandler later.
Right now, keeping track of tokens in the callback is already possible, but it requires passing the llm, as done here: https://how.wtf/how-to-count-amazon-bedrock-anthropic-tokens-with-langchain.html. However, I find the approach of this PR cleaner.
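A rough illustration of that pre-existing workaround (names are illustrative, not from the linked post): the handler has to be given the llm so it can count completion tokens itself.

```python
# Illustrative only: a callback that must hold a reference to the llm in order
# to count tokens itself, which is the workaround this PR avoids.
from langchain_core.callbacks import BaseCallbackHandler


class TokenCountingHandler(BaseCallbackHandler):
    def __init__(self, llm):
        self.llm = llm  # the llm has to be passed in explicitly
        self.completion_tokens = 0

    def on_llm_end(self, response, **kwargs):
        # response.generations is a list of generation lists (one per prompt).
        for generations in response.generations:
            for generation in generations:
                self.completion_tokens += self.llm.get_num_tokens(generation.text)
```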
Thanks for your reviews. FYI @baskaryan, @hwchase17