
Add safeguard on tokens returned by functions #576

Merged: 7 commits merged into main on Dec 14, 2023

Conversation

cpacker
Collaborator

@cpacker cpacker commented Dec 4, 2023

Closes #553


Please describe the purpose of this pull request

We currently do not check the size of function responses to make sure they are under X tokens.

This is not a big deal for the base function set, but it is a problem for custom functions (e.g. the HTTP request function in extras), which can return huge responses. This can trigger a summarization lock event and has the potential to "brick" agents by nuking their message state.

  • Should patch this by enforcing a max function response length (based on the context length) and running a smart string clipper / ellipsis on the returns
  • Should also update the custom functions documentation to discuss manually managing string/dict return sizes based on context length

How to test

Modify http_request to return an arbitrarily long response:

from typing import Optional


def http_request(self, method: str, url: str, payload_json: Optional[str] = None):
    """
    Generates an HTTP request and returns the response.

    Args:
        method (str): The HTTP method (e.g., 'GET', 'POST').
        url (str): The URL for the request.
        payload_json (Optional[str]): A JSON string representing the request payload.

    Returns:
        dict: The response from the HTTP request.
    """
    # Stubbed out to return an arbitrarily large payload for testing the truncation safeguard
    return {"status_code": 400, "headers": None, "body": " ".join(["is was a lonely day along the wide path"] * 5000)}

Side note: if you use something like " ".join(["TEST"] * 5000) you'll get a 400 error from OpenAI because it identifies "TEST" as a specific bad request input

Demonstrating the bug

On main, if you override a function like the HTTP request call to return an arbitrarily large response, it will brick the agent by triggering a summarization chain that cannot complete:

💭 User just logged in. He's never interacted with this persona before. Let's greet him.
🤖 Greetings, Chad! Delighted to make your virtual acquaintance. You must have fascinating stories to share. What has sparked your curiosity recently?
⚡🟢 [function] Success: None
last response total_tokens (2637) < 6144.0
LocalStateManager.append_to_messages
> Enter your message: Can you try using the http_request function?
Using model openai, endpoint: https://api.openai.com/v1
Sending request to https://api.openai.com/v1/chat/completions
response = {'id': '...', 'object': 'chat.completion', 'created': 1701730353, 'model': 'gpt-4-0613', 'choices': [{'index': 0, 'message': {'role': 'assistant', 'content': "User wants 
me to make an HTTP request. I don't have any specific URL or service in mind, perhaps I could fetch a piece of interesting information from an API for testing.", 'function_call': {'name': 'http_request', 'arguments':
'{\n  "method": "GET",\n  "url": "https://api.publicapis.org/entries",\n  "request_heartbeat": true\n}'}}, 'finish_reason': 'function_call'}], 'usage': {'prompt_tokens': 2719, 'completion_tokens': 74, 'total_tokens':
2793}, 'system_fingerprint': None}
💭 User wants me to make an HTTP request. I don't have any specific URL or service in mind, perhaps I could fetch a piece of interesting information from an API for testing.
⚡🟢 [function] Success: {'status_code': 400, 'headers': None, 'body': 'is was a lonely day along the w....

...

last response total_tokens (2793) < 6144.0
LocalStateManager.append_to_messages
Using model openai, endpoint: https://api.openai.com/v1
Sending request to https://api.openai.com/v1/chat/completions
Got HTTPError, exception=400 Client Error: Bad Request for url: https://api.openai.com/v1/chat/completions, payload={'model': 'gpt-4', 'messages': [{'role': 'system', 'content': 'You are MemGPT, ...
...
 response={'error': {'message': "This model's maximum context length is 8192 
tokens. However, your messages resulted in 47817 tokens (46787 in the messages, 1030 in the functions). Please reduce the length of the messages or functions.", 'type': 'invalid_request_error', 'param': 'messages', 
'code': 'context_length_exceeded'}

Demonstrating the bug fix (this PR)

With this fix, you'll instead get a warning:

💭 Clear point of entry: Chad's first login. Should make him feel welcome. Initiating greeting protocol optimized for user-specific interaction.
🤖 Hello Chad! Welcome, it's our first official interaction. I hope we'll have some enlightening and engrossing conversations. How are you today?
⚡🟢 [function] Success: None
last response total_tokens (2647) < 6144.0
LocalStateManager.append_to_messages
> Enter your message: try using http_request
Using model openai, endpoint: https://api.openai.com/v1
Sending request to https://api.openai.com/v1/chat/completions
response = {'id': 'chatcmpl-8SI09zrGfhquV4oS92PhbNweArerr', 'object': 'chat.completion', 'created': 1701753073, 'model': 'gpt-4-0613', 'choices': [{'index': 0, 'message': {'role': 'assistant', 'content': 'HTTP 
request function to be utilized. According to user instruction, will test the connection and return response. To demonstrate, will fetch data from a user-friendly public API.', 'function_call': {'name': 
'http_request', 'arguments': '{\n  "method": "GET",\n  "url": "https://api.publicapis.org/entries",\n  "request_heartbeat": true\n}'}}, 'finish_reason': 'function_call'}], 'usage': {'prompt_tokens': 2726, 
'completion_tokens': 71, 'total_tokens': 2797}, 'system_fingerprint': None}
💭 HTTP request function to be utilized. According to user instruction, will test the connection and return response. To demonstrate, will fetch data from a user-friendly public API.
Warning: function return was over limit (200048 > 2000) and was truncated
⚡🟢 [function] Success: {"status_code": 400, "headers": null, "body": "is was a lonely day along the wide path is was a lonely day along the wide path is was a lonely day along the wide path is was a lonely day 
along the wide path is was a lonely day along the wide path is was a lonely day along the wide path is was a lonely day along the wide path is was a lonely day along the wide path is was a lonely day along the wide 
path is was a lonely day along the wide path is was a lonely day along the wide path is was a lonely day along the wide path is was a lonely day along the wide path is was a lonely day along the wide path is was a 
lonely day along the wide path is was a lonely day along the wide path is was a lonely day along the wide path is was a lonely day along the wide path is was a lonely day along the wide path is was a lonely day along
the wide path is was a lonely day along the wide path is was a lonely day along the wide path is was a lonely day along the wide path is was a lonely day along the wide path is was a lonely day along the wide path is
was a lonely day along the wide path is was a lonely day along the wide path is was a lonely day along the wide path is was a lonely day along the wide path is was a lonely day along the wide path is was a lonely day
along the wide path is was a lonely day along the wide path is was a lonely day along the wide path is was a lonely day along the wide path is was a lonely day along the wide path is was a lonely day along the wide 
path is was a lonely day along the wide path is was a lonely day along the wide path is was a lonely day along the wide path is was a lonely day along the wide path is was a lonely day along the wide path is was a 
lonely day along the wide path is was a lonely day along the wide path is was a lonely day along the wide path is was a lonely day along the wide path is was a lonely day along the wide path is was a lonely day along
the wide path is was a lonely day along the wide path is was a lonely day along the wid...
last response total_tokens (2797) < 6144.0
LocalStateManager.append_to_messages
Using model openai, endpoint: https://api.openai.com/v1
Sending request to https://api.openai.com/v1/chat/completions
response = {'id': 'chatcmpl-8SI0DImVIF9jOw6l3xLZxZrTk6RvE', 'object': 'chat.completion', 'created': 1701753077, 'model': 'gpt-4-0613', 'choices': [{'index': 0, 'message': {'role': 'assistant', 'content': 'HTTP 
request executed. Result points to a 400 status code, meaning there was some syntactical error. Will respond to Chad accordingly.', 'function_call': {'name': 'send_message', 'arguments': '{\n  "message": "I attempted
to perform an HTTP request. However, it appears there was an error as the server has returned a 400 status code. This is typically due to a syntactical error in the request. We could try a different request if you 
want, Chad."\n}'}}, 'finish_reason': 'function_call'}], 'usage': {'prompt_tokens': 3330, 'completion_tokens': 93, 'total_tokens': 3423}, 'system_fingerprint': None}
💭 HTTP request executed. Result points to a 400 status code, meaning there was some syntactical error. Will respond to Chad accordingly.
🤖 I attempted to perform an HTTP request. However, it appears there was an error as the server has returned a 400 status code. This is typically due to a syntactical error in the request. We could try a different 
request if you want, Chad.
⚡🟢 [function] Success: None
last response total_tokens (3423) < 6144.0

Have you tested this PR?

Yes, see above.

@cpacker cpacker marked this pull request as draft December 4, 2023 22:38
@cpacker cpacker mentioned this pull request Dec 5, 2023
@cpacker cpacker changed the title [Draft] Add safeguard on tokens returned by functions Add safeguard on tokens returned by functions Dec 13, 2023
@cpacker cpacker marked this pull request as ready for review December 13, 2023 08:43
@@ -518,7 +518,8 @@ def handle_ai_response(self, response_message):
             self.interface.function_message(f"Running {function_name}({function_args})")
             try:
                 function_args["self"] = self  # need to attach self to arg since it's dynamically linked
-                function_response_string = function_to_call(**function_args)
+                function_response = function_to_call(**function_args)
+                function_response_string = validate_function_response(function_response)

We are now validating function outputs before putting them into a function response message (in agent.messages)
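
Roughly, the validated string then becomes the content of the function-role message appended to the agent's in-context message list. The field names below follow the OpenAI chat format and are only an illustration; MemGPT's actual packaging helper may differ.

# Illustration only (OpenAI chat-format fields); MemGPT's real packaging may differ.
function_response_string = validate_function_response(function_response)
self.messages.append({
    "role": "function",                    # function-result message
    "name": function_name,                 # the tool that was called
    "content": function_response_string,   # validated (and possibly truncated) output
})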

@@ -712,7 +713,13 @@ def summarize_messages_inplace(self, cutoff=None, preserve_last_N_messages=True)
             pass

         message_sequence_to_summarize = self.messages[1:cutoff]  # do NOT get rid of the system message
         printd(f"Attempting to summarize {len(message_sequence_to_summarize)} messages [1:{cutoff}] of {len(self.messages)}")
+        if len(message_sequence_to_summarize) == 1:

This is an additional catch to prevent infinite summarization of a single message (I noticed this sort of loop happening when I ran the http_request overflow example).
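
A rough sketch of that guard (the exact exception type and wording used in the PR may differ):

# Sketch only; the PR may use a different exception type or message.
if len(message_sequence_to_summarize) == 1:
    # Summarizing a single message cannot free up context, so fail fast
    # instead of looping on summarization forever.
    raise ValueError(
        "Summarize failed: only one message left to summarize; it likely exceeds "
        "the context window on its own."
    )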

@@ -55,6 +55,9 @@
 CORE_MEMORY_PERSONA_CHAR_LIMIT = 2000
 CORE_MEMORY_HUMAN_CHAR_LIMIT = 2000

+# Function return limits
+FUNCTION_RETURN_CHAR_LIMIT = 2000

There were a few ways to cap the function output (e.g. compute a limit as some fraction of agent.context_window), but I figured we might as well do something simpler and set a max char length, similar to what we already do for human/persona. There's a chance the math can still go wrong ("go wrong" = enter some sort of "summarize deadlock" even with a 2000-char truncated http_request result) if the context window is too low, but with 8k I don't think we should have this problem. Also, we make this same assumption for persona/human anyway and it's working fine.
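
As a rough illustration of the alternative considered above, a context-window-derived cap could look like this (the fraction and the ~4 characters-per-token heuristic are assumptions, not values from the codebase):

# Illustration only: derive a char cap from a fraction of the context window,
# assuming roughly 4 characters per token (a common heuristic).
def char_limit_from_context(context_window_tokens: int, fraction: float = 0.25) -> int:
    return int(context_window_tokens * fraction * 4)

# e.g. an 8k-token context window at 25% would allow about 8192 chars of function output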

raise ValueError(function_response_string)

else:
if strict:

Many functions (user-made, etc.) will return non-strings like dicts, so we should leave strict=False and attempt to cast them to a string instead of failing.
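
For reference, a minimal sketch of what this validation, casting, and truncation could look like end to end. It is illustrative only (names and exact behavior of the real validate_function_response in memgpt/utils.py may differ), but it reproduces the warning and the informative truncation note visible in the logs in this PR.

# Hedged sketch only: the actual validate_function_response added in
# memgpt/utils.py may differ in names and details.
FUNCTION_RETURN_CHAR_LIMIT = 2000  # matches the constant added above


def validate_function_response(function_response, strict: bool = False,
                               char_limit: int = FUNCTION_RETURN_CHAR_LIMIT) -> str:
    """Cast a function/tool return to a string and truncate it if it is too long."""
    if isinstance(function_response, str):
        response_string = function_response
    elif function_response is None:
        # Functions that return nothing are represented as the string "None"
        response_string = "None"
    elif strict:
        # In strict mode, non-string returns are treated as an error
        raise ValueError(function_response)
    else:
        # Soft-cast non-string returns (e.g. dicts) to a string instead of failing
        response_string = str(function_response)

    original_len = len(response_string)
    if original_len > char_limit:
        print(f"Warning: function return was over limit ({original_len} > {char_limit}) and was truncated")
        note = (
            f"... [NOTE: function output was truncated since it exceeded "
            f"the character limit ({original_len} > {char_limit})]"
        )
        response_string = response_string[:char_limit] + note

    return response_string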

memgpt/utils.py: outdated review comment (resolved)
@cpacker cpacker added the priority Merge ASAP label Dec 13, 2023
@cpacker
Collaborator Author

cpacker commented Dec 14, 2023

Tested after adding a more informative truncation method:

💭 Attempting HTTP request. First, let's initialize some constants for the request.
Warning: function return was over limit (200048 > 2000) and was truncated
⚡🟢 [function] Success: {"status_code": 400, "headers": null, "body": "is was a lonely day along the wide path is was 
a lonely day along the wide path is was a lonely day along the wide path is was a lonely day along the wide path is was
a lonely day along the wide path is was a lonely day along the wide path is was a lonely day along the wide path is was
a lonely day along the wide path is was a lonely day along the wide path is was a lonely day along the wide path is was
a lonely day along the wide path is was a lonely day along the wide path is was a lonely day along the wide path is was
a lonely day along the wide path is was a lonely day along the wide path is was a lonely day along the wide path is was
a lonely day along the wide path is was a lonely day along the wide path is was a lonely day along the wide path is was
a lonely day along the wide path is was a lonely day along the wide path is was a lonely day along the wide path is was
a lonely day along the wide path is was a lonely day along the wide path is was a lonely day along the wide path is was
a lonely day along the wide path is was a lonely day along the wide path is was a lonely day along the wide path is was
a lonely day along the wide path is was a lonely day along the wide path is was a lonely day along the wide path is was
a lonely day along the wide path is was a lonely day along the wide path is was a lonely day along the wide path is was
a lonely day along the wide path is was a lonely day along the wide path is was a lonely day along the wide path is was
a lonely day along the wide path is was a lonely day along the wide path is was a lonely day along the wide path is was
a lonely day along the wide path is was a lonely day along the wide path is was a lonely day along the wide path is was
a lonely day along the wide path is was a lonely day along the wide path is was a lonely day along the wide path is was
a lonely day along the wide path is was a lonely day along the wide path is was a lonely day along the wid... [NOTE: 
function output was truncated since it exceeded the character limit (200048 > 2000)]

@cpacker cpacker merged commit 8f178e1 into main Dec 14, 2023
2 checks passed
cpacker added a commit that referenced this pull request Dec 15, 2023
* updated local APIs to return usage info (#585)

* updated APIs to return usage info

* tested all endpoints

* added autogen as an extra (#616)

* added autogen as an extra

* updated docs

Co-authored-by: hemanthsavasere <[email protected]>

* Update LICENSE

* Add safeguard on tokens returned by functions (#576)

* swapping out hardcoded str for prefix (forgot to include in #569)

* add extra failout when the summarizer tries to run on a single message

* added function response validation code, currently will truncate responses based on character count

* added return type hints (functions/tools should either return strings or None)

* discuss function output length in custom function section

* made the truncation more informative

* patch bug where None.copy() throws runtime error (#617)

* allow passing custom host to uvicorn (#618)

* feat: initial poc for socket server

* feat: initial poc for frontend based on react

Set up an nx workspace which maks it easy to manage dependencies and added shadcn components
that allow us to build good-looking ui in a fairly simple way.
UI is a very simple and basic chat that starts with a message of the user and then simply displays the
answer string that is sent back from the fastapi ws endpoint

* feat: mapp arguments to json and return new messages

Except for the previous user message we return all newly generated messages and let the frontend figure out how to display them.

* feat: display messages based on role and show inner thoughts and connection status

* chore: build newest frontend

* feat(frontend): show loader while waiting for first message and disable send button until connection is open

* feat: make agent send the first message and loop similar to CLI

currently the CLI loops until the correct function call sends a message to the user. this is an initial try to achieve a similar behavior in the socket server

* chore: build new version of frontend

* fix: rename lib directory so it is not excluded as part of python gitignore

* chore: rebuild frontend app

* fix: save agent at end of each response to allow the conversation to carry on over multiple sessions

* feat: restructure server to support multiple endpoints and add agents and sources endpoint

* feat: setup frontend routing and settings page

* chore: build frontend

* feat: another iteration of web interface

changes include: websocket for chat. switching between different agents. introduction of zustand state management

* feat: adjust frontend to work with memgpt rest-api

* feat: adjust existing rest_api to serve and interact with frontend

* feat: build latest frontend

* chore: build latest frontend

* fix: cleanup workspace

---------

Co-authored-by: Charles Packer <[email protected]>
Co-authored-by: hemanthsavasere <[email protected]>
@cpacker cpacker deleted the function-token-safety branch December 15, 2023 19:00
@cpacker cpacker restored the function-token-safety branch December 15, 2023 19:00
@cpacker cpacker deleted the function-token-safety branch December 15, 2023 19:01
sarahwooders pushed a commit that referenced this pull request Dec 26, 2023
* swapping out hardcoded str for prefix (forgot to include in #569)

* add extra failout when the summarizer tries to run on a single message

* added function response validation code, currently will truncate responses based on character count

* added return type hints (functions/tools should either return strings or None)

* discuss function output length in custom function section

* made the truncation more informative
goetzrobin added a commit to goetzrobin/MemGPT that referenced this pull request Jan 11, 2024
norton120 pushed a commit to norton120/MemGPT that referenced this pull request Feb 15, 2024
norton120 pushed a commit to norton120/MemGPT that referenced this pull request Feb 15, 2024
mattzh72 pushed a commit that referenced this pull request Oct 9, 2024
mattzh72 pushed a commit that referenced this pull request Oct 9, 2024
Labels
priority Merge ASAP
Development

Successfully merging this pull request may close these issues.

Lack of safeguard on tokens returned by external functions
1 participant