Add safeguard on tokens returned by functions #576
Conversation
@@ -518,7 +518,8 @@ def handle_ai_response(self, response_message):
            self.interface.function_message(f"Running {function_name}({function_args})")
            try:
                function_args["self"] = self  # need to attach self to arg since it's dynamically linked
-               function_response_string = function_to_call(**function_args)
+               function_response = function_to_call(**function_args)
+               function_response_string = validate_function_response(function_response)
We are now validating function outputs before putting them into a function response message (in `agent.messages`).
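For orientation, here is a minimal sketch of how the validated string might then be recorded; the message-packaging details are an assumption for illustration, not the PR's exact code:

```python
# Illustrative sketch only (message-packaging details assumed, not taken from
# this PR): the point is that the validated/truncated string, rather than the
# raw return value, is what ends up in the agent's message history.
def append_function_result(messages: list, function_name: str, raw_result) -> None:
    function_response_string = validate_function_response(raw_result)  # helper added in this PR
    messages.append(
        {
            "role": "function",
            "name": function_name,
            "content": function_response_string,
        }
    )
```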
@@ -712,7 +713,13 @@ def summarize_messages_inplace(self, cutoff=None, preserve_last_N_messages=True)
                pass

        message_sequence_to_summarize = self.messages[1:cutoff]  # do NOT get rid of the system message
        printd(f"Attempting to summarize {len(message_sequence_to_summarize)} messages [1:{cutoff}] of {len(self.messages)}")
+       if len(message_sequence_to_summarize) == 1:
This is an additional catch to prevent infinite summarization of a single message (I noticed this sort of loop happening when I ran the `http_request` overflow example).
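A rough sketch of the guard (the exception type and wording here are illustrative, not the PR's exact code):

```python
# Illustrative sketch: if the slice selected for summarization contains only a
# single message, summarizing it cannot free up meaningful context, so fail out
# instead of looping forever. (Exception type chosen here for illustration.)
message_sequence_to_summarize = self.messages[1:cutoff]  # do NOT get rid of the system message
if len(message_sequence_to_summarize) == 1:
    raise ValueError(
        f"Summarize error: can't summarize a single message "
        f"(messages[1:{cutoff}] has length {len(message_sequence_to_summarize)})"
    )
```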
@@ -55,6 +55,9 @@
CORE_MEMORY_PERSONA_CHAR_LIMIT = 2000
CORE_MEMORY_HUMAN_CHAR_LIMIT = 2000

+# Function return limits
+FUNCTION_RETURN_CHAR_LIMIT = 2000
There were a few ways to cap the message output (e.g., compute it as some fraction of `agent.context_window`), but I figured we might as well do something simpler and set a max character length, similar to what we already do for human/persona. There's a chance the math can still go wrong ("go wrong" = enter some sort of "summarize deadlock" with a 2000-character truncated HTTP request result) if the context window is too low, but with 8k I don't think we should have this problem. Also, we make this assumption for persona/human anyway and it's working fine.
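For comparison, the fraction-of-context-window alternative mentioned above could look roughly like this (a sketch under assumed numbers; the PR simply uses the flat constant):

```python
# Option used in this PR: a flat character cap, mirroring the persona/human limits.
FUNCTION_RETURN_CHAR_LIMIT = 2000

# Alternative sketch (not what the PR does): derive the cap from the agent's
# context window, assuming a rough chars-per-token ratio and reserving only a
# small fraction of the window for any single function result.
def function_return_char_limit(context_window: int, fraction: float = 0.1, chars_per_token: int = 4) -> int:
    return int(context_window * fraction * chars_per_token)

# With an 8k window this allows roughly 3276 characters per function return:
# function_return_char_limit(8192) -> 3276
```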
raise ValueError(function_response_string)

else:
    if strict:
Many functions (user-made, etc.) will return non-strings like dicts, so we should leave `strict=False` and attempt to cast to a string instead of failing.
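Putting the pieces together, a condensed sketch of the validation logic described above (paraphrased from the diff; the keyword argument names and the exact truncation notice are assumptions):

```python
FUNCTION_RETURN_CHAR_LIMIT = 2000  # mirrors the new constant added in this PR

def validate_function_response(function_response, strict: bool = False, truncate: bool = True) -> str:
    """Cast a function/tool return value to a string and cap its length."""
    if isinstance(function_response, str):
        function_response_string = function_response
    elif function_response is None:
        # functions/tools are allowed to return None (no output)
        function_response_string = "None"
    else:
        if strict:
            # in strict mode, a non-string return value is treated as an error
            raise ValueError(function_response)
        # many user-made functions return dicts etc., so cast instead of failing
        function_response_string = str(function_response)

    if truncate and len(function_response_string) > FUNCTION_RETURN_CHAR_LIMIT:
        # make the truncation informative so the agent knows output was cut off
        function_response_string = (
            function_response_string[:FUNCTION_RETURN_CHAR_LIMIT]
            + f"... [NOTE: function output was truncated since it exceeded the character limit ({FUNCTION_RETURN_CHAR_LIMIT})]"
        )
    return function_response_string
```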
Tested after adding the more informative truncation method.
* swapping out hardcoded str for prefix (forgot to include in #569)
* add extra failout when the summarizer tries to run on a single message
* added function response validation code, currently will truncate responses based on character count
* added return type hints (functions/tools should either return strings or None)
* discuss function output length in custom function section
* made the truncation more informative
Closes #553
Please describe the purpose of this pull request
We currently do not check the size of function responses to make sure they are under X tokens.
This is not a big deal for the base function set, but it is a problem for custom functions (e.g., the HTTP request function in extras), which can return huge responses. This will trigger a summarization lock event and seems to have the potential to "brick" agents by nuking their message state.
How to test
Modify `http_request` to return an arbitrarily long response.

Side note: if you use something like `" ".join(["TEST"] * 5000)`, you'll get a 400 error from OpenAI because it identifies "TEST" as a specific bad request input.
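For example, the override could look roughly like this (the `http_request` signature is an assumption based on the extras function set, and the repeated word is arbitrary per the side note above):

```python
# Hypothetical override of the extras http_request function for testing the
# overflow behavior: ignore the real request and return a very long string.
# (Avoid repeating "TEST", which OpenAI rejects with a 400 error.)
def http_request(self, method: str, url: str, payload_json: str = None) -> str:
    return " ".join(["banana"] * 5000)
```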
Demonstrating the bug

On `main`, if you override a function call like the HTTP request call to return an arbitrarily large text file, it'll brick the agent by causing a summarize chain that cannot complete.

Demonstrating the bug fix (this PR)
With this fix, you'll instead get a warning that the function output was truncated.
Have you tested this PR?
Yes, see above.